Kubernetes service discovery occasionally fails

3/26/2016

I'm using Google Compute Engine to run a Kubernetes cluster. It runs fine almost all of the time, but occasionally Kubernetes service discovery fails. Structurally, I'm using a replication controller and a Kubernetes service to distribute the load.

I verified that none of the nodes in the cluster were restarted. Does anyone have any idea what could cause this? Also, what is the best practice to avoid such a rare scenario?

-- Phagun Baya
kubernetes

1 Answer

3/30/2016

Your problem likely occurs when a pod is starting up or terminating and traffic is still being directed to its container. You should use a livenessProbe to health-check the container. Health checks can be performed in three ways: an HTTP GET, opening a TCP socket, or running a command. HTTP GETs and commands need to return a success status; for sockets, the probe is considered successful if the socket can be opened.

apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            # Path to probe; should be cheap, but representative of typical behavior
            path: /index.html
            port: 80
          # Wait 30s after the container starts before probing; fail if no response within 1s
          initialDelaySeconds: 30
          timeoutSeconds: 1
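
For reference, a liveness probe can alternatively open a TCP socket or run a command; below is a minimal sketch of both forms. The port number and the /tmp/healthy file are illustrative assumptions, and a container takes only one livenessProbe, so each fragment would replace the httpGet block above rather than be added alongside it.

        # Variant 1: TCP socket probe; succeeds if the socket can be opened
        livenessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 30
          timeoutSeconds: 1

        # Variant 2: exec probe; succeeds if the command exits with status 0
        livenessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 30
          timeoutSeconds: 1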
-- Ian Lewis
Source: StackOverflow