K8s rolling-update does not gracefully switch pods in load balancer

4/22/2017

When I change the image version in web-deployment and trigger a rolling update of a deployment with 1 replica, an additional pod with the new version is created and reaches the Running state. But the live pod is still terminated before the load balancer appears to switch over to the new pod.

Running curl http://<Load Balancer Ingress IP>/mynode.test?[1-2000] shows errors during the switching process.
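To make "shows errors" measurable, here is a sketch of how I count the failed requests. The live loop (commented out, since it needs the cluster and the real Ingress IP) is followed by the counting step demonstrated on canned status codes:

```shell
# Live usage against the cluster (placeholder IP as above):
#   for i in $(seq 1 2000); do
#     curl -s -o /dev/null -w '%{http_code}\n' "http://<Load Balancer Ingress IP>/mynode.test?$i"
#   done | grep -cv '^200$'
#
# The counting step alone, on sample codes (503 and 000 are failures):
printf '200\n503\n200\n000\n200\n' | grep -cv '^200$'
```

`grep -cv '^200$'` counts every line that is not exactly `200`, so the sample prints `2`.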

I tried a few things which might have messed up the timing.

These are my yaml files and the gcloud commands:

gcloud container clusters create "sling-cluster-small" \
  --project "devops-test" \
  --zone "us-central1-a" \
  --machine-type "g1-small" \
  --num-nodes "3" \
  --network "default"

gcloud compute disks create \
  --project "devops-test" \
  --zone "us-central1-a" \
  --size 200GB \
  mongo-disk

db-deployment.yml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mongo-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: mongo
    spec:
      containers:
      - image: mongo
        name: mongo
        ports:
        - name: mongo
          containerPort: 27017
          hostPort: 27017
        volumeMounts:
          - name: mongo-persistent-storage
            mountPath: /data/db
      volumes:
        - name: mongo-persistent-storage
          gcePersistentDisk:
            pdName: mongo-disk
            fsType: ext4
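A side note on this mongo deployment (an assumption about GCE behavior, not something from my logs): a gcePersistentDisk can only be attached read-write to a single node, so the default RollingUpdate strategy can wedge a single-replica deployment like this one, because the new pod cannot attach the disk while the old pod still holds it. A Recreate strategy sidesteps that; a minimal sketch of the spec fragment:

```yaml
# Sketch: for a single-replica deployment backed by a gcePersistentDisk,
# Recreate tears the old pod down before starting the new one, freeing the disk.
spec:
  replicas: 1
  strategy:
    type: Recreate
```

This trades a short downtime for mongo during its own updates, which is unavoidable with one replica on one read-write disk.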

db-service.yml:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: mongo
  name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  selector:
    name: mongo

web-deployment.yml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 1
#  minReadySeconds: 60
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        name: web
    spec:
      containers:
# 0.1.2 and 0.2.2 are the two valid versions
      - image: sandroboehme/sample-app:0.1.2
#      - image: sandroboehme/sample-app:0.2.2
        name: web
        ports:
        - name: http-server
          containerPort: 8080
        readinessProbe:
          httpGet:
            path: /mynode.test
            port: 8080
          initialDelaySeconds: 90
          timeoutSeconds: 30
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              # A workaround to aim for 0% downtime during deploys: flag the app
              # as not ready, then give the load balancer time to stop routing here.
              command: ["sh", "-c", "curl http://localhost:8080/mynode.test?ready=false -u admin:admin && sleep 15"]
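For comparison, the strategy above pins maxUnavailable to 0 but leaves maxSurge and minReadySeconds at their defaults. A sketch of the combination I believe is usually needed for zero-downtime rollouts behind a LoadBalancer service (the 60-second value is a guess that would need tuning to the load balancer's endpoint-propagation delay, not a verified number):

```yaml
spec:
  replicas: 1
  minReadySeconds: 60    # keep the old pod until the new one has been Ready for a while
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # start the new pod before touching the old one
      maxUnavailable: 0  # never drop below the desired replica count
```

Together with the preStop hook's sleep, this should give the load balancer a window to route to the new pod before the old one receives SIGTERM.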

web-service.yml:

apiVersion: v1
kind: Service
metadata:
  name: web
  labels:
    name: web
spec:
#  type: NodePort
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
#      nodePort: 30000
  selector:
    name: web

How can one handle this? Any help is much appreciated. Thanks for taking the time to look into this!

-- Sandro