I am trying to achieve a zero-downtime deployment process, but it is not working.
My Deployment has one replica. The pod probes look like this:
livenessProbe:
  httpGet:
    path: /health/live
    port: 80
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /health/ready
    port: 80
  initialDelaySeconds: 15
  periodSeconds: 20

During deployment, accessing the pod returns 503 for at least 10 seconds.
Running describe on the pod I get:
Liveness:  http-get http://:80/health/live delay=5s timeout=1s period=2s #success=1 #failure=3
Readiness: http-get http://:80/health/ready delay=5s timeout=1s period=2s #success=1 #failure=3

You need to use the RollingUpdate strategy in your Deployment in addition to the probes:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 25%
    maxSurge: 1

There is an interesting end-to-end example here.
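Putting the pieces together, a minimal single-replica Deployment might look like the sketch below. The image name is hypothetical; with replicas: 1, maxUnavailable: 25% rounds down to 0, so it is written out explicitly here, and the preStop sleep (which assumes the image ships a sleep binary) is just an optional buffer that gives the endpoints controller time to remove the old pod before it shuts down:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: maintenance-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: maintenance-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take the only pod down before its replacement is Ready
      maxSurge: 1         # start the new pod alongside the old one
  template:
    metadata:
      labels:
        app: maintenance-api
    spec:
      containers:
        - name: maintenance-api
          image: maintenance-api:1.0.0   # hypothetical image
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 20
          livenessProbe:
            httpGet:
              path: /health/live
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 20
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "5"]   # small grace window while endpoints update; assumes sleep exists in the image

This way the new pod must pass its readiness probe before the old one is terminated, so the Service always has at least one Ready endpoint.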
The problem was in the Service definition:

kind: Service
spec:
  type: ClusterIP
  selector:
    app: maintenance-api
    version: "1.0.0"
    stage: #{Release.EnvironmentName}#
    release: #{Release.ReleaseName}#

If the selector contains something like #{Release.ReleaseName}#, which changes with every release, the Service can no longer match the old pod: as soon as the release starts, the Service disconnects from the running pod, and only after the new pod finishes deploying does it start routing traffic to it.
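One way to fix it is to keep only labels that are stable within an environment in the selector, and move the release-specific values up to metadata if you still want them for bookkeeping. A sketch, assuming app and stage do not change between releases:

apiVersion: v1
kind: Service
metadata:
  name: maintenance-api
  labels:
    release: #{Release.ReleaseName}#    # fine as an informational label, not as a selector
spec:
  type: ClusterIP
  selector:
    app: maintenance-api                # stable across releases
    stage: #{Release.EnvironmentName}#  # stable within an environment
  ports:
    - port: 80
      targetPort: 80

With a stable selector, the Service keeps routing to the old pod until the new one becomes Ready, which is what makes the rolling update seamless.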