I'm trying to create a deployment along with a service and then access the service immediately as soon as the rollout is complete:
> kubectl create -f my-deployment.yaml
> kubectl create -f my-service.yaml
> kubectl rollout status deployment/my-deployment --watch --timeout 10m # This usually takes ~30 seconds
deployment "my-deployment" successfully rolled out
> curl "my-service" # This happens inside a pod, so the service DNS name should be available
Sometimes this works, but there seems to be a race condition: if the curl command runs too soon after the rollout completes, the socket fails to connect and I get a connection timeout.
This seems like the behavior I would get if there were no ready pods, as per this question: What happens when a service receives a request but has no ready pods?
I expected that the completion of the rollout meant that the service was guaranteed to be ready to go. Is this not the case? Is there some Kubernetes command to "wait" for the service to be available? (I notice that services don't have conditions, so you can't do kubectl wait ...)
My answer is slightly different from what you're asking, but it will probably be useful to you. I propose using Helm to improve your deployment experience.
How can it help you?
Helm has several flags, such as --wait, that can be applied during an install or upgrade. With --wait, Helm waits until all Pods, PVCs, Services, and the minimum number of Pods of a Deployment, StatefulSet, or ReplicaSet are in a ready state before marking the release as successful, rather than just checking that the resources were created.
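A minimal sketch, assuming the two manifests are packaged into a chart at ./my-chart and released as my-release (both names are placeholders):

> helm upgrade --install my-release ./my-chart --wait --timeout 10m

Because Helm only returns once those readiness checks pass (or the timeout expires), the curl that follows starts from a known-good state.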
To know whether a Service is ready, you can check whether an Endpoints object with the same name as the Service exists and whether it contains IP addresses. If IPs are present, the Service has ready backends. Even then there is no guarantee that requests won't fail, because there can still be network issues in your infrastructure.
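A minimal bash sketch of that check, assuming the Service from the question is named my-service: poll the Endpoints object until it reports at least one address, then call the Service.

# Wait until the Endpoints object for my-service lists at least one ready pod IP.
until [ -n "$(kubectl get endpoints my-service -o jsonpath='{.subsets[*].addresses[*].ip}')" ]; do
  sleep 1
done
curl "my-service"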
Kubernetes primitives that manage pods, such as Deployment, take only pod status into account for decision-making, such as advancement during a rolling update.
For example, during a Deployment rolling update, a new pod becomes ready. The service, network policy, and load balancer, however, may not yet be ready for the new pod for whatever reason (e.g. slowness in the API machinery, the endpoints controller, kube-proxy, iptables, or infrastructure programming). This may cause service disruption or loss of backend capacity. In extreme cases, if the rolling update completes before any new replacement pod actually starts serving traffic, this will cause a service outage.
Here is the proposal to improve pod readiness, which is motivated by the above problem.
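For what it's worth, that proposal landed as pod readiness gates: you can list extra condition types in the pod spec that must become True before the pod counts as ready, so an external controller can hold readiness back until, for example, the load balancer or endpoints programming has caught up. A quick way to inspect them (the pod name my-pod is a placeholder):

> kubectl get pod my-pod -o jsonpath='{.spec.readinessGates[*].conditionType}'
> kubectl get pod my-pod -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'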