Google Kubernetes Engine - node upgrade procedure

5/2/2019

What steps does Google follow when upgrading a node or performing maintenance on it? I had assumed it was:

  • Drain node
  • Perform any operation
  • Bring node up again

or

  • Drain node
  • Delete node
  • Bring new node up

But recently, during a node upgrade, some of our pods that weren't replicated died, causing a couple of minutes of downtime. When I checked later, the age of the new pod matched the age of the node. I could also see that the node the pod was running on had changed at the time of the upgrade.

So, does anybody know what procedure Google follows to perform a node upgrade?

-- r1ckr
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

5/2/2019

See https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime

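Rolling updates are the default (the other option is node pool migration). This corresponds to the second sequence in your question: each node is drained, deleted, and replaced by a new node.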

"A rolling update works in the following way. One by one, a node is drained and cordoned so that there are no more pods running on that node. Then the node is deleted, and a new node is created with the updated Kubernetes version. Once that node is up and running, the next node is updated. This goes on until all nodes are updated.

You can let Kubernetes Engine manage this process for you completely by enabling automatic node upgrades on the node pool. One drawback is that you get one less node of capacity in your cluster. This issue is easily solved by scaling up your node pool to add extra capacity, and then scaling it back down once the upgrade is finished. The fully automated nature of the rolling update makes it easy to do, but you have less control over the process. It also takes time to roll back to the old version if there is a problem, as you have to stop the rolling update and then undo it."
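To make the quoted procedure concrete, here is a minimal Python sketch that simulates it on an in-memory model. It is not GKE's implementation and does not talk to a cluster: the `Node` class, the node and pod names, the version strings, and the "least-loaded node wins" placement rule are all assumptions made for illustration. It shows why a pod with a single replica goes down during the upgrade (it is evicted and has to start again on another node) and why the cluster runs with one less schedulable node while each node is replaced.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical in-memory stand-in for a cluster node.
    name: str
    version: str
    schedulable: bool = True
    pods: list = field(default_factory=list)


def rolling_upgrade(nodes, new_version):
    """Replace nodes one at a time.

    Returns (evictions, min_schedulable), where evictions is a list of
    (pod, old_node, new_node) tuples and min_schedulable is the lowest
    number of schedulable nodes seen during the upgrade.
    """
    evictions = []
    min_schedulable = len(nodes)
    for i in range(len(nodes)):
        old = nodes[i]
        old.schedulable = False  # cordon: no new pods are placed here
        others = [n for n in nodes if n.schedulable]
        min_schedulable = min(min_schedulable, len(others))
        if old.pods and not others:
            raise RuntimeError("no schedulable node left for evicted pods")
        for pod in list(old.pods):  # drain: evict every pod on the node
            old.pods.remove(pod)
            # Assumed placement rule: pick the node with the fewest pods.
            target = min(others, key=lambda n: len(n.pods))
            target.pods.append(pod)
            evictions.append((pod, old.name, target.name))
        # Delete the old node and create a replacement on the new version.
        nodes[i] = Node(name=old.name + "-new", version=new_version)
    return evictions, min_schedulable


# Hypothetical three-node pool with two single-replica pods.
nodes = [
    Node("node-a", "1.11", pods=["api"]),
    Node("node-b", "1.11", pods=["worker"]),
    Node("node-c", "1.11"),
]
evictions, min_schedulable = rolling_upgrade(nodes, "1.12")
for pod, src, dst in evictions:
    print(f"{pod}: evicted from {src}, restarted on {dst}")
print(f"lowest number of schedulable nodes: {min_schedulable}")
```

In this model every eviction is a window in which that pod is not running, so a workload with only one replica is unavailable until the replacement pod is up. A pod can also be evicted more than once if it lands on a node that has not been upgraded yet, which is what happens to `api` here. Starting with one extra node in the list is the simulated equivalent of scaling the node pool up before the upgrade.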

-- user2995678
Source: StackOverflow