Whole Kubernetes cluster down after master crash

7/10/2019

A misconfigured master, with unattended updates left on, had its Docker service restarted. The master went down and all nodes became inaccessible via NodePort. After the master restarted, everything came back online. I can't find the reason in the node logs. What could cause this?

-- Jonas
kubernetes

1 Answer

7/10/2019

It could survive, but take the GKE documentation as an example: https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture#master_node (I picked it because I mostly use K8s on GKE). You can see there that the master is responsible for the nodes' network, and that explains your issue: if the nodes try to get information from the master, they won't get a response and your cluster will collapse.

That's also why they advise using a high-availability cluster for production: while your master is being updated, you could have some issues accessing your cluster.
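
To see whether the control plane is actually answering during such an incident, a small probe can help. The following is only a minimal sketch, assuming the API server exposes its `/readyz` health endpoint over HTTPS and allows unauthenticated access to it (this depends on how the cluster is configured). The endpoint address and the names `probe` and `control_plane_status` are hypothetical, made up for illustration.

```python
import http.client
import ssl
import urllib.error
import urllib.request


def probe(url, timeout=3.0, verify_tls=True):
    """Return True if the endpoint answers HTTP 200 within the timeout."""
    ctx = ssl.create_default_context()
    if not verify_tls:
        # Assumption for this sketch only: skip certificate checks.
        # Prefer trusting the cluster CA in real use.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, timeout, TLS failure, non-2xx status, or bad URL.
        return False


def control_plane_status(results):
    """Classify a {endpoint: reachable} mapping as healthy, degraded, or down."""
    up = sum(1 for ok in results.values() if ok)
    if up == 0:
        return "down"
    if up < len(results):
        return "degraded"
    return "healthy"


if __name__ == "__main__":
    # Hypothetical master address; replace with your own API server endpoint(s).
    endpoints = ["https://10.0.0.10:6443/readyz"]
    results = {url: probe(url, verify_tls=False) for url in endpoints}
    print(control_plane_status(results))
```

With a single master the result can only be "healthy" or "down", which matches the situation in the question. With several masters listed, losing one of them would show up as "degraded" instead.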

-- night-gold
Source: StackOverflow