Practical consequences of missing consensus on a Kubernetes cluster?

11/27/2019

What exactly are the practical consequences of missing consensus on a Kubernetes cluster? Or in other words: which functions on a Kubernetes cluster require consensus? What will work, what won't work?

For example (and really only for example):

  • will existing pods keep running?
  • can pods still be scaled horizontally?

Example scenario: A cluster with two nodes loses one node. No consensus possible.

-- stefan.at.wpf
consensus
etcd
google-kubernetes-engine
kubernetes

1 Answer

11/27/2019

Consensus is fundamental to etcd, the distributed database that Kubernetes is built upon. Without consensus you can read from the database but not write to it, e.g. if only 1 of 3 nodes is available.

When you lose quorum, etcd goes into a read-only state where it can respond with data, but no new actions can take place, since the remaining members cannot form a majority to agree on whether an action should be committed.
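To make the quorum arithmetic concrete, here is a minimal sketch. It assumes the usual majority rule (more than half of the voting members must be reachable); the function names `quorum_size` and `has_quorum` are made up for illustration and are not part of etcd or Kubernetes.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if enough members are reachable to commit a write."""
    return healthy >= quorum_size(members)


# (total members, members still reachable)
scenarios = [(2, 1), (3, 1), (3, 2), (5, 3)]
for members, healthy in scenarios:
    print(
        f"{healthy} of {members} members up, "
        f"quorum needs {quorum_size(members)}: "
        f"writes possible = {has_quorum(members, healthy)}"
    )
```

Note that the two-node scenario from the question needs both members for a majority, so losing either one is enough to lose quorum.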

Understanding Etcd Consensus and How to Recover from Failure

Kubernetes is designed so that pods only need Kubernetes for changes, e.g. a deployment. After that they run independently of Kubernetes in a loosely coupled fashion.

Kubernetes is built around keeping the desired state in the etcd database. Controllers then watch for changes to that state and act on them. This means that you cannot scale pods or change their configuration if etcd doesn't have consensus. Kubernetes performs many self-healing operations, but they will not work if etcd is unavailable, since all operations go through the API server and etcd.

Losing quorum means that no new actions can take place. Everything that is running will continue to run until there is a failure.
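The behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how Kubernetes is implemented: `ToyStore`, `ToyCluster`, `NoQuorumError` and their methods are hypothetical names, the store stands in for etcd behind the API server, and "losing a pod" stands in for something like a node failure that needs a controller to create a replacement.

```python
class NoQuorumError(Exception):
    """Raised when the toy store cannot accept a write."""


class ToyStore:
    """Stand-in for etcd: holds desired state, needs a majority to write."""

    def __init__(self, members: int, healthy: int) -> None:
        self.members = members
        self.healthy = healthy
        self.desired_replicas = 0

    def has_quorum(self) -> bool:
        return self.healthy >= self.members // 2 + 1

    def write_desired(self, replicas: int) -> None:
        if not self.has_quorum():
            raise NoQuorumError("write rejected: no quorum")
        self.desired_replicas = replicas


class ToyCluster:
    """Stand-in for the controllers plus the pods that are running."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self.running_pods = 0

    def scale(self, replicas: int) -> None:
        # Scaling is a change to desired state, so it needs a write.
        self.store.write_desired(replicas)
        self.reconcile()

    def reconcile(self) -> None:
        # Assumption of this model: healing has to record new state,
        # so it is skipped entirely while the store has no quorum.
        if not self.store.has_quorum():
            return
        self.running_pods = self.store.desired_replicas

    def lose_pod(self) -> None:
        # A running pod disappears, e.g. because its node failed.
        self.running_pods -= 1
        self.reconcile()


store = ToyStore(members=3, healthy=3)
cluster = ToyCluster(store)
cluster.scale(3)

store.healthy = 1  # two of three store members are gone
scale_rejected = False
try:
    cluster.scale(5)
except NoQuorumError:
    scale_rejected = True
after_quorum_loss = cluster.running_pods  # existing pods keep running

cluster.lose_pod()
after_pod_loss = cluster.running_pods  # the lost pod is not replaced

store.healthy = 2  # quorum is back
cluster.reconcile()
after_recovery = cluster.running_pods  # healing resumes

print(scale_rejected, after_quorum_loss, after_pod_loss, after_recovery)
```

In this model the pods that were already running are untouched by the loss of quorum, the scale request is rejected, and a pod lost in the meantime is only replaced once quorum returns.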

Understanding Distributed Consensus in etcd and Kubernetes

-- Jonas
Source: StackOverflow