Hi, I am managing a Kubernetes cluster that was created with Kops. I recently updated Kubernetes to 1.9.9 and ran a rolling-update on the cluster. All seemed to go well, except that a new, annoying problem arose:
My Kubernetes API server becomes unavailable sporadically and unpredictably. Sometimes it is inaccessible for a couple of minutes, sometimes for half an hour. Then, if I try again, it works as if nothing had happened. This makes it quite hard to work with the cluster. Does anyone know why these error messages keep popping up?
Unable to connect to the server: net/http: TLS handshake timeout
and
The connection to the server api.xxx.io was refused - did you specify the right host or port?
I posted this same question on the kubernetes/kops GitHub repository, where @justinsb pointed me in the right direction. He advised me to SSH into the master node and run docker ps to check whether all containers were stable.
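For anyone who wants to repeat that check, here is a rough sketch of what it can look like. It assumes the API server container has "apiserver" somewhere in its name, and that you can log in to the master over SSH (the "admin" user in the comment is an assumption and depends on the node image). The two function names are just illustrative helpers, not part of Docker or Kops.

```shell
# Run on the master node after connecting, for example:
#   ssh admin@api.xxx.io    # "admin" is an assumed login user

# List all containers, including exited ones, whose name contains "apiserver"
# (the name filter is an assumption about how the container is named).
# A very short uptime or an "Exited" status points to a crash loop.
list_apiserver_containers() {
  docker ps -a --filter "name=apiserver" --format '{{.ID}}\t{{.Status}}\t{{.Names}}'
}

# Print the last 50 log lines of a container, given an ID from the listing above.
show_container_logs() {
  docker logs --tail 50 "$1"
}
```

Call list_apiserver_containers first, then pass one of the printed IDs to show_container_logs to see why the container exited.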
Doing this, I found that my apiserver container was constantly crashing. That led me to conclude that my master node probably didn't have enough compute power to handle all the requests. I changed the master EC2 instance type from t2.micro to t2.medium and the problem disappeared.
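In case it helps, this is roughly how the instance type change can be made with Kops. Treat it as a sketch under assumptions: the function name resize_master is made up for illustration, the cluster name and master instance group name are placeholders you pass in yourself, and KOPS_STATE_STORE is assumed to be set already.

```shell
# Hypothetical helper: resize_master <cluster-name> <master-instance-group-name>
resize_master() {
  cluster="$1"
  ig="$2"

  # Show the instance groups so the master group's name can be confirmed.
  kops get ig --name "$cluster" || return 1

  # Opens the instance group spec in an editor; change
  # spec.machineType from t2.micro to t2.medium and save.
  kops edit ig "$ig" --name "$cluster" || return 1

  # Apply the changed configuration to the cloud resources.
  kops update cluster --name "$cluster" --yes || return 1

  # Replace the running instances so the master comes up with the new type.
  kops rolling-update cluster --name "$cluster" --yes
}
```

The rolling update restarts the master, so the API server will be unreachable for a short while during the change.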