I am trying to write a bash script to troubleshoot a Kubernetes cluster
I have a Kubernetes cluster with one master node and a few minions.
I am trying to write a script that troubleshoots the cluster and finally outputs a detailed report listing errors (if any) and successes.
Q.) What I really need to know is which steps I should follow and which components I should be testing/checking while troubleshooting. I need a list of procedures or steps (maybe in bullet form) on how to troubleshoot a Kubernetes cluster manually.
PS: I don't want to use Kubernetes' built-in testing mechanism; I need the cluster to be tested/troubleshot manually.
Can anyone here give me a good, descriptive mechanism or set of steps?
You need to check the following services on the master to confirm that Kubernetes is functioning properly:
docker should be running
kubelet should be running (if you run control-plane components in containers)
etcd should be running
the Kubernetes scheduler should be running
the Kubernetes controller manager should be running
Kubernetes component health should be reported as healthy (kubectl get cs)
all pods in kube-system should be running (kubectl get pods -n kube-system)
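The checks above could be strung together into a report script roughly like the sketch below. It is only a starting point and rests on several assumptions that are not in the original post: docker and kubelet are managed by systemd, kubectl is already configured on the master, and etcd, the scheduler and the controller manager run as kube-system pods labelled component=<name> (as kubeadm does). The function names check, component_running, component_statuses_ok, kube_system_pods_ok, print_report and main are made up for illustration. Also note that kubectl get cs has been deprecated since Kubernetes v1.19, so that check may be unreliable on newer clusters.

```shell
#!/usr/bin/env bash
# Sketch of a manual health report for a Kubernetes master node.
# Assumptions: systemd manages docker/kubelet, kubectl is configured, and
# control-plane pods carry the kubeadm-style label component=<name>.

PASSED=()
FAILED=()

# check NAME COMMAND [ARGS...]: run the command quietly and record the result.
check() {
    local name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        PASSED+=("$name")
    else
        FAILED+=("$name")
    fi
}

# Succeeds if at least one kube-system pod with component=$1 is Running.
component_running() {
    kubectl get pods -n kube-system -l "component=$1" --no-headers 2>/dev/null |
        awk '$3 == "Running" { found = 1 } END { exit !found }'
}

# Succeeds if every entry of `kubectl get cs` reports Healthy.
component_statuses_ok() {
    local cs
    cs=$(kubectl get cs --no-headers 2>/dev/null) || return 1
    [ -n "$cs" ] || return 1
    awk '$2 != "Healthy" { bad = 1 } END { exit bad + 0 }' <<<"$cs"
}

# Succeeds if every pod in kube-system is Running or Completed.
kube_system_pods_ok() {
    local pods
    pods=$(kubectl get pods -n kube-system --no-headers 2>/dev/null) || return 1
    [ -n "$pods" ] || return 1
    awk '$3 != "Running" && $3 != "Completed" { bad = 1 } END { exit bad + 0 }' <<<"$pods"
}

# Print one line per check, followed by a summary count.
print_report() {
    local item
    echo "=== Kubernetes master health report ==="
    for item in "${PASSED[@]}"; do echo "OK    $item"; done
    for item in "${FAILED[@]}"; do echo "FAIL  $item"; done
    echo "${#PASSED[@]} passed, ${#FAILED[@]} failed"
}

main() {
    check "docker is running"              systemctl is-active --quiet docker
    check "kubelet is running"             systemctl is-active --quiet kubelet
    check "etcd pod is running"            component_running etcd
    check "scheduler pod is running"       component_running kube-scheduler
    check "controller manager is running"  component_running kube-controller-manager
    check "component statuses are healthy" component_statuses_ok
    check "kube-system pods are running"   kube_system_pods_ok
    print_report
}

# Run the checks only when executed directly, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
    main "$@"
fi
```

Run it on the master as a user who can call systemctl and kubectl. Each check only records pass or fail, so for a more detailed report you would probably want to capture the command output for the failed checks as well. If the control-plane components run as systemd services instead of pods, the component_running calls could be swapped for systemctl is-active checks.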