I have set up Kubernetes with one master and two workers, but I am facing an issue.
I created an Apache pod, and the scheduler automatically deployed it on worker1. It works fine. When I stop the worker1 machine, the pod should be recreated on worker2. The problem is that it takes around 7 minutes to come online on worker2.
Is there any way to fail the pod over without any downtime?
There will be some downtime unless you run multiple replicas of the Apache pod and have a Kubernetes Service forwarding traffic to them in your cluster. This is generally the recommended architecture for HTTP/TCP services.
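As a rough sketch of that layout, the Python below builds a Deployment with two Apache replicas and a Service in front of them, then prints both as a JSON list that can be piped to `kubectl apply -f -`. The name `apache`, the image `httpd:2.4`, port 80, and the replica count are assumptions for illustration, not details from the question.

```python
import json


def apache_manifests(name="apache", replicas=2, image="httpd:2.4"):
    """Return a Deployment and a Service as plain dicts.

    All names, the image, and the port are illustrative assumptions.
    """
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # The Service forwards to every ready pod carrying these labels.
        "spec": {"selector": labels, "ports": [{"port": 80, "targetPort": 80}]},
    }
    return [deployment, service]


if __name__ == "__main__":
    bundle = {"apiVersion": "v1", "kind": "List", "items": apache_manifests()}
    print(json.dumps(bundle, indent=2))
```

With two replicas, the Service keeps routing to the surviving pod while the other one is rescheduled. The scheduler usually spreads replicas across nodes, but that is not guaranteed unless you add anti-affinity or topology spread constraints.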
However, if you need faster response you could tweak:
--node-status-update-frequency on the kubelet. Defaults to 10 seconds.
--kubelet-timeout on the kube-apiserver. Defaults to a low 5 seconds.
--node-monitor-period on the kube-controller-manager. Defaults to 5 seconds.
--node-monitor-grace-period on the kube-controller-manager. Defaults to 40 seconds.
--pod-eviction-timeout on the kube-controller-manager. Defaults to 5 minutes.
You can try something like this:
--node-status-update-frequency=4s (from 10s)
--node-monitor-period=2s (from 5s)
--node-monitor-grace-period=16s (from 40s)
--pod-eviction-timeout=30s (from 5m)
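To see roughly what these values buy you, here is a small back-of-the-envelope calculation in Python. It assumes the delay before pods are evicted from a dead node is approximately the node monitor grace period plus the pod eviction timeout. It ignores scheduling, image pull, and container start time, so treat the result as an estimate only. The helper names are made up for this example.

```python
def parse_seconds(value):
    """Convert a duration such as '40s' or '5m' to seconds (only s and m are handled)."""
    units = {"s": 1, "m": 60}
    return int(value[:-1]) * units[value[-1]]


def estimated_failover_seconds(grace_period, eviction_timeout):
    """Assumed model: time to mark the node unhealthy plus time before its pods are evicted."""
    return parse_seconds(grace_period) + parse_seconds(eviction_timeout)


defaults = estimated_failover_seconds("40s", "5m")   # 40 + 300
tuned = estimated_failover_seconds("16s", "30s")     # 16 + 30

print(f"defaults: {defaults}s, tuned: {tuned}s")
# prints "defaults: 340s, tuned: 46s"
```

Under this model the defaults already account for well over five minutes of the delay. The rest of the observed 7 minutes would come from detection jitter, scheduling, and pod startup.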