How to run a script when a node goes down on Kubernetes?

10/23/2018

I have a question about keeping a Kubernetes cluster online as much as possible. Usually, the cluster would sit behind some kind of cloud load balancer that runs health checks and directs traffic to the available nodes.

Now, my hosting provider does not offer managed load balancers. Instead, they have a configurable so-called "failover" IP which can be detached and re-attached to another server by running a command on the command line. It's not a real failover IP in the traditional sense; it's more like a movable IP.

As a beginner to Kubernetes, I'm not sure how to go about this.

Basically, I'd need to run a script that checks whether the cluster is still publicly reachable on that IP. When it goes down, one of the nodes should run the script to detach the failover IP and re-attach it to itself or to one of the other nodes.

Extra complication: The moving of the failover IP takes around 40-60 seconds to take effect, so we should not run the script too often.
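To make the idea concrete, here is a minimal sketch of what I have in mind, not a working solution. It probes the public URL, only acts after several consecutive failures, and then refuses to move the IP again until a cooldown has passed, since the move takes 40-60 seconds to take effect. The names `FailoverGuard`, `is_publicly_reachable` and `watch`, the thresholds, and the `move_command` argument are all made up for illustration; the real command would be whatever the hosting provider supplies. It also assumes only one instance of the script is running at a time.

```python
import subprocess
import time
import urllib.request


class FailoverGuard:
    """Decides when a move of the failover IP is warranted (hypothetical helper)."""

    def __init__(self, failures_needed=3, cooldown_seconds=120.0, clock=time.monotonic):
        self.failures_needed = failures_needed    # consecutive failed probes before acting
        self.cooldown_seconds = cooldown_seconds  # minimum time between two moves
        self.clock = clock                        # injectable so the logic can be tested
        self.consecutive_failures = 0
        self.last_move = None

    def record(self, healthy):
        """Record one probe result; return True if the IP should be moved now."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_needed:
            return False
        now = self.clock()
        if self.last_move is not None and now - self.last_move < self.cooldown_seconds:
            # A previous move may still be taking effect; do not move again yet.
            return False
        self.last_move = now
        self.consecutive_failures = 0
        return True


def is_publicly_reachable(url, timeout=5.0):
    """Return True if an HTTP GET to url succeeds within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


def watch(url, move_command, interval_seconds=10.0):
    """Probe url forever and run move_command (an argument list) when needed."""
    guard = FailoverGuard()
    while True:
        if guard.record(is_publicly_reachable(url)):
            # move_command stands in for the provider's own CLI call.
            subprocess.run(move_command, check=True)
        time.sleep(interval_seconds)
```

It would be started with something like `watch("http://<failover-ip>/", ["<provider-command>", "<target-node>"])`, where both arguments are placeholders. What I don't know is where such a script should live in a Kubernetes setup, and how to make sure that exactly one healthy node runs it.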

This also means that only 1 node is attached to the public IP and all traffic to the cluster will come in this way. There is no load balancer distributing traffic among the online nodes. Will Kubernetes send the request on its own to the other nodes internally? I imagine so?

The cluster consists of 3 identical servers: 1 master and 2 workers. I'd set up load balancing inside the cluster with Ingress.

The goal is to keep websites running on k8s up as much as possible, while working with the limited options our hosting company offers. The hosting company only offers dedicated bare-metal machines and this movable IP. They don't have managed load balancers like AWS or DigitalOcean have, so I need to find a solution for that. This movable IP looks like an answer, but if you know a better way, I'm open to it.

All 3 machines have a public IP and a private IP. There is 1 extra public IP that can be moved to 1 of the 3 nodes (using this I want to achieve failover, unless you know a better way?).

Personally, I don't think I need a multi-master cluster. As I understand it, the master can go down for short periods of time, and during those periods the cluster is more vulnerable, but this is okay as long as we can fix the master in a timely manner. The only thing is that we need to move this IP over to an online node, and I'm not sure how to trigger this.

Thanks

-- Wouter
failover
kubernetes

0 Answers