Enforcing Ingress to return 503 during maintenance

1/15/2019

We run some of our API services on Google Kubernetes Engine, and from time to time we need to perform maintenance. During that window we want the API service to return 503 together with a configurable message about the downtime.

Having the API service itself return 503 from the Kubernetes deployments it links to would not be reliable, as the API pods might need to be taken down or restarted.

One idea I had was to have a dedicated deployment/pod that we would point the service at, and that just returns 503 with some information about the maintenance. However, this approach would not hold during a cluster upgrade, as there might be some time when that deployment/pod is also unavailable.

So is there some way to do this without having to rely on a deployment/pod? Meaning a configuration that is outside the scope of the specific Kubernetes cluster?

-- pjotr_dolphin
google-kubernetes-engine
kubernetes
kubernetes-ingress

2 Answers

1/18/2019

You can use an ingress and 2 deployments (the API and the 503), and whenever you want to switch, you just edit the ingress:

kubectl edit ingress ${INGRESS_NAME}
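
For the 503 deployment itself, a minimal sketch could look like the following, assuming a plain Python container with only the standard library. The environment variable names MAINTENANCE_MESSAGE and RETRY_AFTER_SECONDS and the /healthz path are assumptions made for illustration, not anything GKE defines. The health path answers 200 because load balancer health checks generally expect a successful response; a backend that only ever returns 503 would likely be marked unhealthy.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical settings: these variable names are assumptions for this sketch.
MESSAGE = os.environ.get("MAINTENANCE_MESSAGE", "Service is down for maintenance.")
RETRY_AFTER = os.environ.get("RETRY_AFTER_SECONDS", "3600")
HEALTH_PATH = "/healthz"  # assumed health-check path, adjust to your setup


class MaintenanceHandler(BaseHTTPRequestHandler):
    def _respond(self):
        # Health checks get 200 so the backend is kept in rotation.
        if self.path == HEALTH_PATH:
            status, body = 200, b"ok"
        else:
            status, body = 503, MESSAGE.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        if status == 503:
            # Tells clients how long to wait before retrying.
            self.send_header("Retry-After", RETRY_AFTER)
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)

    # Every common method gets the same answer.
    do_GET = do_HEAD = do_POST = do_PUT = do_PATCH = do_DELETE = _respond


def make_server(port=8080):
    """Build the server without starting it."""
    return HTTPServer(("0.0.0.0", port), MaintenanceHandler)


def main():
    make_server(int(os.environ.get("PORT", "8080"))).serve_forever()
```

Calling main() as the container entry point starts the server; the message can then be changed through the environment without rebuilding the image.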

In GKE, you can achieve a zero-downtime upgrade if you use a regional cluster (3 masters), at least 2 node pools, and node affinity that places pods on nodes in different node pools. That way, whenever you upgrade just one node pool using the --node-pool flag, there is still a pod running on the other node pool.

You can also refer to the Kubernetes best practices document to read more about a zero downtime upgrade.

-- ozrlz
Source: StackOverflow

1/15/2019

If you set this configuration at the Kubernetes level, it will never hold during a cluster upgrade. You have to rely on an external solution, for example hosting a simple function on Cloud Functions. You can also achieve it with AWS CloudFormation, for example.
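
As a rough sketch of the Cloud Functions option, assuming the Python runtime: an HTTP function receives the Flask request object and may return a (body, status, headers) tuple. The function name maintenance and the environment variable names are hypothetical, chosen only for this example.

```python
import os


def maintenance(request):
    """HTTP entry point; `request` is the Flask request passed by the runtime.

    The request is ignored because every call gets the same 503 answer.
    """
    # Hypothetical variable names, set at deploy time.
    message = os.environ.get("MAINTENANCE_MESSAGE", "Service is down for maintenance.")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Retry-After": os.environ.get("RETRY_AFTER_SECONDS", "3600"),
    }
    return (message, 503, headers)
```

Because this runs outside the cluster, it keeps answering while nodes or masters are being upgraded; traffic still has to be pointed at it, which is the switch discussed next.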

You could switch between the healthy application and the maintenance message in your DNS, but I would not rely on that, since the change can take some time to propagate depending on your TTL config. I'd prefer to do it in your load balancer.

-- Jonathan Beber
Source: StackOverflow