Recovering/retrying in case of failed or stuck HTTP requests

3/11/2019

I have a Java-based server managed by a Kubernetes cluster. It's a distributed environment where the number of instances is set to 4 to handle millions of requests per minute.

The issue I am facing is that Kubernetes tries to balance the cluster and, in the process, kills a pod and moves it to another node, but pending HTTP GET and POST requests get lost.

Is there a Kubernetes feature or an architectural solution that would let me retry a request if it is stuck or has failed?

UPDATE:

I have two configurations for the Kubernetes service:

LoadBalancer (with AWS ELB): for external-facing traffic
ClusterIP: for the internal microservice-based architecture

-- Vishrant

grizzly

jakarta-ee

java

kubernetes

server

2 Answers

3/11/2019

Kubernetes gives you the means to handle pod terminations gracefully via SIGTERM and preStop hooks. There are several articles on this, e.g. Graceful shutdown of pods with Kubernetes. In your Java app, you should listen for SIGTERM and shut the server down gracefully (most HTTP frameworks have this "shutdown" functionality built in).
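As a rough illustration of the SIGTERM handling described above, here is a minimal sketch. It uses the JDK's built-in com.sun.net.httpserver.HttpServer rather than Grizzly, purely to stay self-contained. The class name GracefulServer, the /ping path, port 8080, the pool size, and the 20-second grace period are all assumptions made for the example. The JVM runs registered shutdown hooks when it receives SIGTERM, and HttpServer.stop(delay) closes the listening socket and then blocks until in-flight exchanges complete or the delay elapses.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GracefulServer {
    private final HttpServer server;
    private final ExecutorService workers;

    public GracefulServer(int port) throws IOException {
        // Port 0 asks the OS for any free port (handy for local experiments).
        server = HttpServer.create(new InetSocketAddress(port), 0);
        workers = Executors.newFixedThreadPool(8); // pool size is an arbitrary choice
        server.setExecutor(workers);
        // Hypothetical endpoint, only here so the server has something to serve.
        server.createContext("/ping", exchange -> {
            byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
    }

    public void start() {
        server.start();
    }

    public int port() {
        return server.getAddress().getPort();
    }

    /** Stops accepting new connections, then waits up to graceSeconds for in-flight exchanges. */
    public void shutdown(int graceSeconds) {
        server.stop(graceSeconds); // blocks until handlers finish or the delay elapses
        workers.shutdown();        // lets the worker threads exit
    }

    public static void main(String[] args) throws IOException {
        GracefulServer app = new GracefulServer(8080);
        // The JVM runs shutdown hooks on SIGTERM, which is the signal Kubernetes
        // sends to the container before it eventually sends SIGKILL.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> app.shutdown(20)));
        app.start();
    }
}
```

The grace period passed to shutdown should stay below the pod's terminationGracePeriodSeconds (30 seconds by default), otherwise the container may be killed before the drain completes.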

The issue I am facing is that Kubernetes tries to balance the cluster and, in the process, kills a pod and moves it to another node

Now this sounds a little suspicious. In general, K8s only evicts and reschedules pods on different nodes under specific circumstances, for example when a node is running out of resources to serve the pod. If your pods are frequently getting rescheduled, this is generally a sign that something else is happening, so you should probably determine the root cause. If you have resource limits set in your deployment spec, make sure your service container isn't exceeding them; this is a common problem with JVM containers.

Finally, HTTP retries are inherently unsafe for non-idempotent requests (POST and PATCH; PUT is defined as idempotent, but only if the server implements it that way), so you can't just retry any failed request without knowing the logical implications. In any case, retries generally happen on the client side, not the server side, so there is no flag you can set in K8s to enable them.
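To make the client-side point concrete, here is a minimal sketch of a retry helper that only retries idempotent methods and backs off exponentially between attempts. The class name IdempotentRetry, its method names, and its parameters are hypothetical. The set of idempotent methods follows the HTTP semantics in RFC 9110, under which PUT counts as idempotent and POST and PATCH do not; whether a given endpoint really behaves that way is something only the service owner can confirm.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.Callable;

public class IdempotentRetry {
    // Common methods that HTTP defines as idempotent; POST and PATCH are deliberately absent.
    private static final Set<String> IDEMPOTENT =
            Set.of("GET", "HEAD", "OPTIONS", "PUT", "DELETE");

    public static boolean isRetryable(String method) {
        return IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT));
    }

    /**
     * Runs the request, retrying on IOException only when the method is idempotent.
     * Non-idempotent methods get exactly one attempt.
     */
    public static <T> T call(String method, int maxAttempts, long baseDelayMillis,
                             Callable<T> request) throws Exception {
        int attempts = isRetryable(method) ? Math.max(1, maxAttempts) : 1;
        IOException last = null;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // e.g. connection reset because the pod went away
                if (attempt < attempts) {
                    // Exponential backoff: base, 2x base, 4x base, ...
                    Thread.sleep(baseDelayMillis << (attempt - 1));
                }
            }
        }
        throw last; // every permitted attempt failed
    }
}
```

With java.net.http.HttpClient (Java 11+), a call might look like IdempotentRetry.call("GET", 3, 200, () -> client.send(request, HttpResponse.BodyHandlers.ofString())), where client and request are objects you have already built. Note that this sketch retries only on I/O failures; it does not inspect the HTTP status code, so a 5xx response is returned to the caller as-is.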

-- PoweredByOrange

Source: StackOverflow

3/12/2019

A service mesh solves the particular issue that you are facing.

There are different service meshes available. General features of a service mesh are:

Load balancing
Fine-grained traffic policies
Service discovery
Service monitoring
Tracing
Routing

Service Mesh

Istio
Envoy
Linkerd

Linkerd: https://linkerd.io/2/features/retries-and-timeouts/

-- Shashank Sinha

Source: StackOverflow
