I have two servers. The first one receives a request from the client and makes another HTTP request to the second server. When the second server responds, the first server does some additional work and then responds to the client. Since I don't want the client to wait too long, I have set a 5-second timeout on the request between the servers.
When I check the second server's latency, the 99th-percentile total latency is 124 ms with no spikes. Memory and CPU usage are both at around half of their limits, so I assume the second server is healthy.
The problem is that the first server logs a timeout on the inter-server request about every second. Could this be caused by a connection limit or something similar? I cannot figure out why I am getting these timeouts.
Both servers are Node.js applications running in Kubernetes containers, and I am using Axios for the inter-server request.
If it is related to connection limits, would running more pods help?
The throughput for both servers is around 2k requests/min.