Troubleshooting 500 Internal Server Error in Kubernetes

9/20/2021

I have an application running on Azure Kubernetes. Everything was working fine and the API returned 200 responses all the time, but last week I started receiving 500 Internal Server Error responses from API Management, which indicated that it's a backend error. I ran the server locally and sent requests to the API and it worked, so I figured the problem happens somewhere in Azure Kubernetes.

However, the logs were super cryptic and didn't add much info, so I never really found out what the problem was. I just ran my code to deploy the image again and that fixed it, but nothing in the logs would have told me that redeploying was the fix.

This time I managed to fix the problem, but I'm looking for a better way to troubleshoot 500 Internal Server Error responses in Azure. I have looked all through the Azure documentation but haven't found anything other than the logs, which weren't really helpful in my case. How do you usually go about troubleshooting 500 errors in applications running in Kubernetes?

-- Wiz
api
azure
kubernetes

1 Answer

9/20/2021

In general, it depends on the specific situation you are dealing with. Nevertheless, you should always start by looking at the logs (application event logs and server logs) and search them for information about the error. Error 500 is the effect, not the cause: if you want to find out what caused it, the logs are where to look. Often you can tell what went wrong and fix the problem right away.
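
As a rough illustration of that first pass over the logs, here is a minimal Python sketch that pulls out suspicious lines plus a little surrounding context. The pattern list, the function name find_error_lines, and the sample log are all assumptions made up for this example, so adjust them to your application's real log format. The log text itself could come from something like kubectl logs <pod> --previous, which prints the logs of the previous container instance if the pod restarted.

```python
import re

# Markers that often indicate a failure. This list is an assumption;
# tune it to whatever your application actually writes.
ERROR_PATTERN = re.compile(
    r"\b(ERROR|CRITICAL|FATAL|Traceback|Exception)\b|(?:^|\s)500(?:\s|$)"
)


def find_error_lines(log_text, context=2):
    """Return (line_number, line) pairs for matching lines and their neighbours."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    # Line numbers are 1-based so they match what an editor shows.
    return [(i + 1, lines[i]) for i in sorted(keep)]


# Made-up sample log, only here to make the sketch runnable.
SAMPLE_LOG = """\
2021-09-20 10:00:01 INFO GET /health 200
2021-09-20 10:00:02 INFO GET /items 200
2021-09-20 10:00:03 ERROR database connection refused
2021-09-20 10:00:03 INFO GET /items 500
2021-09-20 10:00:04 INFO GET /health 200
2021-09-20 10:00:05 INFO GET /health 200
"""

if __name__ == "__main__":
    for number, line in find_error_lines(SAMPLE_LOG, context=1):
        print(f"{number}: {line}")
```

The point of keeping context lines is that the line reporting the 500 is rarely the one that explains it; the cause (here, the refused database connection) usually sits a line or two earlier.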

If you want to reproduce the problem, see David Maze's comment:

I generally try to figure out what triggers the error, reproduce it in a local environment (not Kubernetes, not Docker, no containers at all), debug, write a regression test, fix the bug, get a code review, redeploy. That process isn't especially unique to Kubernetes; it's the same way I'd debug an error in a customer environment where I don't have direct access to the remote systems, or in a production environment where I don't want to risk breaking things further.
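
To make the "write a regression test" step concrete, here is a minimal Python sketch using the standard unittest module. It is not taken from the comment above: handle_request, the payload shape, and the bug (a missing "name" key that used to surface as a 500) are all hypothetical stand-ins for whatever you reproduce locally.

```python
import json
import unittest


def handle_request(raw_body):
    """Hypothetical request handler returning (status_code, body)."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "invalid JSON"}
    if not isinstance(payload, dict) or "name" not in payload:
        # In this imagined scenario, the unfixed code raised KeyError here,
        # which the server turned into a 500.
        return 400, {"error": "missing 'name'"}
    return 200, {"greeting": f"Hello, {payload['name']}"}


class MissingNameRegressionTest(unittest.TestCase):
    """Pins the fixed behaviour so the same 500 cannot silently return."""

    def test_missing_name_returns_400(self):
        status, body = handle_request('{"other": 1}')
        self.assertEqual(status, 400)
        self.assertIn("error", body)

    def test_valid_request_still_works(self):
        status, body = handle_request('{"name": "Wiz"}')
        self.assertEqual(status, 200)
        self.assertEqual(body["greeting"], "Hello, Wiz")
```

Saved to a file, this can be run with python -m unittest followed by the file name. The test encodes the exact input that triggered the error, so it fails on the old code and passes once the fix is in.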

-- Mikołaj Głodziak
Source: StackOverflow