Troubleshooting a removed Kubernetes pod

12/25/2018

I have a problem with a Spark application on Kubernetes. The Spark driver tries to create an executor pod, and the executor pod fails to start. The problem is that as soon as the pod fails, the Spark driver removes it and creates a new one, which then fails for the same reason. This seems to be the default Spark behavior on Kubernetes, so how can I recover logs from already-removed pods? I am not able to catch the pods manually since the removal is instantaneous. I have to wonder how I am ever supposed to fix the failing pod issue if I cannot recover the errors.

-- Alex Pryiomka
apache-spark
kubernetes

1 Answer

12/27/2018

In your case it would be helpful to implement cluster-level logging. Even if a pod gets restarted or deleted, its logs will remain in the log aggregator's storage.

There is more than one solution for cluster logging, but the most popular is the EFK stack (Elasticsearch, Fluentd, Kibana).

Actually, you can even go without Elasticsearch and Kibana. Check out the excellent article Application Logging in Kubernetes with fluentd by Rosemary Wang, which explains how to configure fluentd to write the aggregated logs to the fluentd pod's own stdout (a sketch of such a configuration follows the command below) and read them later using:

kubectl logs <fluentd pod>
-- VAS
Source: StackOverflow