How to unstick a Pod from the "Pending" state caused by errors during the image pull stage

3/12/2019

I have a program that launches a job using the Job controller. Due to some ops configuration errors, the Pod cannot pull its image from the registry.

I initially expected the Pod's status to go to "Failed", but it turns out it stays stuck in the "Pending" state forever. I thought I might be able to implement a livenessProbe in the main container and set the timeout there, but since my init container is still waiting with reason ImagePullBackOff, the probe never even starts.
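To make the stuck state concrete, here is a minimal sketch of what detecting it by hand might look like. It assumes the Pod has been dumped as JSON (for example with `kubectl get pod <name> -o json`) and loaded into a dict. The function name `stuck_on_image_pull`, the sample pod data, and the set of waiting reasons checked are my own illustrative assumptions, not an exhaustive or official list.

```python
# Waiting reasons treated as image pull problems (assumed set, not exhaustive).
IMAGE_PULL_REASONS = {"ImagePullBackOff", "ErrImagePull"}


def stuck_on_image_pull(pod):
    """Return the names of containers in a Pending pod that are waiting on an image pull."""
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return []
    stuck = []
    # Init containers and regular containers report status in separate lists.
    for key in ("initContainerStatuses", "containerStatuses"):
        for cs in status.get(key) or []:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in IMAGE_PULL_REASONS:
                stuck.append(cs["name"])
    return stuck


# Hypothetical, trimmed-down pod object mirroring the situation described above:
# the init container cannot pull its image, so the main container never starts.
example_pod = {
    "status": {
        "phase": "Pending",
        "initContainerStatuses": [
            {"name": "init-setup",
             "state": {"waiting": {"reason": "ImagePullBackOff"}}},
        ],
        "containerStatuses": [
            {"name": "main",
             "state": {"waiting": {"reason": "PodInitializing"}}},
        ],
    }
}

print(stuck_on_image_pull(example_pod))
```

The phase stays "Pending" the whole time, so the image pull problem is only visible in the per-container waiting reason.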

Without inspecting the Pod myself, is there a way to make the Pod fail when it cannot pull the image, e.g. by setting a timeout or a retry limit?

--
kubernetes

0 Answers