When starting a Pod whose container image has many or large layers, it can take more than 2 minutes on my cluster's machines (slow single-thread performance coupled with 7200 rpm spinning rust means slow untar/gunzip speeds).
This means Kubernetes gives up on that container with "context deadline exceeded", then retries. Left running overnight (by accident), it runs out of disk as the failed attempts pile up.
Example pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-large-container-1
spec:
  containers:
  - name: X
    image: X:latest
    stdin: true
    tty: true
    command: ["bash"]
Is there a field in the PodSpec that I missed, or a configuration option for the kubelet itself, that raises this timeout?
Events seen:
2018-04-10 13:01:22 -0700 PDT 2018-04-10 13:01:22 -0700 PDT 1 test-large-container-1.15242b927c24ec40 Pod Normal Scheduled default-scheduler Successfully assigned test-large-container-1 to node1
2018-04-10 13:01:29 -0700 PDT 2018-04-10 13:01:29 -0700 PDT 1 test-large-container-1.15242b942c41e77f Pod spec.initContainers{map} Normal Pulling kubelet, node1 pulling image "X:latest"
2018-04-10 13:01:30 -0700 PDT 2018-04-10 13:01:30 -0700 PDT 1 test-large-container-1.15242b948764b21a Pod spec.initContainers{map} Normal Pulled kubelet, node1 Successfully pulled image "X:latest"
2018-04-10 13:03:30 -0700 PDT 2018-04-10 13:03:30 -0700 PDT 1 test-large-container-1.15242bb0780e06ee Pod spec.initContainers{map} Warning Failed kubelet, node1 Error: context deadline exceeded
I think initContainers run before the primary containers are even docker pull-ed, so it may be worth trying docker:latest, volume mounting the host's /var/run/docker.sock, and then using the initContainer to pull the image.
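A rough, untested sketch of that suggestion, assuming the node's runtime is Docker and its socket lives at /var/run/docker.sock. The names prepull and docker-sock are made up for illustration, and X is the placeholder from the question. Whether this helps depends on where the two minutes are actually spent:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-large-container-1
spec:
  initContainers:
  - name: prepull                    # hypothetical name
    image: docker:latest
    # Ask the host's Docker daemon to pull the large image ahead of time.
    command: ["docker", "pull", "X:latest"]
    volumeMounts:
    - name: docker-sock              # hypothetical volume name
      mountPath: /var/run/docker.sock
  containers:
  - name: X                          # placeholder from the question
    image: X:latest
    stdin: true
    tty: true
    command: ["bash"]
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock     # assumes Docker is the node's runtime
```

Note that mounting the Docker socket gives the initContainer full control over the host's Docker daemon, so this is only reasonable on a cluster you trust.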
Thanks to bits! It was the --runtime-request-timeout flag that I needed to change. Once I increased it enough, it started working!
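For anyone who configures the kubelet through a config file rather than command-line flags, the same setting should be available as runtimeRequestTimeout. The default is 2m, which matches the two-minute gap in the events above. The 10m value below is only an illustrative assumption, not the value I ended up using; pick something that fits your hardware:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Timeout for kubelet requests to the container runtime (default: 2m).
# 10m is an example value only.
runtimeRequestTimeout: "10m"
```

On the command line the equivalent would be --runtime-request-timeout=10m. Either way, the kubelet on each affected node needs a restart to pick up the change.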