Basic ContainerCreating Failure

9/2/2018

Occasionally I see problems where creating my deployments takes much longer than usual (this deployment typically completes in a minute or two). How do people normally deal with this? Is it best to remove the offending node? What's the right way to debug this?
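(By "remove the offending node" I mean roughly cordoning and draining it, something like:)

```
# Stop new pods from scheduling on the suspect node, then evict
# what's already running there so it reschedules elsewhere:
kubectl cordon gke-charles-test-cluster-default-pool-be943055-rzg2
kubectl drain gke-charles-test-cluster-default-pool-be943055-rzg2 --ignore-daemonsets --delete-local-data
```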

```
error: deployment "hillcity-twitter-staging-deployment" exceeded its progress deadline
Waiting for rollout to complete (been 500s)...
NAME                                                   READY     STATUS              RESTARTS   AGE       IP             NODE
hillcity-twitter-staging-deployment-5bf6b48779-5jvgv   2/2       Running             0          8m        10.168.41.12   gke-charles-test-cluster-default-pool-be943055-mq4j
hillcity-twitter-staging-deployment-5bf6b48779-knzkw   2/2       Running             0          8m        10.168.34.34   gke-charles-test-cluster-default-pool-be943055-czqr
hillcity-twitter-staging-deployment-5bf6b48779-qxmg8   0/2       ContainerCreating   0          8m        <none>         gke-charles-test-cluster-default-pool-be943055-rzg2
```
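The obvious first step is to look at the stuck pod's events, something like:

```
# Events at the bottom of the output usually name the hold-up
# (image pull, volume mounts, CNI setup, etc.):
kubectl describe pod hillcity-twitter-staging-deployment-5bf6b48779-qxmg8

# Recent cluster events, oldest first:
kubectl get events --sort-by=.metadata.creationTimestamp
```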

I've ssh-ed into the "rzg2" node but didn't see anything particularly wrong with it. Here's the k8s view:

```
kubectl top nodes
NAME                                                  CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
gke-charles-test-cluster-default-pool-be943055-2q9f   385m         40%       2288Mi          86%
gke-charles-test-cluster-default-pool-be943055-35fl   214m         22%       2030Mi          76%
gke-charles-test-cluster-default-pool-be943055-3p95   328m         34%       2108Mi          79%
gke-charles-test-cluster-default-pool-be943055-67h0   204m         21%       1783Mi          67%
gke-charles-test-cluster-default-pool-be943055-czqr   342m         36%       2397Mi          90%
gke-charles-test-cluster-default-pool-be943055-jz8v   149m         15%       2299Mi          86%
gke-charles-test-cluster-default-pool-be943055-kl9r   246m         26%       1796Mi          67%
gke-charles-test-cluster-default-pool-be943055-mq4j   123m         13%       1523Mi          57%
gke-charles-test-cluster-default-pool-be943055-mx18   276m         29%       1755Mi          66%
gke-charles-test-cluster-default-pool-be943055-pb48   200m         21%       1667Mi          63%
gke-charles-test-cluster-default-pool-be943055-rzg2   392m         41%       2270Mi          85%
gke-charles-test-cluster-default-pool-be943055-wkxk   274m         29%       1954Mi          73%
```
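To rule out node-level pressure, the node's conditions are also worth a look, e.g.:

```
# DiskPressure or MemoryPressure conditions on the node would
# explain slow container creation there:
kubectl describe node gke-charles-test-cluster-default-pool-be943055-rzg2
```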

Added: Here's some of the output of "$ sudo journalctl -u kubelet":

```
Sep 04 22:14:11 gke-charles-test-cluster-default-pool-be943055-rzg2 kubelet[1442]: E0904 22:14:11.882166    1442 fsHandler.go:121] failed to collect filesystem stats - rootDiskErr: du command failed on /var/lib/docker/overlay/83ed56fdfae736d5b1bd3afc3649555916a2ef24a287415256a408c463186107 with output stdout: , stderr:  - signal: killed, rootInodeErr: <nil>, extraDiskErr: <nil>
[...repeated a lot...]
Sep 04 22:25:19 gke-charles-test-cluster-default-pool-be943055-rzg2 kubelet[1442]: E0904 22:25:19.917177    1442 kube_docker_client.go:324] Cancel pulling image "gcr.io/able-store-864/hillcity-worker:0.0.1" because of no progress for 1m0s, latest progress: "43f9fd4bd389: Extracting [=====>                                             ] 32.77 kB/295.9 kB"
```
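Those two lines may point at disk trouble on the node (the du over the Docker overlay directory getting killed) plus a stalled image pull. If it's a registry or disk problem it should reproduce by hand; something like this from an SSH session on the node (these GKE nodes run Docker):

```
# Time a manual pull of the same image to see whether registry
# access or disk throughput is the bottleneck:
time sudo docker pull gcr.io/able-store-864/hillcity-worker:0.0.1

# Check whether the Docker storage partition is full:
df -h /var/lib/docker
```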
-- Charles Thayer
google-kubernetes-engine
kubernetes
