Long waiting time after SuccessfulMountVolume and before Pulling while creating a Kubernetes Deployment

4/19/2018

I have a 32-node (virtual machine) k8s cluster and I found a performance issue while creating a simple redis-cache Deployment with 3 or more replicas. The first Pod was always created very quickly and entered the Running state within a few seconds. However, the creation of the rest of the replicas took more than a minute. I looked at the Pod event logs, and it looks like the creation procedure waited after the Scheduled and SuccessfulMountVolume stages completed but before "Pulling", so there is a big time gap between "SuccessfulMountVolume" and "Pulling". Does anybody know what happens before the image is pulled from the repo? What does Kubernetes do during that time, and how can I debug this kind of issue?

I am using Kubernetes 1.9.2, and below is my YAML file:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: redis-cache
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      containers:
      - name: redis-server
        image: redis:3.2-alpine

Thanks,

-- Yong Li
kubernetes

2 Answers

5/23/2018

Following Anton's answer, I found out that the network plugin (CNI) was causing the issue for me. I figured this out by looking at the kubelet log while creating my Pods with replicas. The flow of creating a Pod can be seen clearly in the log:

I0424 18:32:19.059144   20319 config.go:405] Receiving a new pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)"
..........
I0424 18:32:19.363560   20319 volume_manager.go:371] All volumes are attached and mounted for pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)"
I0424 18:32:19.363581   20319 kuberuntime_manager.go:385] No sandbox for pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)" can be found. Need to start a new one
I0424 18:32:19.363597   20319 kuberuntime_manager.go:571] computePodActions got {KillPod:true CreateSandbox:true SandboxID: Attempt:0 NextInitContainerToStart:nil ContainersToStart:[0] ContainersToKill:map[]} for pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)"
I0424 18:32:19.363628   20319 kuberuntime_manager.go:580] SyncPod received new pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)", will create a sandbox for it
I0424 18:32:19.363641   20319 kuberuntime_manager.go:589] Stopping PodSandbox for "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)", will start new one
I0424 18:32:19.363655   20319 kuberuntime_manager.go:641] Creating sandbox for pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)"
I0424 18:32:19.365101   20319 docker_service.go:441] Setting cgroup parent to: "/kubepods/besteffort/podd1f95f27-47ed-11e8-9fd0-005056a1ac67"
I0424 18:32:19.533516   20319 factory.go:112] Using factory "docker" for container "/kubepods/besteffort/podd1f95f27-47ed-11e8-9fd0-005056a1ac67/69cae77dd85e0c66b9c4374fffc04570556404a89146351252b5c4279084d925"
I0424 18:32:19.533709   20319 docker_sandbox.go:658] Will attempt to re-write config file /var/vcap/store/docker/docker/containers/69cae77dd85e0c66b9c4374fffc04570556404a89146351252b5c4279084d925/resolv.conf with: 
[nameserver 10.100.200.10 search default.svc.cluster.local svc.cluster.local cluster.local options ndots:5]
I0424 18:32:19.533800   20319 plugins.go:412] Calling network plugin cni to set up pod "redis-cache-7f9b8ddc49-jzh5l_default"

After that I saw messages from the CNI plugin like the following:

1 2018-04-24T18:32:19.562Z  __main__ Initialized CNI configuration
                            __main__ ***_cni plugin invoked with arguments: ADD

Things got stuck at this point, and it was 100+ seconds later that the CNI plugin came back with the network configuration required for the sandbox creation:

2018-04-24T18:34:02.884Z 656626b0-7baf-4d13-9d99-f8fb6635d44f [cni@6876 comp="***" subcomp="***_cni" level="INFO"] __main__ Pod networking configured on container 69cae77dd85e0c66b9c4374fffc04570556404a89146351252b5c4279084d925 (MAC address: 02:50:56:00:00:39, IP address: 30.0.62.4/24)
I0424 18:34:02.888755   20319 kuberuntime_manager.go:655] Created PodSandbox "69cae77dd85e0c66b9c4374fffc04570556404a89146351252b5c4279084d925" for pod "redis-cache-7f9b8ddc49-jzh5l_default(d1f95f27-47ed-11e8-9fd0-005056a1ac67)"

I looked further into the CNI logs and got a better understanding of what was going on at a lower level in my network stack. The delay was actually caused by my router and switch configuration. I am posting this in case someone experiences a similar issue.
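
If you want to measure the same gap on your own nodes, one rough approach is to diff the timestamps of the "Calling network plugin cni to set up pod" and "Created PodSandbox" lines for a pod. Below is a small sketch (not part of kubelet; it assumes the klog timestamp format and log messages shown in the excerpt above, which may differ between kubelet versions) that reads a kubelet log from stdin and prints the delta:

// cni_gap.go: rough sketch that reads a kubelet log on stdin and prints the
// time between the "Calling network plugin cni to set up pod" line and the
// "Created PodSandbox" line for a given pod. It assumes the klog timestamp
// format shown in the excerpt above ("I0424 18:32:19.363655 ...").
package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
    "time"
)

// parseKlogTime extracts the HH:MM:SS.micros field from a klog line.
func parseKlogTime(line string) (time.Time, bool) {
    fields := strings.Fields(line)
    if len(fields) < 2 {
        return time.Time{}, false
    }
    t, err := time.Parse("15:04:05.000000", fields[1])
    if err != nil {
        return time.Time{}, false
    }
    return t, true
}

func main() {
    if len(os.Args) < 2 {
        fmt.Fprintln(os.Stderr, "usage: cni_gap <pod-name> < kubelet.log")
        os.Exit(1)
    }
    pod := os.Args[1]

    var start, end time.Time
    scanner := bufio.NewScanner(os.Stdin)
    scanner.Buffer(make([]byte, 1024*1024), 1024*1024) // kubelet lines can be long
    for scanner.Scan() {
        line := scanner.Text()
        if !strings.Contains(line, pod) {
            continue
        }
        if strings.Contains(line, "Calling network plugin cni to set up pod") {
            if t, ok := parseKlogTime(line); ok {
                start = t
            }
        }
        if strings.Contains(line, "Created PodSandbox") {
            if t, ok := parseKlogTime(line); ok {
                end = t
            }
        }
    }
    if start.IsZero() || end.IsZero() {
        fmt.Println("could not find both log lines for pod", pod)
        return
    }
    fmt.Printf("CNI setup for %s took %v\n", pod, end.Sub(start))
}

For the excerpt above it would report roughly 1m43s for pod redis-cache-7f9b8ddc49-jzh5l.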

Thanks,

-- Yong Li
Source: StackOverflow

4/20/2018

Let's look into the kubelet code that manages a Pod's life-cycle.

Here is the function that calls the code to attach and mount all volumes; one of the functions inside it generates the SuccessfulMountVolume event:

if err := kl.volumeManager.WaitForAttachAndMount(pod); err != nil {
    kl.recorder.Eventf(pod, v1.EventTypeWarning, events.FailedMountVolume, "Unable to mount volumes for pod %q: %v", format.Pod(pod), err)
    glog.Errorf("Unable to mount volumes for pod %q: %v; skipping pod", format.Pod(pod), err)
    return err
}

Right after that, the kubelet fetches the pull secrets and actually launches the Pod:

// Fetch the pull secrets for the pod
pullSecrets := kl.getPullSecretsForPod(pod)

// Call the container runtime's SyncPod callback
result := kl.containerRuntime.SyncPod(pod, apiPodStatus, podStatus, pullSecrets, kl.backOff)

Now, let's see what the kubelet does between the SuccessfulMountVolume event and the Pulling event:

  1. Update the state of the volumes to mark them as Mounted.
  2. Fetch the pull secrets for the pod.
  3. Compute sandbox and container changes.
  4. Kill the pod if the sandbox has changed. This happens because the kubelet creates and updates Pods in the same cycle, so if a Pod already exists, it is killed and a new version is created.
  5. Start to create a sandbox for the pod if necessary.
  6. Prepare config for the sandbox.
  7. Create logs directories.
  8. And finally, get the image.

As you can see, there are many steps, each with internal sub-steps of its own, between the volume mounting and the image pulling.
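
To get a rough, externally visible timeline of these steps, you can list the Pod's events together with their timestamps and reasons. Below is a minimal client-go sketch (the namespace, pod name, and kubeconfig path are placeholders, and a recent client-go with context-aware methods is assumed) that prints each event's first timestamp, reason, and message:

// List a Pod's events with timestamps to see where the time goes between
// Scheduled, SuccessfulMountVolume, and Pulling. Minimal sketch; the namespace
// and pod name are placeholders, and a recent client-go is assumed.
package main

import (
    "context"
    "fmt"
    "path/filepath"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/util/homedir"
)

func main() {
    kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
    if err != nil {
        panic(err)
    }
    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }

    // Placeholder pod name; replace with one of your replicas.
    podName := "redis-cache-7f9b8ddc49-jzh5l"
    events, err := clientset.CoreV1().Events("default").List(context.TODO(), metav1.ListOptions{
        FieldSelector: "involvedObject.name=" + podName,
    })
    if err != nil {
        panic(err)
    }
    for _, e := range events.Items {
        fmt.Printf("%s  %-25s %s\n", e.FirstTimestamp.Format("15:04:05"), e.Reason, e.Message)
    }
}

The same timeline is visible with kubectl describe pod, but having the raw timestamps makes the gaps easier to measure.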

You can get more log messages about these internal actions from the kubelet daemon by setting its log level to 4 (or higher) with the --v option (for example, --v=4).
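
Those extra messages are verbosity-gated: many of the internal SyncPod steps are only logged at the higher verbosity levels. As a toy illustration of that mechanism (using k8s.io/klog/v2, the successor of the glog calls shown above; this is not kubelet code), the V(4) message below only appears when the program is started with -v=4 or higher:

// Toy illustration of verbosity-gated logging (not kubelet code): the V(4)
// message is printed only when the program is started with -v=4 or higher,
// which is the same mechanism behind the kubelet's --v flag.
package main

import (
    "flag"

    "k8s.io/klog/v2"
)

func main() {
    klog.InitFlags(nil) // registers -v, -logtostderr, etc.
    flag.Set("logtostderr", "true")
    flag.Parse()
    defer klog.Flush()

    klog.Info("always visible, like the kubelet's default messages")
    klog.V(4).Info("only visible with -v=4 or higher, like the SyncPod details above")
}

Running it with -v=4 prints both lines; without the flag only the first one appears.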

Hope this helps you understand which step(s) are more time-consuming than you expected.

-- Anton Kostenko
Source: StackOverflow