MountVolume.SetUp failed for volume "policy-adapter-secret" : couldn't propagate object cache: timed out waiting for the condition

7/5/2019

Created a two-node cluster with kubeadm.

Installed Istio 1.1.11.

kubectl version

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Executed the commands given in the Istio documentation:

$ for i in install/kubernetes/helm/istio-init/files/crd*yaml; do kubectl apply -f $i; done

$ kubectl apply -f install/kubernetes/istio-demo.yaml

Services got created.

$ kubectl get pods -n istio-system

The telemetry and policy pods went to CrashLoopBackOff status:

istio-policy-648b5f5bb5-dv5np      1/2     CrashLoopBackOff   5          2m52s
istio-telemetry-57946b8569-9m7gd   1/2     CrashLoopBackOff   5          2m52s

Describing the pod shows the following error.
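For reference, the describe invocation looks like this (pod name taken from the listing above):

$ kubectl describe pod istio-policy-648b5f5bb5-dv5np -n istio-system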

 Warning  FailedMount  2m16s (x2 over 2m18s)  kubelet, ip-xxx-xxx-xxx-xxx  MountVolume.SetUp failed for volume "policy-adapter-secret" : couldn't propagate object cache: timed out waiting for the condition

I tried restarting the VM and restarting the Docker service. It did not help.

Because of the above error, the pods repeatedly try to restart and then crash.

Need your help in resolving this.

-- Sen
istio
kubectl
kubernetes

2 Answers

7/18/2019

Around the web you can find many issues related to couldn't propagate object cache: timed out waiting for the condition. There is already an open issue on GitHub: https://github.com/kubernetes/kubernetes/issues/70044

As a first set of steps to resolve it, please try to:

  • Reboot the cluster
  • Restart the Docker daemon (see the commands after this list)
  • Update your OS
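A minimal sketch of the daemon restart on a systemd-based host (adjust to your init system; some reports also restart the kubelet to clear stale cache state):

$ sudo systemctl restart docker
$ sudo systemctl restart kubelet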

In the case related to Istio, I have tried it on kubeadm, Minikube and GKE. In all cases istio-policy-XXX-XXX and istio-telemetry-XXX-XXX were restarted due to liveness probe failures.

telemetry example
---
  Warning  Unhealthy  8m49s (x9 over 9m29s)  kubelet, gke-istio-default-pool-c41459f8-zbhn  Liveness probe failed: Get http://10.56.0.6:15014/version: dial tcp 10.56.0.6:15014: connect: connection refused
  Normal   Killing    8m49s (x3 over 9m19s)  kubelet, gke-istio-default-pool-c41459f8-zbhn  Killing container with id docker://mixer:Container failed liveness probe.. Container will be killed and recreated.

policy example
---
  Warning  Unhealthy  7m28s (x9 over 8m8s)   kubelet, gke-istio-default-pool-c41459f8-3c6d  Liveness probe failed: Get http://10.56.2.6:15014/version: dial tcp 10.56.2.6:15014: connect: connection refused
  Normal   Killing    7m28s (x3 over 7m58s)  kubelet, gke-istio-default-pool-c41459f8-3c6d  Killing container with id docker://mixer:Container failed liveness probe.. Container will be killed and recreated.

Even in the documentation example you can observe that telemetry and policy were restarted 2 times.

After verifying both YAMLs (istio-demo.yaml and istio-demo-auth.yaml), I found that the telemetry and policy Deployments set the liveness probe to a 5-second initial delay:

livenessProbe:
  httpGet:
    path: /version
    port: 15014
  initialDelaySeconds: 5
  periodSeconds: 5

If you run kubectl logs on the mixer container from the istio-telemetry pod, you might see errors like the ones below.
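For example, against the telemetry pod from the question (your pod name will differ):

$ kubectl logs istio-telemetry-57946b8569-9m7gd -c mixer -n istio-system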

2019-07-18T15:16:01.887334Z     info    pickfirstBalancer: HandleSubConnStateChange: 0xc420751a80, CONNECTING
...
2019-07-18T15:16:21.887741Z     info    pickfirstBalancer: HandleSubConnStateChange: 0xc420751a80, TRANSIENT_FAILURE
2019-07-18T15:16:21.887862Z     error   mcp     Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.0.25.27:9901: i/o timeout"
...
2019-07-18T15:16:44.282027Z     info    pickfirstBalancer: HandleSubConnStateChange: 0xc420751a80, CONNECTING
2019-07-18T15:16:44.287281Z     info    pickfirstBalancer: HandleSubConnStateChange: 0xc420751a80, READY
2019-07-18T15:16:44.888794Z     info    mcp     (re)trying to establish new MCP sink stream
2019-07-18T15:16:44.888922Z     info    mcp     New MCP sink stream created

In short, the mixer container in both Deployments (telemetry and policy) needs about 44 seconds to establish all its connections (compare the first CONNECTING entry at 15:16:01 with the READY entry at 15:16:44 in the log above).

If you change initialDelaySeconds to 60 seconds in both Deployments, the pods should no longer be restarted because of the liveness probe.
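A minimal sketch of the adjusted probe, same fields as the snippet above with only the initial delay raised:

livenessProbe:
  httpGet:
    path: /version
    port: 15014
  initialDelaySeconds: 60  # raised from 5 so mixer has time to establish its MCP connections
  periodSeconds: 5

One way to apply it in place is kubectl edit deployment istio-telemetry -n istio-system (and the same for istio-policy), adjusting the probe on the mixer container.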

Hope it helps

-- PjoterS
Source: StackOverflow

7/7/2019

These Mixer services may be crash-looping because your node(s) don't have enough memory to run Istio. Increasingly, people use tools like Meshery to install Istio (and other service meshes), because it highlights points of contention like memory. When deploying either the istio-demo or istio-demo-auth configuration profile, you'll want to ensure you have a minimum of 4GB RAM per node (particularly if the Istio control plane is deployed to only one node).
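To check the memory picture per node, something along these lines works (kubectl top requires metrics-server to be installed; kubectl describe works everywhere):

$ kubectl describe nodes | grep -A 7 'Allocated resources'
$ kubectl top nodes

If little memory headroom is left on the node running the Istio control plane, the Mixer containers may be OOM-killed, which also shows up as CrashLoopBackOff.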

-- Lee Calcote
Source: StackOverflow