How can I troubleshoot/fix an issue interacting with a running Kubernetes pod (timeout error)?

1/24/2019

I have two EC2 instances, one running a Kubernetes Master node and the other running the Worker node. I can successfully create a pod from a deployment file that pulls a docker image and it starts with a status of "Running". However when I try to interact with it I get a timeout error.

Ex: kubectl logs <pod-name> -v6

Output:

Config loaded from file /home/ec2-user/.kube/config GET https://<master-node-ip>:6443/api/v1/namespaces/default/pods/<pod-name> 200 OK in 11 milliseconds GET https://<master-node-ip>:6443/api/v1/namespaces/default/pods/<pod-name>/log 500 Internal Server Error in 30002 milliseconds Server response object: [{"status": "Failure", "message": "Get https://<worker-node-ip>:10250/containerLogs/default/<pod-name>/<container-name>: dial tcp <worker-node-ip>:10250: i/o timeout", "code": 500 }]

I can get information about the pod by running kubectl describe pod <pod-name> and confirm the status as Running. Any ideas on how to identify exactly what is causing this error and/or how to fix it?

-- Nick
amazon-ec2
kubernetes

1 Answer

1/29/2019

Probably, you didn't install any network add-on to your Kubernetes cluster. It's not included in kubeadm installation, but it's required to communicate between pods scheduled on different nodes. The most popular are Calico and Flannel. As you already have a cluster, you may want to chose the network add-on that uses the same subnet as you stated with kubeadm init --pod-network-cidr=xx.xx.xx.xx/xx during cluster initialization.

  • 192.168.0.0/16 is default for Calico network addon
  • 10.244.0.0/16 is default for Flannel network addon

You can change it by downloading corresponded YAML file and by replacing the default subnet with the subnet you want. Then just apply it with kubectl apply -f filename.yaml

-- VAS
Source: StackOverflow