I have Node.js services running on an AWS EKS cluster with two worker nodes, and the services connect to each other over the cluster's internal DNS.
The problem: sometimes when I re-create a deployment, a service fails to connect to another service that is running on the other worker node.
Example: service 1 runs on worker node 1 and service 2 runs on worker node 2. Service 1 is reachable as service1:3001 (internal DNS), and when service 2 tries to connect to it, the connection fails.
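When this happens, a throwaway debug pod helps show whether it is DNS resolution or pod-to-pod networking that fails (a rough sketch; service1 and port 3001 are from my setup, and busybox:1.28 is just an example debug image):
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- nslookup service1
kubectl run conn-test --rm -it --restart=Never --image=busybox:1.28 -- wget -qO- http://service1:3001
If nslookup already fails, that points at CoreDNS; if the name resolves but wget times out, it looks more like node-to-node networking (CNI/security groups).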
What I have done to work around this from time to time:
Solution 1: Update the AWS control plane and the cluster add-ons
eksctl utils update-kube-proxy --name acceptance --approve
eksctl utils update-aws-node --name acceptance --approve
eksctl utils update-coredns --name acceptance --approve
eksctl update cluster --name acceptance --approve
Note: I only did this once, when solutions 2 and 3 did not work.
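After those updates it is worth confirming the networking add-ons actually came back healthy, e.g. (on EKS, coredns is a Deployment and kube-proxy/aws-node are DaemonSets, as far as I know):
kubectl get pods -n kube-system -o wide
kubectl get deployment coredns -n kube-system
kubectl get daemonset kube-proxy aws-node -n kube-system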
Solution 2: Delete the CoreDNS pods and let them restart on their own.
kubectl delete pod -n kube-system coredns-workernode-1 coredns-workernode-2
Note: I only do this if solution 3 does not work.
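Instead of copying the exact pod names each time, the same restart can be done through the label CoreDNS pods carry on EKS (k8s-app=kube-dns, to the best of my knowledge):
kubectl delete pod -n kube-system -l k8s-app=kube-dns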
Solution 3: Restart the failing service again after I re-create the deployment.
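That restart is basically just deleting the service's pods so the Deployment recreates them, something like this (app=service2 is a hypothetical label, substitute the real one):
kubectl delete pod -l app=service2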
These are the workarounds I have used for this connection failure between the services.
Note: This does not happen when services run on the same worker node.
More info:
worker node AMI: ami-0b7127e7a2a38802a
EC2 Type: t2.medium
Kubernetes version: 1.13.10
All of these feel like temporary fixes, and it looks like CoreDNS is not working properly. I have no more troubleshooting ideas to solve this permanently.
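If it helps, the CoreDNS side can be inspected during a failure with commands like these (kube-dns is the default service/label name EKS uses for CoreDNS, as far as I know):
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
kubectl logs -n kube-system -l k8s-app=kube-dns
kubectl get endpoints kube-dns -n kube-system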