Why can't I access a service from another node in Kubernetes?

4/22/2021

Today my pod could not start and showed this error:

2021-04-22 12:41:26.325  WARN 1 --- [ngPollService-1] c.c.f.a.i.RemoteConfigLongPollService    : Long polling failed, will retry in 64 seconds. appId: 0010010006, cluster: default, namespaces: TEST1.RABBITMQ_CONFIG_REPORT+TEST1.RABBITMQ-CONFIG+application+TEST1.EUREKA+TEST1.DATASOURCE-DRUID+TEST1.COMMON_CONFIG+TEST1.REDIS-CONFIG, long polling url: null, reason: Get config services failed from http://service-apollo-config-server-test-alpha.sre.svc.cluster.local:8080/services/config?appId=0010010006&ip=172.30.184.11 [Cause: Could not complete get operation [Cause: Connection refused (Connection refused)]]

This error tells me that the pod could not reach the config service, so fetching the configuration from the config center failed and the pod could not start. I then logged in to a pod on another node (one that works fine) and ran curl against the config service like this:

curl http://service-apollo-config-server-test-alpha.sre.svc.cluster.local:8080

It works fine, so the config service itself is OK. Then I ran the same command in a pod on the problem node:

bash-4.4# curl http://service-apollo-config-server-test-alpha.sre.svc.cluster.local:8080
curl: (7) Failed to connect to service-apollo-config-server-test-alpha.sre.svc.cluster.local port 8080: Connection refused
bash-4.4# curl http://service-apollo-config-server-test-alpha.sre.svc.cluster.local:8080

I also pinged the config service from the problem node, and that works fine:

ping service-apollo-config-server-test-alpha.sre.svc.cluster.local

Then I scanned the config service with nmap from the problem node:

bash-4.4# nmap service-apollo-config-server-test-alpha.sre.svc.cluster.local
Starting Nmap 7.70 ( https://nmap.org ) at 2021-04-22 12:45 CST
Nmap scan report for service-apollo-config-server-test-alpha.sre.svc.cluster.local (10.254.82.131)
Host is up (0.000010s latency).
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
3306/tcp open  mysql
8443/tcp open  https-alt

Port 8080 does not appear in the scan. The network seems fine, but the service cannot be reached from this node. Why can't the pod on the problem node access the config service? What should I do to find the cause and fix it? I also found that on the problem node, access by pod IP works, for example:

    # pod ip access works
    curl 172.30.112.2:11025
    # service ip failed 
    curl 10.254.94.209:11025
    # service name failed
    curl soa-illidan-superhub.dabai-fat.svc.cluster.local:11025
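
The comparison above can be repeated with a small script. This is only a sketch: `probe` and `compare_paths` are hypothetical helper names, the 3 second timeout is an arbitrary choice, and the hint it prints is an inference from the pattern (pod IP reachable, service IP not), not a confirmed diagnosis.

```shell
# Sketch: compare direct pod access with access through the service IP.
# probe and compare_paths are hypothetical helper names.

probe() {
    # -s: silent, -o /dev/null: discard the body,
    # --connect-timeout 3: give up connecting after 3 seconds.
    # curl exits 0 for any HTTP response, so this tests connectivity only.
    if curl -s -o /dev/null --connect-timeout 3 "$1"; then
        echo ok
    else
        echo fail
    fi
}

compare_paths() {
    # $1: pod IP and port, $2: service IP (or service name) and port
    pod_result=$(probe "$1")
    svc_result=$(probe "$2")
    echo "pod: $pod_result, service: $svc_result"
    if [ "$pod_result" = ok ] && [ "$svc_result" = fail ]; then
        echo "application is reachable; suspect service routing on this node"
    fi
}
```

With the addresses from this question it would be called as `compare_paths 172.30.112.2:11025 10.254.94.209:11025`.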
-- Dolphin
kubernetes

1 Answer

4/22/2021

Finally I found that the kube-proxy process had exited. On CentOS 7.6, I started it again with this command:

systemctl start kube-proxy

That fixed it.
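
To check for this before restarting anything, something like the following could be used. It is a minimal sketch, assuming kube-proxy is installed as a systemd unit named `kube-proxy` as on my node; `ensure_unit_running` is a hypothetical helper name, and on clusters where kube-proxy runs as a pod instead of a systemd service this does not apply.

```shell
# Sketch: make sure a systemd unit is running, starting it if necessary.
# Assumes the unit name passed in (for example "kube-proxy") exists on this host.
ensure_unit_running() {
    unit="$1"
    # "systemctl is-active --quiet" exits 0 only when the unit is active.
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is already running"
        return 0
    fi
    echo "$unit is not running, starting it"
    systemctl start "$unit" || return 1
    # Check again, because a unit can start and then exit right away.
    systemctl is-active --quiet "$unit"
}
```

It would be called as `ensure_unit_running kube-proxy`. Running `journalctl -u kube-proxy` afterwards should show why the process exited, and `systemctl enable kube-proxy` makes the unit start again after a reboot.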

-- Dolphin
Source: StackOverflow