I am trying to deploy a sample Spring Boot microservice to a Kubernetes cluster. All of my nodes are in the Ready state, but when I deploy, the pod stays in ContainerCreating.
When I describe the pod, the message says that the CNI network plugin failed to set up the pod network and was unable to allocate an IP address.
The output of the pod describe command looks like this:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj to mildevkub040
Warning FailedCreatePodSandBox 53m kubelet, mildevkub040 Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15" network for pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj": networkPlugin cni failed to set up pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj_default" network: unable to allocate IP address: Post http://127.0.0.1:6784/ip/2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15: dial tcp 127.0.0.1:6784: connect: connection refused, failed to clean up sandbox container "2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15" network for pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj": networkPlugin cni failed to teardown pod "spacestudysecurityauthcontrol-deployment-57596f4795-jxxvj_default" network: Delete http://127.0.0.1:6784/ip/2499f91b4a1173fb854a47ba1910d1fc3f18cfb35bf5c38c9a3008e19d385e15: dial tcp 127.0.0.1:6784: connect: connection refused]
Normal SandboxChanged 3m40s (x228 over 53m) kubelet, mildevkub040 Pod sandbox changed, it will be killed and re-created.
When I check the weave container log, I see the following:
INFO: 2020/01/09 12:18:12.061328 ->[192.168.16.178:42838] connection shutting down due to error during handshake: write tcp 192.168.16.177:6783->192.168.16.178:42838: write: connection reset by peer
INFO: 2020/01/09 12:18:18.998360 ->[192.168.16.178:37570] connection accepted
INFO: 2020/01/09 12:18:20.653339 ->[192.168.16.178:45223] connection shutting down due to error during handshake: write tcp 192.168.16.177:6783->192.168.16.178:45223: write: connection reset by peer
INFO: 2020/01/09 12:18:21.122204 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:21.742168 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:18:21.800670 ->[192.168.16.178:6783] attempting connection
INFO: 2020/01/09 12:18:22.470207 ->[192.168.16.175:59923] connection accepted
INFO: 2020/01/09 12:18:22.912690 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection deleted
INFO: 2020/01/09 12:18:22.918075 Removed unreachable peer be:b1:3f:a4:34:88(mildevkub020)
INFO: 2020/01/09 12:18:22.918144 Removed unreachable peer 56:60:12:a9:76:d1(mildevkub050)
INFO: 2020/01/09 12:18:24.602093 ->[192.168.16.175:6783] attempting connection
INFO: 2020/01/09 12:18:26.782123 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:27.918518 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:29.365629 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:29.864370 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:30.086645 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] using fastdp
INFO: 2020/01/09 12:18:30.090275 overlay_switch ->[be:b1:3f:a4:34:88(mildevkub020)] using fastdp
INFO: 2020/01/09 12:18:30.100874 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.104237 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:18:30.104284 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.104371 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection added (new peer)
INFO: 2020/01/09 12:18:30.776275 ->[192.168.16.178:37570|56:60:12:a9:76:d1(mildevkub050)]: connection shutting down due to error: Multiple connections to 56:60:12:a9:76:d1(mildevkub050) added to 5a:67:92:b3:58:ce(mildevkub040)
INFO: 2020/01/09 12:18:44.305079 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection ready; using protocol version 2
INFO: 2020/01/09 12:18:45.200565 overlay_switch ->[be:b1:3f:a4:34:88(mildevkub020)] using fastdp
INFO: 2020/01/09 12:18:45.458203 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection fully established
INFO: 2020/01/09 12:18:45.461157 ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: connection shutting down due to error: Multiple connections to be:b1:3f:a4:34:88(mildevkub020) added to 5a:67:92:b3:58:ce(mildevkub040)
INFO: 2020/01/09 12:18:45.470667 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection fully established
INFO: 2020/01/09 12:18:45.688871 sleeve ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: Effective MTU verified at 1438
INFO: 2020/01/09 12:18:45.874380 sleeve ->[192.168.16.175:6783|be:b1:3f:a4:34:88(mildevkub020)]: Effective MTU verified at 1438
INFO: 2020/01/09 12:24:12.026645 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection shutting down due to error: write tcp 192.168.16.177:38313->192.168.16.178:6783: write: connection reset by peer
INFO: 2020/01/09 12:25:56.708405 ->[192.168.16.178:44120] connection accepted
INFO: 2020/01/09 12:26:31.769826 overlay_switch ->[56:60:12:a9:76:d1(mildevkub050)] sleeve timed out waiting for UDP heartbeat
INFO: 2020/01/09 12:26:41.819554 ->[192.168.16.175:59923|be:b1:3f:a4:34:88(mildevkub020)]: connection shutting down due to error: write tcp 192.168.16.177:6783->192.168.16.175:59923: write: connection reset by peer
INFO: 2020/01/09 12:28:17.563133 ->[192.168.16.178:6783|56:60:12:a9:76:d1(mildevkub050)]: connection deleted
INFO: 2020/01/09 12:30:49.548347 ->[192.168.16.178:60937] connection accepted
When I run the command kubectl exec -n kube-system weave-net-fj9mm -c weave -- /home/weave/weave --local status ipam, I get the response: Error from server (NotFound): pods "weave-net-fj9mm" not found
How can I resolve this issue?
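The NotFound error means that no pod named weave-net-fj9mm exists in kube-system; weave-net pod names carry a random suffix that differs per cluster and per node. A minimal sketch for looking up the real pod on the affected node and running the same IPAM check against it follows. It assumes the stock Weave Net DaemonSet, which labels its pods name=weave-net (verify with kubectl get pods -n kube-system --show-labels). The helper names weave_pod_on_node and weave_ipam_status are made up for illustration.

```shell
# Print the name of the weave-net pod scheduled on the given node.
# Assumption: pods carry the label name=weave-net and live in kube-system.
weave_pod_on_node() {
  node="$1"
  kubectl get pods -n kube-system -l name=weave-net \
    --field-selector "spec.nodeName=${node}" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Run the IPAM status check inside that pod's weave container.
weave_ipam_status() {
  pod="$(weave_pod_on_node "$1")" || return 1
  if [ -z "$pod" ]; then
    echo "no weave-net pod found on node $1" >&2
    return 1
  fi
  kubectl exec -n kube-system "$pod" -c weave -- \
    /home/weave/weave --local status ipam
}
```

For the node in the events above, this would be called as weave_ipam_status mildevkub040.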
If you curl the status endpoint on the address that appears in the pod describe output (127.0.0.1:6784), you will get something like this:
# curl 'http://127.0.0.1:6784/status'
Version: 1.8.2 (version 1.9.1 available - please upgrade!)
Service: router
Protocol: weave 1..2
Name: 66:2b:6a:ca:34:88(ip-10-128-152-185)
Encryption: disabled
PeerDiscovery: enabled
Targets: 4
Connections: 4 (3 established, 1 failed)
Peers: 4 (with 12 established connections)
TrustedSubnets: none
Service: ipam
Status: waiting for IP range grant from peers
Range: 10.32.0.0/12
DefaultSubnet: 10.32.0.0/12
The "waiting for IP range grant from peers" status indicates that Weave Net's IPAM believes all of the IP address space is owned by other nodes in the cluster, but none of those nodes can be contacted at the moment. As a result, this node never receives a range and cannot allocate an IP address to new pods.
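Here's the workaround. Big red warnings: it removes every peer that IPAM reports as unreachable and takes over its IP ranges, so use it only for peers that are gone for good. As written, the echo makes the loop print each DELETE command instead of running it; remove the echo to apply the change.
% for i in $(curl -s 'http://127.0.0.1:6784/status/ipam' | grep 'unreachable\!$' | sort -k2 -n -r | awk -F'(' '{print $2}' | sed 's/).*//'); do echo curl -X DELETE 127.0.0.1:6784/peer/$i; done
With the echo removed, each DELETE reports how many addresses were reclaimed:
65536 IPs taken over from ip-10-128-184-15
32768 IPs taken over from ip-10-128-159-154
32768 IPs taken over from ip-10-128-170-84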
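Because the one-liner is destructive, it can help to split it into a read-only step and a guarded step. The sketch below does the same thing as the workaround above, assuming each unreachable peer's line in the /status/ipam output ends in "unreachable!", as the grep in the workaround implies. The function names and the CONFIRM and WEAVE_API variables are hypothetical, introduced only for this example.

```shell
# Read `status ipam` text on stdin and print the nickname of every
# peer whose line ends in "unreachable!".
unreachable_peers() {
  grep 'unreachable!$' | awk -F'(' '{print $2}' | sed 's/).*//'
}

# Read peer nicknames on stdin. By default only print the DELETE
# commands (dry run); issue them for real when CONFIRM=yes is set.
# WEAVE_API is a hypothetical override for the local Weave endpoint.
reclaim_peers() {
  api="${WEAVE_API:-http://127.0.0.1:6784}"
  while IFS= read -r peer; do
    [ -n "$peer" ] || continue
    if [ "${CONFIRM:-no}" = "yes" ]; then
      curl -s -X DELETE "${api}/peer/${peer}"
      echo
    else
      echo "curl -X DELETE ${api}/peer/${peer}"
    fi
  done
}

# Example (dry run):
#   curl -s http://127.0.0.1:6784/status/ipam | unreachable_peers | reclaim_peers
```

Review the dry-run output first, then repeat the pipeline with CONFIRM=yes in front of reclaim_peers once the listed peers are confirmed to be permanently gone.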
Reference - https://github.com/weaveworks/weave/issues/2822