Network issues with docker containers that get a specific IP

9/12/2017

Problem

I have a problem with one IP address (172.17.0.11) in my docker network. Whenever a container gets this IP, outbound connections from the container stop working. When I kill this container:

  • I can still ping this IP even though nothing is using it
  • There are no rules in iptables associated with this IP
  • netstat shows a lot of established docker-proxy connections for this IP, but other IPs in that list with dangling connections don't have any issues

It looks like an IP conflict to me: curl doesn't work, and wget and ping work very slowly, probably because they re-establish the connection every time. This is not a DNS issue, since curl by IP doesn't work either, and the docker image used makes no difference.
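To put numbers on the dangling connections mentioned above, here is a minimal Python sketch that counts ESTABLISHED TCP connections per remote IP from `netstat -tn` style output. The function name `established_by_peer` and the sample text are made up for illustration; the parsing assumes the usual net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and IPv4 addresses.

```python
from collections import Counter


def established_by_peer(netstat_output: str) -> Counter:
    """Count ESTABLISHED TCP connections per foreign (remote) IP."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/tcp6 row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        # The foreign address is "ip:port"; drop the port.
        peer_ip = fields[4].rsplit(":", 1)[0]
        counts[peer_ip] += 1
    return counts


# Hypothetical sample, not taken from the affected host.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 172.17.0.1:41000        172.17.0.11:8080        ESTABLISHED
tcp        0      0 172.17.0.1:41002        172.17.0.11:8080        ESTABLISHED
tcp        0      0 172.17.0.1:41004        172.17.0.12:8080        TIME_WAIT
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
"""

print(established_by_peer(SAMPLE)["172.17.0.11"])  # prints 2
```

On a real host the input could come from something like `subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout`. A non-zero count for an IP that no running container owns would be worth a closer look.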

Infrastructure

It's a single-server setup on Debian 8 (4.9 kernel) with kubernetes 1.6.4 and docker-ce 17.06.1 (overlay2). The issue appeared after I upgraded docker from 1.12.6 to 17.06.1.

Please help me debug this issue.

docker version:

Client:
 Version:      17.06.1-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   874a737
 Built:        Thu Aug 17 22:53:31 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.1-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   874a737
 Built:        Thu Aug 17 22:51:25 2017
 OS/Arch:      linux/amd64
 Experimental: false

docker info:

Containers: 336
 Running: 336
 Paused: 0
 Stopped: 0
Images: 52
Server Version: 17.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Kernel Version: 4.9.0-0.bpo.3-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 28.76GiB
Name: host
ID: QY6I:JI2S:BOPG:FIQP:YEBB:3UYF:N3G2:COCQ:PX7Z:QRCV:GIEN:FGQC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
-- chingis
docker
kubernetes
networking

2 Answers

9/21/2017

This issue was caused by a desync between the docker network (bridge plugin) and the actual state of the network on the host machine. The IP was released from the docker network, but the associated virtual interface and the related TCP connections were left intact. So when this IP was attached to a new container, network anomalies started to happen.

Most likely this happened after random docker daemon hangs (which occurred with the older 1.12 version).
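One way to look for this kind of desync is to compare the IPs docker believes are assigned with the IPs that still show up as neighbors on the bridge. The Python sketch below is only an illustration under assumptions: it expects the JSON printed by `docker network inspect bridge` (a list of networks, each with a `Containers` map whose entries carry an `IPv4Address` such as `172.17.0.2/16`) and the text printed by `ip neigh show dev docker0`. The function names and the sample data are hypothetical, and in a kubernetes setup the pods may sit on a different bridge than `docker0`.

```python
import json


def assigned_ips(inspect_json: str) -> set:
    """IPs that docker has assigned to endpoints on the inspected network(s)."""
    ips = set()
    for net in json.loads(inspect_json):
        for endpoint in (net.get("Containers") or {}).values():
            addr = endpoint.get("IPv4Address", "")
            if addr:
                ips.add(addr.split("/")[0])  # strip the "/16" prefix length
    return ips


def live_neighbors(neigh_output: str) -> set:
    """IPs in the neighbor table, ignoring entries that failed to resolve."""
    ips = set()
    for line in neigh_output.splitlines():
        fields = line.split()
        if not fields or fields[-1] in ("FAILED", "INCOMPLETE"):
            continue
        ips.add(fields[0])
    return ips


def stale_ips(inspect_json: str, neigh_output: str) -> set:
    """IPs still answering on the bridge that docker no longer tracks."""
    return live_neighbors(neigh_output) - assigned_ips(inspect_json)


# Hypothetical sample data for illustration only.
INSPECT = json.dumps([{
    "Name": "bridge",
    "Containers": {
        "abc123": {"Name": "web", "IPv4Address": "172.17.0.2/16"},
    },
}])
NEIGH = """\
172.17.0.2 lladdr 02:42:ac:11:00:02 REACHABLE
172.17.0.11 lladdr 02:42:ac:11:00:0b STALE
172.17.0.20 FAILED
"""

print(stale_ips(INSPECT, NEIGH))  # prints {'172.17.0.11'}
```

An address reported here is only a candidate: a neighbor entry can linger briefly after a container exits normally, so it makes sense to re-check after a short while before treating it as a leftover interface.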

-- chingis
Source: StackOverflow

9/12/2017

Did you try rebooting the faulty node? It looks like some namespace/bridge configuration might have gotten stuck.

-- Radek 'Goblin' Pieczonka
Source: StackOverflow