Pods restart frequently because of networking issues

3/10/2017

Description:

We have services running on Google Container Engine, based on the Go library go-micro. These services run fine, except that they restart randomly during the day.

Problem:

Pods restart quite often during the day. This affects our own services as well as core services such as kube-dns and nginx-ingress. Judging by the logs, it looks like a networking problem occurs first; after that the Docker daemon and kubelet restart, which in turn restarts our services. It can happen anywhere from 2 to 10 times per day, with no constant pattern.
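One way to put numbers on the restarts is to sum the restartCount field that the Kubernetes API reports for each container. The following is a minimal Python sketch, assuming input shaped like the output of kubectl get pods --all-namespaces -o json. The pod and container names in SAMPLE are hypothetical and only illustrate the shape of the data.

```python
from typing import Dict

def restart_counts(pod_list: dict) -> Dict[str, int]:
    """Sum container restart counts per pod.

    pod_list is expected to look like the parsed JSON from
    `kubectl get pods --all-namespaces -o json`.
    """
    counts = {}
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        key = "{}/{}".format(meta.get("namespace", "default"), meta.get("name", "?"))
        # containerStatuses can be missing for pods that have not started yet.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        counts[key] = sum(s.get("restartCount", 0) for s in statuses)
    return counts

# Hypothetical sample data; these names are made up for illustration.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "kube-system", "name": "kube-dns-example"},
            "status": {"containerStatuses": [
                {"name": "kubedns", "restartCount": 4},
                {"name": "dnsmasq", "restartCount": 3},
            ]},
        },
        {
            "metadata": {"namespace": "on-deploy-dev", "name": "roles-example"},
            "status": {"containerStatuses": [
                {"name": "roles", "restartCount": 3},
            ]},
        },
    ]
}

if __name__ == "__main__":
    # Print pods with the most restarts first.
    for pod, n in sorted(restart_counts(SAMPLE).items(), key=lambda kv: -kv[1]):
        print(pod, n)
```

To run it against a live cluster, SAMPLE could be replaced with json.load(sys.stdin) and the kubectl output piped into the script. Comparing the totals from two runs some hours apart gives a rough restarts-per-day figure.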

Details:

Version:

kubectl version                                                                          
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2",    GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-15T06:34:56Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

OS:

uname -a
Linux microservices-g1-small-25eedb64-w265 4.4.21+ #1 SMP  Thu Nov 10 02:50:15 PST 2016 x86_64 Intel(R) Xeon(R) CPU @ 2.30GHz   GenuineIntel GNU/Linux

cat /etc/lsb-release
CHROMEOS_AUSERVER=https://tools.google.com/service/update2
CHROMEOS_RELEASE_BOARD=lakitu-signed-mpkeys
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_BUILDER_PATH=lakitu-release/R56-8977.0.0
CHROMEOS_RELEASE_BUILD_NUMBER=8977
CHROMEOS_RELEASE_BUILD_TYPE=Official Build
CHROMEOS_RELEASE_CHROME_MILESTONE=56
CHROMEOS_RELEASE_DESCRIPTION=8977.0.0 (Official Build) dev-channel lakitu 
CHROMEOS_RELEASE_NAME=Chrome OS
CHROMEOS_RELEASE_PATCH_NUMBER=0
CHROMEOS_RELEASE_TRACK=dev-channel
CHROMEOS_RELEASE_VERSION=8977.0.0
DEVICETYPE=OTHER
GOOGLE_RELEASE=8977.0.0
HWID_OVERRIDE=LAKITU DOGFOOD

Golang microservice framework go-micro

I checked the logs to figure out what is happening, and this is what I found:

rvices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064853:INFO:update_manager-inl.h(52)] ChromeOSPolicy::UpdateCheckAllowed: START
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064908:WARNING:evaluation_context-inl.h(43)] Error reading Variable update_disabled: "No value set for update_disabled"
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/064932:WARNING:evaluation_context-inl.h(43)] Error reading Variable release_channel_delegated: "No value set for release_channel_delegated"
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/065015:INFO:chromeos_policy.cc(314)] Periodic check interval not satisfied, blocking until 3/10/2017 6:58:27 GMT
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/065025:INFO:update_manager-inl.h(74)] ChromeOSPolicy::UpdateCheckAllowed: END
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!
Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 metrics_daemon[903]: [INFO:upload_service.cc(103)] Metrics disabled. Don't upload metrics samples.
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: okKubelet is unhealthy!
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:05.302107123Z" level=error msg="Force shutdown daemon"
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:17.997217 30078 helpers.go:101] Unable to get network stats from pid 27012: couldn't read network stats: failure opening /proc/27012/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.134978 30078 helpers.go:101] Unable to get network stats from pid 26236: couldn't read network stats: failure opening /proc/26236/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.135389 30078 helpers.go:101] Unable to get network stats from pid 27581: couldn't read network stats: failure opening /proc/27581/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.135801 30078 helpers.go:101] Unable to get network stats from pid 27581: couldn't read network stats: failure opening /proc/27581/net/d
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.430715 30078 prober.go:98] No ref for container "docker://4a90f704319f64738915bc353515403263a60ad04d5859174b50bb47c255db12" (social-syn
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.430740 30078 prober.go:106] Liveness probe for "social-sync-deployment-2745944389-rftmf_on-deploy-dev(80a79ba8-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.431064 30078 prober.go:98] No ref for container "docker://964f8ef2da5de63196f5ddfaec156f6b93fb05671be3dd7f2d90e4efb91cbd34" (heapster-v
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.431076 30078 prober.go:106] Liveness probe for "heapster-v1.2.0.1-1382115970-l9h4q_kube-system(7f0f2677-04b6-11e7-be05-42010af00129):he
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: % Total % Received % Xferd Average Speed Time Time Time Current
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: Dload Upload Total Spent Left Speed
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:12Z" level=info msg="stopping containerd after receiving terminated"
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.525414 30078 prober.go:98] No ref for container "docker://6fa84a9c20b7c8600048a98d06974817e85652b3b66b8c64d6390735de5bbf19" (kube-dns-4
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.525458 30078 prober.go:106] Readiness probe for "kube-dns-4101612645-bkt6z_kube-system(7f12f616-04b6-11e7-be05-42010af00129):kubedns" f
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.631190 30078 generic.go:197] GenericPLEG: Unable to retrieve pods: operation timeout: context deadline exceeded
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.646004 30078 container_manager_linux.go:625] error opening pid file /var/run/docker.pid: open /var/run/docker.pid: no such file or dire
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.893042 30078 kubelet_pods.go:710] Error listing containers: dockertools.operationTimeout{err:context.deadlineExceededError{}}
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:18.893091 30078 kubelet.go:1860] Failed cleaning pods: operation timeout: context deadline exceeded
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.947556 30078 logs.go:41] http: TLS handshake error from 127.0.0.1:39224: EOF
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.990182 30078 prober.go:98] No ref for container "docker://964f8ef2da5de63196f5ddfaec156f6b93fb05671be3dd7f2d90e4efb91cbd34" (heapster-v
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.990207 30078 prober.go:106] Liveness probe for "heapster-v1.2.0.1-1382115970-l9h4q_kube-system(7f0f2677-04b6-11e7-be05-42010af00129):he
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:18.990268 30078 prober.go:98] No ref for container "docker://4a90f704319f64738915bc353515403263a60ad04d5859174b50bb47c255db12" (social-syn
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: [1.9K blob data]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.043529322Z" level=error msg="Stop container error: Stop container d0c295d50409a171745524d6171a845fc3d29fd6db26da3fc883653fce1e4
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.077975854Z" level=error msg="Stop container error: Stop container 4712afe5f084cf3163bef94ac21e3d63a5179190e73a8a0fa906a59630b80
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078034531Z" level=error msg="Stop container error: Stop container 1b18343beedfbe58403017fa532b85604c7ec2c96f15bd503747c19ac37f6
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078074791Z" level=error msg="Stop container error: Stop container 1fb54295ff5ecc734bf12c576880131cb98011cb98e37b5fa982bdd257b69
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078113450Z" level=error msg="Stop container error: Stop container b8e52eafa29a8b02263894b3d0d1371a92f1656fea981a6b9842c42b5d939
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078150890Z" level=error msg="Stop container error: Stop container 9b9021078f15bc3ea03770c0c135e978326f8e279e60e9663885218070026
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:18.990280 30078 prober.go:106] Liveness probe for "social-sync-deployment-2745944389-rftmf_on-deploy-dev(80a79ba8-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: E0310 06:53:19.219709 30078 eviction_manager.go:204] eviction manager: unexpected err: failed ImageStats: failed to list docker images - operation tim
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.285843 30078 logs.go:41] http: TLS handshake error from 127.0.0.1:39414: write tcp 127.0.0.1:10250->127.0.0.1:39414: write: broken pipe
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400005 30078 kubelet.go:1725] skipping pod synchronization - [container runtime is down]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400065 30078 prober.go:98] No ref for container "docker://6d63f67520d9b76446a00e1f6d81422f12f2fa93a1a9f85a656c0b49e457ba0c" (social-acc
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400079 30078 prober.go:106] Liveness probe for "social-accounts-deployment-983093656-h9frj_on-deploy-dev(8071bfd6-04b6-11e7-be05-42
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400318 30078 prober.go:98] No ref for container "docker://963021c2befd5e53a61c16ba2f7c97446b4c045bbf92f723e3b899c4fb2cde21" (post-metri
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400333 30078 prober.go:106] Liveness probe for "post-metrics-deployment-556584274-z3p67_on-deploy-dev(7f9d4125-04b6-11e7-be05-42010
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: W0310 06:53:19.400476 30078 prober.go:98] No ref for container "docker://dc65f853b22eb25bdfaf1ce5bf1d0d6f48e57379caffa526f80a71b086d5247f" (notificati
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: [1.9K blob data]
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078188154Z" level=error msg="Stop container error: Stop container 8ee3de7c4dd56136b8c8a444f9b58316d190d2dad496472e233f23bf27596
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078226785Z" level=error msg="Stop container error: Stop container a9fefcd23efb7f6472b209d6e383b8050da054c3f4b1ad2c6bf531f3b1475
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.078276076Z" level=error msg="Stop container error: Stop container 874fdb93aafc0a13bcbeada66f8f031cd52c01f0cec59913a49bf93917ce5
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565783448Z" level=error msg="Stop container error: Stop container 42b9b796470a3a0a345229227cb7fa223967c56ce3b8e2765c3d9a48e963c
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565846865Z" level=error msg="Stop container error: Stop container add6806333a7185aa4944b9bde0c9b2be973a09e59d2b80c09e98e549b180
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 docker[24076]: time="2017-03-10T06:53:13.565886676Z" level=error msg="Stop container error: Stop container 5631ba532f8b2a4ac262b97fabd2df07a8fe6b0202879e1347a763a5a8921
Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400485 30078 prober.go:106] Liveness probe for "notifications-deployment-3662335406-r668m_on-deploy-dev(880c38dc-0425-11e7-be05-420

Every time the node tries to update Chrome OS, Docker daemon issues, networking issues, and so on start to occur.
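To check whether the failures really line up with the update checks, the journal entries can be split into fields and the health-monitor failures listed next to the update_engine entries. The following is a minimal Python sketch, assuming the syslog-style prefix seen in the excerpt (month, day, time, host, process[pid]). The regular expression and the function names are illustrative, not part of any Kubernetes or Chrome OS tooling.

```python
import re
from collections import Counter

# Matches lines such as:
# "Mar 10 06:53:28 <host> health-monitor.sh[1435]: Docker daemon failed!"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<proc>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<msg>.*)$"
)

# Messages that health-monitor.sh prints in the excerpt.
FAILURE_MARKERS = ("Docker daemon failed!", "Kubelet is unhealthy!")

def parse(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None

def failure_events(lines):
    """Return (timestamp, process, message) for every health-monitor failure."""
    events = []
    for line in lines:
        entry = parse(line)
        if entry and any(marker in entry["msg"] for marker in FAILURE_MARKERS):
            events.append((entry["ts"], entry["proc"], entry["msg"]))
    return events

def entries_per_process(lines):
    """Count how many parsed entries each process wrote."""
    return Counter(e["proc"] for e in map(parse, lines) if e)

# A few lines copied from the excerpt above.
SAMPLE = [
    "Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 update_engine[899]: [0310/065025:INFO:update_manager-inl.h(74)] ChromeOSPolicy::UpdateCheckAllowed: END",
    "Mar 10 06:53:28 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1435]: Docker daemon failed!",
    "Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 health-monitor.sh[1432]: okKubelet is unhealthy!",
    "Mar 10 06:53:29 gke-microservices-g1-small-25eedb64-s0p6 kubelet[30078]: I0310 06:53:19.400005 30078 kubelet.go:1725] skipping pod synchronization - [container runtime is down]",
]

if __name__ == "__main__":
    for event in failure_events(SAMPLE):
        print(event)
    print(entries_per_process(SAMPLE))
```

In the excerpt the syslog prefix reads 06:53:28 or 06:53:29 on every entry, while the timestamps embedded in the messages range from 06:48:53 (update_engine) to 06:53:19 (kubelet). Ordering by the embedded timestamps, and repeating the comparison over several incidents, may therefore give a more reliable picture of what happens first.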

kube-proxy.log

I0310 06:53:17.392671 5 proxier.go:750] Deleting connection tracking state for service IP 10.3.240.10, endpoint IP 10.0.5.223
Flag --resource-container has been deprecated, This feature will be removed in a later release.
I0310 06:54:12.615435 5 iptables.go:176] Could not connect to D-Bus system bus: dial unix /var/run/dbus/system_bus_socket: connect: no such file or directory
I0310 06:54:12.615488 5 server.go:168] setting OOM scores is unsupported in this build
I0310 06:54:12.687932 5 server.go:215] Using iptables Proxier.
I0310 06:54:12.690596 5 server.go:227] Tearing down userspace rules.
I0310 06:54:12.690844 5 healthcheck.go:119] Initializing kube-proxy health checker
I0310 06:54:12.702034 5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0310 06:54:12.702366 5 conntrack.go:66] Setting conntrack hashsize to 32768
I0310 06:54:12.702927 5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0310 06:54:12.702951 5 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0310 06:54:12.714134 5 proxier.go:802] Not syncing iptables until Services and Endpoints have been received from master

More logs:

g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.445978 3344 docker_manager.go:1975] Need to restart pod infra container for "roles-deployment-1745993421-qxf7z_on-a
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.574227 3344 operation_executor.go:917] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/e257aff1-055d-1
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.575943 3344 docker_manager.go:1975] Need to restart pod infra container for "social-accounts-deployment-983093656-v
Mar 10 06:50:45 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:45.774316 3344 operation_executor.go:917] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/e2762a4c-055d-1
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.056277 3344 docker_manager.go:1975] Need to restart pod infra container for "tags-srv-deployment-626769860-js4h5_on
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6680]: Could not generate persistent MAC address for veth37abc82a: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth37abc82a entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 3(veth37abc82a) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 3(veth37abc82a) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth37abc82a: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.626937 3344 conversion.go:134] failed to handle multiple devices for container. Skipping Filesystem stats
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:46.627371 3344 conversion.go:134] failed to handle multiple devices for container. Skipping Filesystem stats
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6745]: Could not generate persistent MAC address for veth07d02159: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth07d02159: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth07d02159 entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 12(veth07d02159) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 12(veth07d02159) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6771]: Could not generate persistent MAC address for veth2b02253d: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth2b02253d: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth2b02253d entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 23(veth2b02253d) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 23(veth2b02253d) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6796]: Could not generate persistent MAC address for veth55143c6b: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: veth55143c6b: Gained carrier
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: device veth55143c6b entered promiscuous mode
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 30(veth55143c6b) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 30(veth55143c6b) entered forwarding state
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6821]: Could not generate persistent MAC address for vethe38b8eee: No such file or directory
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:46 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-networkd[611]: vethe38b8eee: Gained carrier
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: device vethe38b8eee entered promiscuous mode
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 31(vethe38b8eee) entered forwarding state
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: cbr0: port 31(vethe38b8eee) entered forwarding state
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.113442 3344 docker_manager.go:2236] Determined pod ip after infra change: "roles-deployment-1745993421-qxf7z
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.115417 3344 kubelet.go:1816] SyncLoop (PLEG): "social-accounts-deployment-983093656-vh2xt-deploy-dev(e257aff
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 docker[3264]: time="2017-03-10T06:50:47.118506356Z" level=error msg="Handler for GET /v1.23/images/b.gcr.io-container-registry/microservice
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kubelet[3344]: I0310 06:50:47.194220 3344 provider.go:119] Refreshing cache for provider: *gcp_credentials.dockerConfigKeyProvider
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-udevd[6847]: Could not generate persistent MAC address for veth2228e3ba: No such file or directory
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Network configuration changed, trying to establish connection.
Mar 10 06:50:47 gke-microservices-g1-small-25eedb64-w265 systemd-timesyncd[570]: Synchronized to time server 169.254.169.254:123 (169.254.169.254).

Question: Is it possible to avoid or reduce the number of restarts and solve the networking issues, to make our system more stable?

-- Motorcode
devops
go
go-micro
google-kubernetes-engine
kubernetes

1 Answer

3/10/2017

This is pretty interesting. While not a solution, I would recommend:

Instances that have 0.5 or fewer cores, such as shared-core machine types, are treated as having 0.5 cores, and a network throughput cap of 1 Gbit/sec. Both persistent disk write I/O and network traffic count towards the instance's network cap. Depending on your needs, ensure your instance can support any desired persistent disk throughput for your applications. For more information, see the persistent disk specifications.

  • Start more kube-dns and nginx-ingress-controller replicas so you are less affected by single node failures
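The replica suggestion above could be scripted roughly as follows. This is only a sketch under assumptions: it builds kubectl scale command lines and prints them instead of running them, and the deployment names, namespaces, and replica counts in TARGETS are guesses that need to be checked against the actual cluster (for example with kubectl get deployments --all-namespaces).

```python
import shlex

def scale_command(deployment, namespace, replicas):
    """Build a `kubectl scale` argument list for one deployment."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        "kubectl", "scale", "deployment", deployment,
        "--namespace={}".format(namespace),
        "--replicas={}".format(replicas),
    ]

# Assumed names, namespaces and counts; adjust them to match the cluster.
TARGETS = [
    ("kube-dns", "kube-system", 3),
    ("nginx-ingress-controller", "default", 2),
]

if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before use.
    for deployment, namespace, replicas in TARGETS:
        print(" ".join(shlex.quote(arg) for arg in scale_command(deployment, namespace, replicas)))
```

Printing rather than executing keeps the sketch safe to run anywhere. If something else on the cluster manages the replica count of a deployment, a manual change may not persist.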
-- Janos Lenart
Source: StackOverflow