I'm running a pretty standard nginx ingress controller with an empty configmap. The nginx config gets reloaded (not restarted) roughly every minute, and I'm seeing workers pile up in the "shutting down" state. Here is the output of ps aux:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 4240 796 ? Ss 14:26 0:00 /usr/bin/dumb-init /nginx-ingress-controller --default-backend-service=kube-system/default-http-backend --configmap=kube-system/nginx-ingre
root 8 3.3 0.6 102224 96204 ? Ssl 14:26 8:15 /nginx-ingress-controller --default-backend-service=kube-system/default-http-backend --configmap=kube-system/nginx-ingress-lb-conf
root 21 1.4 0.9 278328 153220 ? S 14:26 3:38 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nobody 409 0.1 0.9 548800 150696 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 410 0.1 0.9 548800 150760 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 552 0.1 0.9 548800 150752 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 695 0.3 0.9 548800 150808 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 696 0.4 0.9 548800 150760 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 697 0.4 0.9 548800 150864 ? Sl 18:29 0:00 nginx: worker process is shutting down
nobody 836 0.3 0.9 548800 150696 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 837 0.3 0.9 548800 150680 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 838 0.3 0.9 548800 150648 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 839 0.3 0.9 548800 150652 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 983 0.7 0.9 548800 150732 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 984 0.9 0.9 548800 150816 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 985 0.8 0.9 548800 150680 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 986 0.8 0.9 548800 150784 ? Sl 18:30 0:00 nginx: worker process is shutting down
nobody 1120 1.0 0.9 548800 149592 ? Sl 18:31 0:00 nginx: worker process
nobody 1121 1.0 0.9 548800 150584 ? Sl 18:31 0:00 nginx: worker process
nobody 1122 0.5 0.9 548800 149364 ? Sl 18:31 0:00 nginx: worker process
nobody 1123 0.7 0.9 548800 150588 ? Sl 18:31 0:00 nginx: worker process
root 1262 29.0 0.4 169008 64084 ? S 18:31 0:00 /usr/sbin/nginx -s reload -c /etc/nginx/nginx.conf
root 1263 0.0 0.0 34432 2948 ? R+ 18:31 0:00 ps aux
root 29790 0.0 0.0 18252 3328 ? Ss 18:14 0:00 /bin/bash
nobody 32583 0.4 0.9 548800 150836 ? Sl 18:25 0:01 nginx: worker process is shutting down
nobody 32584 0.4 0.9 548800 150784 ? Sl 18:25 0:01 nginx: worker process is shutting down
nobody 32585 0.4 0.9 548800 150940 ? Sl 18:25 0:01 nginx: worker process is shutting down
nobody 32618 0.4 0.9 548800 150884 ? Sl 18:25 0:01 nginx: worker process is shutting down
nobody 32733 0.1 0.9 548800 150720 ? Sl 18:28 0:00 nginx: worker process is shutting down
nobody 32735 0.2 0.9 548800 150820 ? Sl 18:28 0:00 nginx: worker process is shutting down
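For what it's worth, the configmap really is empty. I verified it using the name and namespace from the controller's --configmap flag above:

# should come back with only metadata, no data: section
kubectl -n kube-system get configmap nginx-ingress-lb-conf -o yaml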
As the ps output shows, each worker holds on to roughly 150 MB of RSS, so memory usage on this machine is getting out of control. The worker-shutdown-timeout setting is at its default of 10s, but these workers stay in the shutting-down state for well over 10s. Is anyone else running into this issue? Is there some way for me to troubleshoot this?
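To be clear about the current config: I haven't overridden worker-shutdown-timeout anywhere. If I were to set it, my understanding is it would go in the configmap like this (the key maps to nginx's worker_shutdown_timeout directive; 10s is the documented default):

kubectl -n kube-system apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-ingress-lb-conf
  namespace: kube-system
data:
  # ingress-nginx configmap key for nginx's worker_shutdown_timeout
  worker-shutdown-timeout: "10s"
EOF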
Here are redacted example lines from the logs:
W0510 12:41:23.245036 7 backend_ssl.go:44] error obtaining PEM from secret quuz/quuz-quuz.com-tls: error retrieving secret quuz/quuz-quuz.com-tls: secret quuz/quuz-quuz.com-tls was not found
2018/05/10 12:41:23 [error] 27313#27313: connect() to [2600:1407:16::b832:eefa]:80 failed (101: Network is unreachable) while requesting certificate status, responder: ocsp.int-x3.letsencrypt.org, peer: [2600:1407:16::b832:eefa]:80, certificate: "/ingress-controller/ssl/foo-foo-www.foo.com-tls.pem"
W0510 12:41:37.174784 7 controller.go:1064] ssl certificate for host bar.com is about to expire in 10 days
W0510 12:41:37.175872 7 controller.go:1047] ssl certificate "baz/baz-baz.com-tls" does not exist in local store
35.186.144.97 - [35.186.144.97] - - [10/May/2018:12:46:23 +0000] "GET / HTTP/1.1" 200 6272 "-" "GoogleStackdriverMonitoring-UptimeChecks(https://cloud.google.com/monitoring)" 365 0.097 [qux-quxq-80] 10.0.33.184:80 20572 0.097 200
50.205.217.121 - [50.205.217.121] - - [10/May/2018:12:46:24 +0000] "GET /socket/websocket?jwtToken=REDACTED&gamePlayApiSession=REDACTED&vsn=2.0.0 HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36" 997 0.004 [quux-quux-80] 10.0.166.36:80 0 0.004 403