We are using GKE to run ASP.NET Core containers. Each ASP.NET Core container uses at least one inotify
instance (to watch Razor templates) and can use another to watch config files (unless that is explicitly disabled).
The default Linux limit on the number of inotify instances is 128 per user (fs.inotify.max_user_instances=128), which effectively acts as a per-host limit when the containers on a node run as the same user. Some instances are consumed by Kubernetes itself (e.g. the fluentd daemons). So when many pods are deployed on a single host, the host runs out of free inotify instances and the containers get stuck in a crash loop.
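To confirm that a node is really hitting this limit, one option is to count the open inotify instances on it. The following is a minimal sketch, assuming it runs on the node (or in a container that can see the host's /proc) with enough privileges to read other processes' fd directories. The function names are hypothetical, and the count is an upper bound because a descriptor inherited across fork is counted once per process.

```python
import os

INOTIFY_LINK = "anon_inode:inotify"  # readlink target of an inotify file descriptor


def count_inotify_instances(proc_root="/proc"):
    """Return {uid: count} of inotify descriptors found under <proc_root>/<pid>/fd."""
    per_uid = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return per_uid  # no procfs available (e.g. not on Linux)
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric entries are processes
        pid_dir = os.path.join(proc_root, entry)
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            uid = os.stat(pid_dir).st_uid
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or access denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == INOTIFY_LINK:
                per_uid[uid] = per_uid.get(uid, 0) + 1
    return per_uid


def read_limit(path="/proc/sys/fs/inotify/max_user_instances"):
    """Return the configured per-user instance limit, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("limit:", read_limit())
    for uid, used in sorted(count_inotify_instances().items()):
        print(f"uid {uid}: {used} inotify instance(s)")
```

If one UID's count is close to the limit on the affected nodes, that supports the diagnosis above.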
Since we use GKE, we cannot manage worker nodes and alter sysctl
settings directly.
My questions are:
As noted here, "Sysctls with no namespace are called node-level sysctls. If you need to set them, you must manually configure them on each node’s operating system, or by using a DaemonSet with privileged containers".
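The DaemonSet approach from that quote could look roughly like the sketch below. It builds the manifest as a Python dict and prints it as JSON, which kubectl accepts in place of YAML. The DaemonSet name, namespace, image tags and the value 1024 are assumptions chosen for illustration, not recommendations from the documentation. A privileged init container runs sysctl -w once per node, and a pause container keeps the pod alive afterwards.

```python
import json


def inotify_sysctl_daemonset(max_instances=1024,
                             name="inotify-sysctl",      # hypothetical name
                             namespace="kube-system"):
    """Build a DaemonSet manifest that raises fs.inotify.max_user_instances on every node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Privileged init container: writes the node-level sysctl, then exits.
                    "initContainers": [{
                        "name": "set-sysctl",
                        "image": "busybox",  # assumed image; any image with sysctl works
                        "securityContext": {"privileged": True},
                        "command": [
                            "sysctl", "-w",
                            f"fs.inotify.max_user_instances={max_instances}",
                        ],
                    }],
                    # Placeholder container so the pod stays Running after the init step.
                    "containers": [{
                        "name": "pause",
                        "image": "registry.k8s.io/pause:3.9",  # assumed tag
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(inotify_sysctl_daemonset(), indent=2))
```

Saving the output to a file and running kubectl apply -f on it would deploy it. Pods that start on a fresh node before the DaemonSet pod has run will still see the old limit, so application pods may need to tolerate one restart.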
Regarding scheduling pods, there doesn't seem to be a way for the scheduler to take inotify usage or the number of pods into account when scheduling. The scheduler is only aware of available resources (CPU and memory) and pod specs such as pod or node affinity.
Attaining the kind of spread you are looking for will take a good deal of planning and the use of both resource requests and pod affinity/anti-affinity. You can review this.
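As a rough illustration of combining the two, the sketch below generates a pod spec fragment with a soft pod anti-affinity on kubernetes.io/hostname plus resource requests, and includes a small helper for estimating how many pods fit under the inotify limit. The label, image name, request sizes and the reserved count are all hypothetical placeholders, so treat the numbers as inputs to tune rather than as measured values.

```python
import json


def max_pods_for_inotify(limit=128, reserved=8, per_pod=2):
    """Estimate how many pods fit on a node before inotify instances run out.

    reserved is an assumed allowance for system daemons; per_pod is the
    number of instances each application container opens.
    """
    return max(0, (limit - reserved) // per_pod)


def spread_pod_spec_fragment(app_label, cpu_request="250m",
                             memory_request="256Mi", weight=100):
    """Build the affinity and resources parts of a pod spec as a dict."""
    return {
        "affinity": {
            "podAntiAffinity": {
                # Soft rule: prefer nodes that do not already run pods with this label.
                "preferredDuringSchedulingIgnoredDuringExecution": [{
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }],
            },
        },
        "containers": [{
            "name": app_label,
            "image": "example.invalid/aspnet-app:latest",  # placeholder image
            # Larger requests indirectly cap how many pods land on one node.
            "resources": {
                "requests": {"cpu": cpu_request, "memory": memory_request},
            },
        }],
    }


if __name__ == "__main__":
    print("pods per node under the inotify budget:", max_pods_for_inotify())
    print(json.dumps(spread_pod_spec_fragment("aspnet-app"), indent=2))
```

Sizing the requests so that a node's allocatable CPU or memory is exhausted before it reaches the estimated pod count gives the scheduler an indirect cap, while the anti-affinity term nudges replicas onto different hosts.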