I have a single-node Kubernetes cluster which shows 10 Gi memory and 3 CPU as available (out of a total of 16 Gi, 4 CPU) for running pods after cluster startup. I am then trying two different scenarios:
Scenario-1.
Running 3 pods individually with configs (Request, Limit) as:
Pod-A: (1 Gi, 3.3 Gi) and (1 CPU, 1 CPU)
Pod-B: (1 Gi, 3.3 Gi) and (1 CPU, 1 CPU)
Pod-C: (1 Gi, 3.3 Gi) and (1 CPU, 1 CPU)
In this scenario, the apps come up fine in their corresponding pods and work as expected.
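For reference, each pod is defined roughly like the sketch below (pod name, container name, and image are placeholders; Scenario 1 values are shown, and Scenario 2 only changes the limit values):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-a            # placeholder; Pod-B and Pod-C are identical apart from the name
spec:
  containers:
  - name: app            # placeholder container name and image
    image: my-app:latest
    resources:
      requests:
        memory: "1Gi"
        cpu: "1"
      limits:
        memory: "3.3Gi"  # Scenario 2 uses "10Gi" here
        cpu: "1"         # Scenario 2 uses "3" here
```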
Scenario-2.
Running 3 pods individually with configs (Request, Limit) as:
Pod-A: (1 Gi, 10 Gi) and (1 CPU, 3 CPU)
Pod-B: (1 Gi, 10 Gi) and (1 CPU, 3 CPU)
Pod-C: (1 Gi, 10 Gi) and (1 CPU, 3 CPU)
In the second scenario, the apps come up in their corresponding pods but fail randomly once some load is put on any of them, i.e. sometimes Pod-A goes down, at other times Pod-B or Pod-C. At no point am I able to keep all three pods running together.
The only event I can see for the failed pod is the one below.
"The warning which is available in node logs says "Warning CheckLimitsForResolvConf 1m (x32 over 15m) kubelet, xxx.net Resolv.conf file '/etc/resolv.conf' contains search line consisting of more than 3 domains
!.".
With only this information in the logs, I am not able to figure out the actual reason for the random pod failures.
Can anyone help me understand if there is anything wrong with the configs, or is there something else I am missing?
Thanks
When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each of the resource types: the amount of CPU and memory it can provide for Pods. The scheduler ensures that, for each resource type, the sum of the resource requests of the scheduled Containers is less than the capacity of the node.
Note: Although actual memory or CPU resource usage on nodes is very low, the scheduler still refuses to place a Pod on a node if the capacity check fails. This protects against a resource shortage on a node when resource usage later increases, for example, during a daily peak in request rate.
So, after scheduling, if a Container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.
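Applying that to your numbers: in both scenarios the requests add up to 3 Gi of memory and 3 CPU, which fits within the 10 Gi / 3 CPU the node shows as available, so all three pods get scheduled. The difference is in the limits: in Scenario 1 the memory limits total about 9.9 Gi, still within what the node can provide, while in Scenario 2 they total 30 Gi, roughly three times the node's available memory. Once the apps take load and their real memory usage grows toward those limits, the node runs out of memory and the kubelet evicts whichever pod it picks first, which matches the random failures you describe. (CPU overcommit only causes throttling, so it is the memory overcommit that matters here.)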
Refer to the default hard eviction threshold values below.
The kubelet has the following default hard eviction thresholds (a kubelet config sketch with the same values follows this list):
memory.available<100Mi
nodefs.available<10%
nodefs.inodesFree<5%
imagefs.available<15%
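As a sketch, this is how those defaults map onto the kubelet configuration file if you ever want to inspect or adjust them. The file path is an assumption and depends on how your kubelet is started; this form only applies when the kubelet runs with --config:

```yaml
# Assumed path, e.g. /var/lib/kubelet/config.yaml on kubeadm-based setups
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
```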
You should track your node conditions while the load is running.
The kubelet maps one or more eviction signals to a corresponding node condition.
If a hard eviction threshold has been met, or a soft eviction threshold has been met independent of its associated grace period, the kubelet reports a condition that reflects that the node is under pressure, i.e. MemoryPressure or DiskPressure.
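For example, when the node comes under memory pressure you would see something like the following in the node's status (roughly as reported by kubectl get node <node-name> -o yaml or kubectl describe node; the exact reason/message strings can vary by version):

```yaml
status:
  conditions:
  - type: MemoryPressure
    status: "True"      # "False" on a healthy node
    reason: KubeletHasInsufficientMemory
    message: kubelet has insufficient memory available
  - type: DiskPressure
    status: "False"
    reason: KubeletHasNoDiskPressure
    message: kubelet has no disk pressure
```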