I am trying to understand Deployment scaling with a LoadBalancer. I created a Kubernetes cluster on Google Cloud with 6 nodes, each an n1-highmem-2 (2 cores and 13 GB RAM), and launched 5 Pods behind 1 LoadBalancer Service. Each Pod has a limit of 1 CPU and 5.1 Gi of memory. When I tried to scale my Deployment to 10 Pods, I got an error saying my CPU count is too low. How? My cluster has 12 cores and 78 GB of RAM in total. Here is my YAML file:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: production
  labels:
    run: mypod
spec:
  type: LoadBalancer
  ports:
  - port: 8050
    targetPort: 8050
    protocol: TCP
    name: http
  selector:
    run: mypod
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: production
spec:
  selector:
    matchLabels:
      run: mypod
  replicas: 5
  template:
    metadata:
      namespace: production
      labels:
        run: mypod
    spec:
      containers:
      - name: test
        image: my-hello-world
        ports:
        - containerPort: 8050
        resources:
          limits:
            cpu: "1"
            memory: "5.1Gi"
          requests:
            cpu: "1"
            memory: "500Mi"
Other containers may be requesting CPU from your cluster (including the `kube-system` ones).
You are requesting 1 CPU for each `test` container replica, but keep in mind that each container must be scheduled on one of the nodes, and every single node has only 2 CPUs available. That means that if a node runs even a single `kube-system` container requesting any amount of CPU, that node cannot fit more than one `test` container. E.g.:
Node 1:
- calico-node-rqcw7 - 250m
- test-83h1d - 1000m
- test-kd93h - 1000m # <----- This one cannot be scheduled because the node already is using 1250m
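The capacity math above can be sketched as a toy first-fit placement. This is only an illustration of why 10 replicas don't fit, not the real scheduler, and the ~250m per-node system-pod reservation is an assumed, illustrative figure:

```python
def schedulable_replicas(free_millicpu, request_millicpu):
    """First-fit count of how many pods with a given CPU request
    fit onto nodes with the given free CPU (in millicores)."""
    free = list(free_millicpu)
    placed = 0
    while True:
        for i, f in enumerate(free):
            if f >= request_millicpu:
                free[i] -= request_millicpu  # reserve the request on this node
                placed += 1
                break
        else:  # no node has room for another pod
            return placed

# 6 nodes with 2000m each, minus an assumed ~250m of system pods per node,
# and pods requesting a full core (1000m) each:
print(schedulable_replicas([2000 - 250] * 6, 1000))  # 6 -- not 10
```

With a 1000m request and only 1750m free per node, each node fits exactly one replica, so the cluster tops out at 6 even though it has 12 cores in aggregate.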
Run the `kubectl describe nodes` command and you should figure out which containers are scheduled on which nodes, including their CPU requests.
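If the workload does not actually need a full reserved core per replica, one option is to lower the CPU request while keeping the limit, so more replicas fit on each node. This is a sketch of the Deployment's `resources` block; the `500m` value is an assumption you would tune to the app's real usage:

```yaml
        resources:
          limits:
            cpu: "1"          # each pod may still burst up to a full core
            memory: "5.1Gi"
          requests:
            cpu: "500m"       # scheduler reserves only half a core per replica
            memory: "500Mi"
```

With 500m requests, each 2-core node can hold multiple replicas even alongside `kube-system` pods, so scaling to 10 replicas becomes feasible on this cluster.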