I have 6 nodes. All of them have the label "group:emp"; 4 of them also have the label "ikind:spot" and 2 of them have the label "ikind:normal".
I used the Deployment YAML below to schedule one pod on a normal node and the others on spot nodes, but it didn't work.
I started increasing the number of pods from 1 to 6, but as soon as it reached 2, all the pods were scheduled on the spot nodes.
```yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: pod-test
  namespace: emp
  labels:
    app: pod-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pod-test
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: pod-test
    spec:
      containers:
      - name: pod-test
        image: k8s.gcr.io/busybox
        args: ["sh","-c","sleep 60000"]
        imagePullPolicy: Always
        resources:
          requests:
            cpu: 10m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 200Mi
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: group
                operator: In
                values:
                - emp
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 70
            preference:
              matchExpressions:
              - key: ikind
                operator: In
                values:
                - spot
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - pod-test
              topologyKey: ikind
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - pod-test
            topologyKey: "kubernetes.io/hostname"
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      dnsPolicy: ClusterFirst
      schedulerName: default-scheduler
```
I added a second preferred node affinity term for ikind: normal with weight 30, and it worked. To rule out the influence of the number of nodes, I then swapped the weights of normal and spot:
When replicas is 1, there is 1 pod on a normal node.
When replicas is 2, there is 1 pod on a normal node and 1 pod on a spot node.
When replicas is 3, there are 2 pods on normal nodes and 1 pod on a spot node.
```yaml
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 70
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - normal
- weight: 30
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - spot
```
If you want to deploy pods on all nodes, then you have to change your preferredDuringSchedulingIgnoredDuringExecution.
Change
```yaml
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 70
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - spot
```
to
```yaml
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 70
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - spot
      - normal
```
Now the pods will be deployed on both kinds of nodes, ikind:spot and ikind:normal; before, it was only spot.
I have tested it on 3 GKE nodes and everything seems to work just fine.
```
pod-test-54dc97fbcb-9hvvm 1/1 Running gke-cluster-1-default-pool-1ffaf1b8-gmhb <none> <none>
pod-test-54dc97fbcb-k2hv2 1/1 Running gke-cluster-1-default-pool-1ffaf1b8-gmhb <none> <none>
pod-test-54dc97fbcb-nqd97 1/1 Running gke-cluster-1-default-pool-1ffaf1b8-7c25 <none> <none>
pod-test-54dc97fbcb-zq9df 1/1 Running gke-cluster-1-default-pool-1ffaf1b8-jk6t <none> <none>
pod-test-54dc97fbcb-zvwhk 1/1 Running gke-cluster-1-default-pool-1ffaf1b8-7c25 <none> <none>
```
It's well described in the Kubernetes documentation on node affinity:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
```
This node affinity rule says the pod can only be placed on a node with a label whose key is kubernetes.io/e2e-az-name and whose value is either e2e-az1 or e2e-az2. In addition, among nodes that meet that criteria, nodes with a label whose key is another-node-label-key and whose value is another-node-label-value should be preferred.
Since both the spot nodes and the normal nodes have the group=emp label, the Kubernetes scheduler can select any node with the label group=emp as a candidate, which could be a spot node or a normal node. After that, it applies preferredDuringSchedulingIgnoredDuringExecution to prefer a spot node when scheduling the pod. Depending on the capacity of the spot nodes, the scheduler may not be able to place the pod there, so it falls back to a normal node as an alternative. So you could either use different labels for normal and spot nodes and select based on the spot node's label, or you can use requiredDuringSchedulingIgnoredDuringExecution. This puts a hard constraint on the scheduler to schedule the pod on spot nodes only.
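For illustration, here is a minimal sketch of what that hard constraint could look like. It assumes the label keys from the question (group and ikind) and is meant as a fragment of spec.template.spec in the Deployment. It has not been tested on the asker's cluster.

```yaml
# Sketch only: goes under spec.template.spec of the Deployment.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Expressions inside a single term are ANDed,
      # so a node must carry both labels to be eligible.
      - matchExpressions:
        - key: group
          operator: In
          values:
          - emp
        - key: ikind
          operator: In
          values:
          - spot
```

Keep in mind that the Deployment in the question also has a required podAntiAffinity on kubernetes.io/hostname, so at most one replica fits per node. With 4 spot nodes, any replica beyond the fourth would stay Pending under this hard constraint instead of falling back to a normal node.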