TL;DR: nodeSelector ignores nodes from the other NodePool. How can I distribute pods across multiple NodePools using a label nodeSelector or another technique?
I have two node pools defined like this:
...
# Spot node pool
resource "azurerm_kubernetes_cluster_node_pool" "aks_staging_np_compute_spot" {
  name      = "computespot"
  (...)
  vm_size   = "Standard_F8s_v2"
  max_count = 2
  min_count = 2
  (...)
  priority        = "Spot"
  eviction_policy = "Delete"
  (...)
  node_labels = {
    "pool_type" = "compute"
  }
}
# Regular node pool
resource "azurerm_kubernetes_cluster_node_pool" "aks_staging_np_compute_base" {
  name      = "computebase"
  (...)
  vm_size   = "Standard_F8s_v2"
  max_count = 2
  min_count = 2
  node_labels = {
    "pool_type" = "compute"
  }
}
Both pools are deployed in AKS and all of their nodes are up and in a Ready state. Please note two things:

- both pools carry the same node label: pool_type: compute
- both pools use the same VM size: Standard_F8s_v2

(There are also 20 other nodes with different labels in my cluster, which are not important here.)
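To double-check that both pools really expose that label to the Kubernetes API (and not only in the Terraform definition), a quick verification could look like this; the exact output will of course differ:

# List every node carrying the shared label - nodes from both
# computespot and computebase should show up here.
kubectl get nodes -l pool_type=compute

# Show the full label set of those nodes to confirm pool_type=compute
# is present on the computebase nodes as well.
kubectl get nodes -l pool_type=compute --show-labels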
Then I've got a deployment like this (irrelevant lines omitted for brevity):
apiVersion: apps/v1
kind: Deployment
metadata:
  (...)
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  template:
    (...)
    spec:
      nodeSelector:
        pool_type: compute
      (...)
      containers:
      (...)
There is also an entry in tolerations for accepting Azure spot instances, and it apparently works:
tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"
The problem is that the app gets deployed only onto one node pool (computespot in this case) and never touches the other one (computebase), even though the label and the size of the individual nodes are the same. Pods land only on the computespot nodes, one pod per node, and the remaining replicas stay Pending with:

0/24 nodes are available: 14 Insufficient cpu, 17 Insufficient memory, 4 node(s) didn't match node selector.
That's an absolute lie, because I can see the computebase nodes just sitting there entirely empty. How can this be solved?
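A minimal way to cross-check what the scheduler is reporting, before changing anything (the node and pod names below are placeholders):

# Inspect one of the idle computebase nodes: its labels, taints, and
# allocatable CPU/memory versus the pod's resource requests.
kubectl describe node <computebase-node-name>

# Show the full FailedScheduling event for one of the Pending pods.
kubectl describe pod <pending-pod-name>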
Found a solution using node affinity.
spec:
  # This didn't work:
  #
  # nodeSelector:
  #   pool_type: compute
  #
  # But this does:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: pool_type
            operator: In
            values:
            - compute
I don't know the reason, because we're still dealing with a single label. If someone knows, please share.
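As a sanity check (not part of the original answer), once the affinity block is in place the spread can be confirmed by listing the pods together with the node each one landed on; app=myapp is the selector label from the deployment above:

# With the nodeAffinity in place, both computespot and computebase
# nodes should show up in the NODE column.
kubectl get pods -l app=myapp -o wide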