Jenkins is running on EKS, with affinity rules in place on both the Jenkins main and worker pods.
The idea is to prevent the Jenkins worker pods from running on the same EKS worker nodes as the Jenkins main pod.
The following rules work until resource limits are pushed, at which point the Jenkins worker pods are scheduled onto the same EKS worker nodes as the Jenkins main pod.
Are there affinity / anti-affinity rules to prevent this from happening?
The rules in place for Jenkins main:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions: # assign to eks apps worker group
            - key: node.app/group
              operator: In
              values:
                - apps
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions: # don't assign to a node running jenkins main
            - key: app.kubernetes.io/name
              operator: In
              values:
                - jenkins
            - key: app.kubernetes.io/component
              operator: In
              values:
                - main
        topologyKey: kubernetes.io/hostname
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions: # try not to assign to a node already running a jenkins worker
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - jenkins
              - key: app.kubernetes.io/component
                operator: In
                values:
                  - worker
          topologyKey: kubernetes.io/hostname
The rules in place for Jenkins worker:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions: # assign to eks apps worker group
            - key: node.app/group
              operator: In
              values:
                - apps
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions: # don't assign to a node running jenkins main
            - key: app.kubernetes.io/name
              operator: In
              values:
                - jenkins
            - key: app.kubernetes.io/component
              operator: In
              values:
                - main
        topologyKey: kubernetes.io/hostname
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions: # try not to assign to a node already running a jenkins worker
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - jenkins
              - key: app.kubernetes.io/component
                operator: In
                values:
                  - worker
          topologyKey: kubernetes.io/hostname
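With both sets of rules applied, one way to sanity-check the behaviour is to list the pods together with their node assignments and labels (same namespace and aws-vault profile as the output further down), so you can see at a glance whether a worker has landed on the main pod's node:
> aws-vault exec nonlive-build -- kubectl get po -n cicd -o wide --show-labels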
So, lo and behold, guess what... the main pod labels weren't set correctly, which meant the required podAntiAffinity term on the workers matched no pods and the scheduler was free to co-locate them with main.
Now you can see the selector labels displayed here:
> aws-vault exec nonlive-build -- kubectl get po -n cicd --show-labels
NAME                       READY   STATUS    RESTARTS   AGE    LABELS
jenkins-6597db4979-khxls   2/2     Running   0          4m8s   app.kubernetes.io/component=main,app.kubernetes.io/instance=jenkins
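A quick way to double-check that the required anti-affinity term actually matches the main pod is to query with the same label selector the worker rule uses; if this returned nothing, the rule would match no pods and the scheduler would be free to co-locate the workers with main:
> aws-vault exec nonlive-build -- kubectl get po -n cicd -l app.kubernetes.io/name=jenkins,app.kubernetes.io/component=main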
To get these labels applied, new entries were added to the values file:
main:
  metadata:
    labels:
      app.kubernetes.io/name: jenkins
      app.kubernetes.io/component: main
And the Helm _helpers.tpl template was updated accordingly:
{{- define "jenkins.selectorLabels" -}}
{{- /* standard selector label plus any extra labels set under main.metadata.labels in values */ -}}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Values.main.metadata.labels }}
{{- range $k, $v := .Values.main.metadata.labels }}
{{ $k }}: {{ $v }}
{{- end }}
{{- end }}
{{- end }}
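Since this is a chart-level change, the rendered labels can also be checked before deploying by templating the chart locally; the chart path below is a placeholder for wherever the Jenkins chart lives:
> helm template jenkins ./jenkins -n cicd | grep app.kubernetes.io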