We're running a Kubernetes cluster with an autoscaler that, as far as I can tell, works perfectly most of the time. When we raise the replica count of a deployment beyond what the cluster's current resources can handle, the autoscaler catches it and scales up. Likewise, it scales down when we need fewer resources.
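For instance, a replica bump like the sketch below (the deployment name, image, and values are made up for illustration, not our real manifest) normally just leaves the extra pods in Pending for a minute or two until the autoscaler adds a node:

# Illustrative only: a hypothetical deployment whose replica bump exceeds current capacity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-worker          # hypothetical name
  namespace: airflow
spec:
  replicas: 12                  # raised past what the current nodes can fit
  selector:
    matchLabels:
      app: example-worker
  template:
    metadata:
      labels:
        app: example-worker
    spec:
      containers:
      - name: worker
        image: example.com/worker:latest   # placeholder image
        resources:
          requests:
            memory: 4Gi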
That was true until today, when some of the pods in our Airflow deployment stopped working because they can't get the resources they need. Rather than triggering a cluster scale-up, the pods immediately fail or are evicted for requesting or using more resources than are available. See the YAML output of one of the failing pods below. The pods also never appear as Pending: they skip straight from launch to their failed state.
Is there something I'm missing, such as some kind of retry tolerance, that would keep the pod in Pending so it waits for a scale-up?
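For quick reference, the failing pods request and limit memory like this (same values as in the full dump below):

resources:
  limits:
    memory: 28Gi
  requests:
    memory: 28Gi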
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2019-12-02T22:41:19Z"
  name: ingest-customer-ff06ae4d
  namespace: airflow
  resourceVersion: "32545690"
  selfLink: /api/v1/namespaces/airflow/pods/ingest-customer-ff06ae4d
  uid: dba8b4c1-1554-11ea-ac6b-12ff56d05229
spec:
  affinity: {}
  containers:
  - args:
    - scripts/fetch_and_run.sh
    env:
    - name: COMPANY
      value: acme
    - name: ENVIRONMENT
      value: production
    - name: ELASTIC_BUCKET
      value: customer
    - name: ELASTICSEARCH_HOST
      value: <redacted>
    - name: PATH_TO_EXEC
      value: tools/storage/store_elastic.py
    - name: PYTHONWARNINGS
      value: ignore:Unverified HTTPS request
    - name: PATH_TO_REQUIREMENTS
      value: tools/requirements.txt
    - name: GIT_REPO_URL
      value: <redacted>
    - name: GIT_COMMIT
      value: <redacted>
    - name: SPARK
      value: "true"
    image: dkr.ecr.us-east-1.amazonaws.com/spark-runner:dev
    imagePullPolicy: IfNotPresent
    name: base
    resources:
      limits:
        memory: 28Gi
      requests:
        memory: 28Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /mnt/ssd
      name: tmp-disk
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-cgpcc
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: tmp-disk
  - name: default-token-cgpcc
    secret:
      defaultMode: 420
      secretName: default-token-cgpcc
status:
  conditions:
  - lastProbeTime: "2019-12-02T22:41:19Z"
    lastTransitionTime: "2019-12-02T22:41:19Z"
    message: '0/9 nodes are available: 9 Insufficient memory.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable