Horizontal Pod Autoscaling (HPA) with an initContainer that requires a Job

8/9/2021

I have a specific scenario where I'd like to have a deployment controlled by horizontal pod autoscaling. To handle database migrations in pods when pushing a new deployment, I followed this excellent tutorial by Andrew Lock here.

In short, you must define an initContainer that waits for a Kubernetes Job to complete a process (like running db migrations) before the new pods can run.

This works well. However, I'm not sure how to handle HPA after the initial deployment: if the autoscaler decides another Pod is needed on my node, the initContainer defined in my Deployment requires the Job to be deployed and run, but since Jobs are one-off processes, the new Pod cannot initialize and run properly (the ttlSecondsAfterFinished attribute removes the Job anyway).

How can I define an initContainer that runs when I deploy my app, so my database migrations run in a Job, while still allowing HPA to dynamically add Pods without requiring that initContainer?

Here's what my deployment looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: graphql-pod
  template:
    metadata:
      labels:
        app: graphql-pod
    spec:
      initContainers:
        - name: wait-for-graphql-migration-job
          image: groundnuty/k8s-wait-for:v1.4 # This is an image that waits for a process to complete
          args:
            - job
            - graphql-migration-job # this job is defined next
      containers:
        - name: graphql-container
          image: image(graphql):tag(graphql)

The following Job is also deployed

apiVersion: batch/v1
kind: Job
metadata:
  name: graphql-migration-job
spec:
  ttlSecondsAfterFinished: 30
  template:
    spec:
      containers:
      - name: graphql-migration-container
        image: image(graphql):tag(graphql)
        command: ["npm", "run", "migrate:reset"]
      restartPolicy: Never
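
For context, the wait that the groundnuty/k8s-wait-for init container performs is roughly equivalent to blocking on the Job's Complete condition. A hedged sketch of the same check with plain kubectl (the timeout value is an illustrative choice):

```shell
# Block until the Job reports condition Complete, or give up after 5 minutes.
# Once ttlSecondsAfterFinished has deleted the Job, this can never succeed --
# which is exactly why newly autoscaled Pods crash-loop.
kubectl wait --for=condition=complete --timeout=300s job/graphql-migration-job
```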

So basically what happens is:

  • I deploy these two resources to my node
  • Job is initialized
  • initContainer on Pod waits for Job to complete using an image called groundnuty/k8s-wait-for:v1.4
  • Job completes
  • initContainer completes
  • Pod initializes
  • (after the 30-second TTL) Job is removed from the node

(LOTS OF TRAFFIC)

  • HPA realizes a need for another pod
  • initContainer for the NEW Pod starts, but can't complete because the Job no longer exists
  • ...CrashLoopBackOff

Would love any insight on the proper way to handle this scenario!

-- Jordan Lewallen
horizontal-pod-autoscaling
kubernetes
kubernetes-pod

1 Answer

8/11/2021

There is, unfortunately, no simple Kubernetes feature to resolve your issue.

I recommend extending your deployment tooling/scripts to separate the migration Job from your Deployment. During the deploy process, you first execute the migration Job, wait for it to complete, and then roll out your Deployment. With the Job decoupled, the HPA can scale your Pods freely.

There are multiple ways to achieve this:

  • Have a bash (or similar) script first execute the Job, wait for it to complete, and then update your Deployment
  • Use more fully featured deployment tooling like Helm, which lets you add a 'pre-install hook' to your Job so it runs to completion whenever you deploy your application
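
For the first option, a minimal deploy script might look like this sketch (the manifest filenames are hypothetical, and the 300-second timeout is an illustrative choice):

```shell
#!/usr/bin/env bash
set -euo pipefail

# 1. Run the migration Job. Delete any previous instance first,
#    since most fields of a Job spec are immutable.
kubectl delete job graphql-migration-job --ignore-not-found
kubectl apply -f graphql-migration-job.yaml   # hypothetical filename

# 2. Block until the migrations finish; fail the deploy on timeout.
kubectl wait --for=condition=complete --timeout=300s job/graphql-migration-job

# 3. Roll out the Deployment, now defined WITHOUT the initContainer,
#    so HPA can add Pods at any time without waiting on a Job.
kubectl apply -f graphql-deployment.yaml      # hypothetical filename
```

For the Helm option, annotating the Job turns it into a pre-install/pre-upgrade hook that Helm runs to completion before (re)installing the rest of the chart — so the Deployment no longer needs the initContainer at all:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: graphql-migration-job
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      containers:
        - name: graphql-migration-container
          image: image(graphql):tag(graphql)
          command: ["npm", "run", "migrate:reset"]
      restartPolicy: Never
```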
-- Lukas Eichler
Source: StackOverflow