I have a GKE cluster running (v1.12.8-gke.10). I am trying to set up a specific app that will work the way I want, but I can't seem to find any documentation to piece it together. What I am trying to accomplish may not even be possible.
I would like to set up a Deployment (1 pod) using a Python Docker image that runs a looped Python script performing checks. If the checks all pass, I would like this deployment/pod to start/scale another deployment that will do a simple task and then kill the pod that was started.
I am not sure if I should be using a Deployment or if I need an HPA mixed in somewhere in this process. I have also looked at KEDA, but it only supports specific triggers and doesn't fit what I am trying to do.
I am expecting two different deployments.
Deploy A = 1 pod constantly running a Python script that checks whether it should send any commands to Deploy B.
Deploy B = listens for Deploy A to reach out and tell it to start a pod to run a task. After the task is completed, the pod terminates.
The workflow you describe is possible. The controller would need access to the Kubernetes API, probably using the official Python client. When it received a request, it would create a Job, probably passing information about what to run as command-line arguments. The process inside the Job's Pod would do the work and then exit normally. You'd then be responsible for monitoring the Job's status and noticing when it finished; you wouldn't have to explicitly scale it down, but deleting the completed Job would be polite.
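A minimal sketch of that controller side, using the official `kubernetes` Python client; the image name, arguments, and namespace here are placeholders you'd replace with your own:

```python
from kubernetes import client, config

def launch_task():
    # Inside the cluster this picks up the Pod's ServiceAccount token;
    # use config.load_kube_config() when running locally against a kubeconfig.
    config.load_incluster_config()
    batch = client.BatchV1Api()

    job = client.V1Job(
        metadata=client.V1ObjectMeta(generate_name="simple-task-"),
        spec=client.V1JobSpec(
            backoff_limit=1,
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="task",
                            image="gcr.io/my-project/task:latest",  # placeholder image
                            args=["--run", "simple-task"],          # placeholder arguments
                        )
                    ],
                )
            ),
        ),
    )
    created = batch.create_namespaced_job(namespace="default", body=job)
    return created.metadata.name

def reap_if_finished(name):
    # Poll the Job's status; delete it once it has succeeded or failed.
    batch = client.BatchV1Api()
    status = batch.read_namespaced_job(name, "default").status
    if status.succeeded or status.failed:
        batch.delete_namespaced_job(
            name, "default",
            body=client.V1DeleteOptions(propagation_policy="Background"),
        )
        return True
    return False
```

For this to work in-cluster, the controller's Pod also needs a ServiceAccount bound (via RBAC) to a Role that allows creating, reading, and deleting Jobs.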
The architecture I'd generally recommend here would be to use a job queue like RabbitMQ. You'd have a Deployment for your controller, a separate Deployment for your worker, and a StatefulSet to run the job queue (or perhaps something like the `stable/rabbitmq` Helm chart). None of these would directly interact with the Kubernetes API. When a new request came in, the controller would post a message to RabbitMQ, and when the worker received a message off the queue, it would do the job.
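A rough sketch of the two sides with the `pika` client; the hostname `rabbitmq` (the queue's Service name) and the queue name `tasks` are assumptions for illustration:

```python
import json
import pika

QUEUE = "tasks"

def publish(task):
    """Controller side: push a task description onto the queue."""
    conn = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = conn.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    channel.basic_publish(
        exchange="",
        routing_key=QUEUE,
        body=json.dumps(task),
        properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    )
    conn.close()

def worker():
    """Worker side: block waiting for tasks and process them one at a time."""
    conn = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = conn.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    channel.basic_qos(prefetch_count=1)  # one unacked message per worker at a time

    def handle(ch, method, properties, body):
        task = json.loads(body)
        # ... do the actual work here ...
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=QUEUE, on_message_callback=handle)
    channel.start_consuming()
```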
This has the advantage of being easier to develop locally (you can just run RabbitMQ on your laptop or in a container, but getting access to the Kubernetes API is harder). If you suddenly get swamped with a huge number of job submissions, you won't try to overload the cluster with thousands of Jobs; they'll back up in RabbitMQ and you can work through them one at a time. If you want the cluster to do more, you can `kubectl scale deployment` to get more workers. If you run out of jobs, the worker pod(s) will sit idle, but that's not really a problem.