I want to run a CronJob on my GKE in order to perform a batch operation on a daily basis. The ideal scenario would be for my cluster to scale to 0 nodes when the job is not running and to dynamically scale to 1 node and run the job on it every time the schedule is met.
I am first trying to achieve this by using a simple CronJob found in the kubernetes doc that only prints the current time and terminates.
I first created a cluster with the following command:
gcloud container clusters create $CLUSTER_NAME \
--enable-autoscaling \
--min-nodes 0 --max-nodes 1 --num-nodes 1 \
--zone $CLUSTER_ZONE
Then, I created a CronJob with the following description:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: Never
The job is scheduled to run every hour (at minute 1) and prints the current time before terminating.
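For completeness, this is how I deploy and check it (the filename hello-cronjob.yaml is just what I called it locally):

```shell
# Create the CronJob from the manifest above
kubectl apply -f hello-cronjob.yaml

# Confirm it was registered, then watch for the Jobs it spawns
kubectl get cronjob hello
kubectl get jobs --watch
```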
First, I wanted to create the cluster with 0 nodes, but setting --num-nodes 0 results in an error. Why is that? Note that I can manually scale the cluster down to 0 nodes after it has been created.
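For reference, the manual scale-down I mention is done with gcloud container clusters resize:

```shell
# Manually resize the cluster's node pool to 0 nodes
gcloud container clusters resize $CLUSTER_NAME \
  --num-nodes 0 \
  --zone $CLUSTER_ZONE
```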
Second, if my cluster has 0 nodes, the job won't be scheduled because the cluster does not scale to 1 node automatically but instead gives the following error:
Cannot schedule pods: no nodes available to schedule pods.
Third, if my cluster has 1 node, the job runs normally, but afterwards the cluster stays at 1 node instead of scaling down to 0. I let my cluster run through two successive jobs and it did not scale down in between; one hour should be more than long enough for it to do so.
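Two commands that can help show why a node is not being removed (the ConfigMap name comes from the open-source cluster autoscaler; I am assuming GKE's managed autoscaler exposes the same one):

```shell
# Recent events often say why scale-down is blocked (e.g. system pods on the node)
kubectl get events --sort-by=.metadata.creationTimestamp

# Autoscaler status ConfigMap (name assumed from the OSS cluster autoscaler)
kubectl describe configmap cluster-autoscaler-status -n kube-system
```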
What am I missing?
EDIT: I've got it to work and detailed my solution here.
I do not think it's a good idea to tweak GKE for this kind of job. If you really need 0 instances, I'd suggest one of the approaches below.
Update:
Note: Beginning with Kubernetes version 1.7, you can specify a minimum size of zero for your node pool. This allows your node pool to scale down completely if the instances within aren't required to run your workloads.
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
Old answer:
Scaling the entire cluster to 0 is not supported, because you always need at least one node for system pods.
You could create one node pool with a small machine type for the system pods, and an additional node pool with a bigger machine where you would run your workload. This way the second node pool can scale down to 0 while you still have room for the system pods.
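A sketch of that two-pool setup (the pool name batch-pool and the machine types are my own choices, not from the question):

```shell
# Small default pool that stays at 1 node to host system pods
gcloud container clusters create $CLUSTER_NAME \
  --machine-type g1-small \
  --num-nodes 1 \
  --zone $CLUSTER_ZONE

# Second pool for the batch workload; allowed to autoscale down to 0
gcloud container node-pools create batch-pool \
  --cluster $CLUSTER_NAME \
  --machine-type n1-standard-4 \
  --enable-autoscaling \
  --min-nodes 0 --max-nodes 1 \
  --num-nodes 1 \
  --zone $CLUSTER_ZONE
```

The Job's pod spec can then be pinned to the second pool with a nodeSelector on the cloud.google.com/gke-nodepool: batch-pool label, so it never lands on the small pool.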
After attempting this, @xEc mentions: "Also note that there are scenarios in which my node pool wouldn't scale, like if I created the pool with an initial size of 0 instead of 1."
Initial suggestion:
Perhaps you could run a micro VM with cron to scale the cluster up, submit a Job (instead of a CronJob), wait for it to finish, and then scale it back down to 0?
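A rough sketch of that script (the manifest name, Job name, and timeout are illustrative):

```shell
#!/bin/sh
# Run from a small VM via cron, e.g.: 0 3 * * * /opt/batch/run-job.sh

# 1. Scale the node pool up so the Job has somewhere to run
gcloud container clusters resize $CLUSTER_NAME \
  --num-nodes 1 --zone $CLUSTER_ZONE --quiet

# 2. Submit a one-off Job and block until it completes
kubectl apply -f batch-job.yaml
kubectl wait --for=condition=complete job/hello --timeout=30m

# 3. Scale back down to zero
gcloud container clusters resize $CLUSTER_NAME \
  --num-nodes 0 --zone $CLUSTER_ZONE --quiet
```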