I would like to use a regional Kubernetes cluster as the cluster underlying my Cloud Composer environments. According to this question from October 2018, the feature was not available as part of Composer at that time:
Google Cloud Composer with regional kubernetes cluster
The documentation, however, seems to imply that there is a way to set up an environment across different zones: "For simple use cases, you can create one environment in one region. For complex use cases, you can create multiple environments within a single region or across multiple regions".
Are there examples of this type of setup? Alternatively, is the documentation referring to multiple separate Composer environments, and if so, how would you work with the scheduler to avoid running identical jobs across the multiple Composer deployments?
https://cloud.google.com/composer/docs/concepts/overview#environments
I can’t comment on the specifics of Cloud Composer, but I think if you understand GKE Regional Clusters, it will help you understand.
By default, GKE uses "zonal" clusters, where each node-pool (and therefore its nodes) belongs to a single zone (like us-central1-a).
In "regional" clusters you still have a single cluster. However, each GKE node-pool you create is replicated to 3 zones in that region (for example us-central1-a, -b, and -c).
So when you create a "regional" cluster with 2 nodes per zone, you will get 6 nodes in your cluster. These nodes come from the node-pool's 3 replicas, one in each of 3 different zones in the same region.
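To make the arithmetic concrete, here is a minimal sketch, assuming the node count you request is applied once per zone. The function name total_nodes is made up for illustration, and the zone list simply reuses the example zones above; a real cluster's zones depend on how it is configured.

```python
def total_nodes(nodes_per_zone: int, zones: list[str]) -> int:
    """Total node count when the requested count is replicated to every zone."""
    return nodes_per_zone * len(zones)


# Example zones taken from the text above (illustrative only).
zones = ["us-central1-a", "us-central1-b", "us-central1-c"]

# Requesting 2 nodes, replicated across 3 zones.
print(total_nodes(2, zones))  # 6
```

All of these nodes still register with the same single cluster, which is why the workloads on it are only scheduled once.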
If Cloud Composer runs on a GKE cluster, then with a "regional" cluster it will still see the whole thing as a single cluster, so I'm guessing you would not end up running the same job multiple times.