Druid & GKE - GKE node upgrade makes Druid historical pods re-download all segments from the GCS bucket

7/8/2019

We have a StatefulSet for a service (Druid historicals) that caches a lot of data (multiple TB) on local SSDs. The service has a one-to-one mapping with nodes via pod anti-affinity. When we upgrade the GKE version, the historical pods migrate to a new set of nodes (a new GKE node pool), which means they start up with empty local disks and then take a while (~5 to 6 hours) to refill their caches. Ideally, we want planned node replacements (e.g., a GKE node pool upgrade) to happen one node at a time, waiting until the pod on the new node has fully refilled its cache before rolling out the next node. Could anyone suggest how we can make sure the data is fully downloaded from the deep-storage bucket before moving on to the next node upgrade, or is there any way to avoid re-downloading all the data from GCS?
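One direction we have been considering (a sketch only, not something we have running) is to gate the node-by-node rollout on Druid's coordinator load-status endpoint, `GET /druid/coordinator/v1/loadstatus`, which reports the percentage of segments loaded per datasource. The coordinator address below is a placeholder for our cluster:

```python
import json
import time
import urllib.request

# Placeholder address; substitute the real coordinator service endpoint.
COORDINATOR = "http://druid-coordinator:8081"


def all_loaded(status):
    """Return True when every datasource reports 100% of its segments loaded."""
    return bool(status) and all(pct >= 100.0 for pct in status.values())


def wait_for_full_load(poll_seconds=60):
    """Block until the coordinator reports all segments served by historicals.

    Intended to run between node upgrades, so the next node is only
    replaced after the previous historical has refilled its cache.
    """
    while True:
        url = f"{COORDINATOR}/druid/coordinator/v1/loadstatus"
        with urllib.request.urlopen(url) as resp:
            status = json.load(resp)  # e.g. {"my_datasource": 87.5, ...}
        if all_loaded(status):
            return
        time.sleep(poll_seconds)
```

The idea would be to call `wait_for_full_load()` from whatever script drives the one-node-at-a-time upgrade, before cordoning and replacing the next node.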

-- Rakesh
druid
google-cloud-platform
google-kubernetes-engine
kubernetes
stateful

0 Answers