How do I specify standby/idle nodes for a k8s cluster?


I run a k8s cluster with autoscaling on AWS and use it to run Spark (master + workers). Part of the setup is the following node group for the worker nodes:

  - name: mng0
    instanceType: m5.large
    desiredCapacity: 0
    privateNetworking: false # if only 'Private' subnets are given, this must be enabled
    minSize: 4
    maxSize: 50
    securityGroups:
      attachIDs: xxxxx
    iam:
      withAddonPolicies:
        autoScaler: true
        cloudwatch: true

With this setup, I always have at least 4 nodes available for a 'warm start' when a Spark job comes in, which avoids the roughly 2-minute wait for a node to start. However, if a second Spark job requests nodes while more than 4 nodes are already up, that job again has to wait for additional nodes to start. I want a situation where a new Spark job is always picked up right away, without the 'starting a new node' overhead. This is especially relevant for me because the dataset size varies a lot (from MBs to TBs) and the cluster is used for exploratory analysis as well as ETL, where for exploratory analysis on small datasets I want a

Question: Can I specify a number of idle/waiting/standby nodes that are always ready to accept new Spark jobs?

Is this the right approach for achieving what I want, or is there a better approach?
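
To make the goal concrete, here is a minimal sketch of the scaling rule I have in mind. It is not an existing autoscaler API: the function name `target_node_count` and the `standby` parameter are hypothetical, and `min_size` / `max_size` simply mirror `minSize` and `maxSize` from the node group above.

```python
def target_node_count(busy: int, standby: int,
                      min_size: int = 4, max_size: int = 50) -> int:
    """Node count that keeps `standby` idle nodes on top of the busy ones.

    Hypothetical illustration only: `busy` is the number of nodes currently
    running Spark workloads, `standby` is the desired idle headroom.
    """
    if busy < 0 or standby < 0:
        raise ValueError("busy and standby must be non-negative")
    wanted = busy + standby
    # Clamp to the node group's configured bounds.
    return max(min_size, min(wanted, max_size))


if __name__ == "__main__":
    for busy in (0, 4, 10, 48):
        print(busy, target_node_count(busy, standby=4))
```

With `standby=4`, a cluster that has 4 busy nodes would target 8 nodes instead of staying at 4, which is the difference from the plain `minSize: 4` floor I have now.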

-- marqram

0 Answers