Start Kubernetes pod memory depending on size of data job

6/28/2018

Is there a way to dynamically scale the memory size of a Pod based on the size of the data job (my use case)?

Currently we have Jobs and Pods that are defined with fixed memory amounts, but we don't know how big the data will be for a given time-slice (sometimes 1,000 rows, sometimes 100,000 rows).
So the job will break if the data is bigger than the memory we have allocated beforehand.

I have thought of slicing by data volume instead, i.e. cutting every 10,000 rows, since we would know the memory requirement for processing a fixed number of rows. But we are trying to aggregate by time, hence the need for time-slices.

Or are there any other solutions, like Spark on Kubernetes?

Another way of looking at it:
How can we implement the equivalent of Cloud Dataflow in Kubernetes on AWS?

-- cryanbhu
apache-beam
apache-spark
apache-spark-sql
google-cloud-dataflow
kubernetes

3 Answers

7/11/2018

If you don’t know the memory requirement for your pod a priori for a given time-slice, then it is difficult for the Kubernetes Cluster Autoscaler to automatically scale the node pool for you, as per this documentation [1]. Therefore both of your suggestions, running either Cloud Dataflow or Spark on Kubernetes with the Kubernetes Cluster Autoscaler, may not work for your case.

However, you can use custom scaling as a workaround. For example, you can export memory-related metrics of the pod to Stackdriver, then deploy a HorizontalPodAutoscaler (HPA) resource to scale your application, as described in [2]; a minimal sketch follows the references below.

[1] https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler#how_cluster_autoscaler_works

[2] https://cloud.google.com/kubernetes-engine/docs/tutorials/custom-metrics-autoscaling
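
For illustration, here is a minimal sketch of what such an HPA could look like once a memory metric has been exported to Stackdriver. The Deployment name, metric name, target value, and replica bounds are all placeholders, not taken from the tutorial above:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: data-processor-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: data-processor               # placeholder Deployment running the workers
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: memory_in_use_bytes  # placeholder custom metric exported to Stackdriver
      targetAverageValue: 2Gi          # placeholder per-pod target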

-- Kevin Chien
Source: StackOverflow

7/6/2018

It's a best practice to always define resources in your container definition, in particular:

  • limits: the upper bound of CPU and memory the container can use
  • requests: the minimum amount of CPU and memory reserved for the container

This allows the scheduler to make better decisions, and it eases the assignment of a Quality of Service (QoS) class for each pod (https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/), which falls into three possible classes:

  • Guaranteed (highest priority): when requests = limits
  • Burstable: when requests < limits
  • BestEffort (lowest priority): when requests and limits are not set.

The QoS class provides the criterion for deciding which pods to kill when the system is overcommitted.
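
As a minimal sketch (name, image, and values are placeholders), a container definition with requests equal to limits would be classified as Guaranteed:

apiVersion: v1
kind: Pod
metadata:
  name: data-job-pod                                  # placeholder name
spec:
  containers:
  - name: worker
    image: registry.example.com/data-worker:latest    # placeholder image
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"                                  # requests == limits -> Guaranteed QoS
        cpu: "500m"

Lowering the requests below the limits would instead place the pod in the Burstable class.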

-- Nicola Ben
Source: StackOverflow

7/2/2018

I have found a partial solution to this.
Note there are two parts to this problem:
1. Make the Pod request the correct amount of memory depending on the size of the data job.
2. Ensure that this Pod can find a Node to run on.

The Kubernetes Cluster Autoscaler (CA) can solve part 2.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler

According to the readme:
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when there are pods that failed to run in the cluster due to insufficient resources.

Thus, if a data job needs more memory than is available in the currently running nodes, the autoscaler will start a new node by increasing the size of a node group.
Details:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
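
On AWS this boils down to running the cluster-autoscaler Deployment and telling it which Auto Scaling Groups it may resize. The excerpt below is only a sketch of the container args, with placeholder ASG name and bounds; see the linked readme/FAQ for the full manifest:

# Excerpt of the cluster-autoscaler container args (ASG name and bounds are placeholders)
command:
- ./cluster-autoscaler
- --v=4
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --nodes=1:10:my-data-workers-asg     # min:max:ASG-name of the node group to scale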

I am still unsure how to do point 1.

An alternative to point 1 is to start the container without a specific memory request or limit: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#if-you-don-t-specify-a-memory-limit

If you don’t specify a memory limit for a Container, then one of these situations applies:

  • The Container has no upper bound on the amount of memory it uses.
  • The Container could use all of the memory available on the Node where it is running.
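
For illustration, such a container spec would simply omit the resources block (name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: unbounded-data-job                             # placeholder name
spec:
  containers:
  - name: worker
    image: registry.example.com/data-worker:latest     # placeholder image
    # no resources block: the container may use all memory available on its Node

This avoids having to pick a limit up front, with the trade-offs quoted above.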
-- cryanbhu
Source: StackOverflow