How does Openwhisk decide how many runtime containers to create?

4/18/2020

I am working on a project that is using Openwhisk. I have created a Kubernetes cluster on Google cloud with 5 nodes and installed OW on it. My serverless function is written in Java. It does some processing based on arguments I pass to it. The processing can last up to 30 seconds and I invoke the function multiple times during these 30 seconds which means I want to have a greater number of runtime containers(pods) created without having to wait for the previous invocation to finish. Ideally, there should be a container for each invocation until the resources are finished.

Now, what happens is that when I start invoking the function, the first container is created, and then after few seconds, another one to serve the first two invocation. From that point on, I continue invoking the function (no more than 5 simultaneous invocation) but no containers are started. Then, after some time, a third container is created and sometimes, but rarely, a fourth one, but only after long time. What is even weirded is that the containers are all started on a single cluster node or sometimes on two nodes (always the same two nodes). The other nodes are not used. I have set up the cluster carefully. Each node is labeled as invoker. I have tried experimenting with memory assigned to each container, max number of containers, I have increased the max number of invocations I can have per minute but despite all this, I haven't been able to increase the number of containers created. Additionally, I have tried with different machines used for the cluster (different number of cores and memory) but it was in vain.

Since Openwhisk is still relatively a young project, I don't get enough information from the official documentation unfortunately. Can someone explain how does Openwhisk decide when to start a new container? What parameters can I change in values.yaml such that I achieve greater number of containers?

-- Nikola Vukovic
docker
google-kubernetes-engine
kubernetes
openwhisk

2 Answers

4/18/2020

The OpenWhisk scheduler implements several heuristics to map an invocation to a container. This post by Markus Thömmes https://medium.com/openwhisk/squeezing-the-milliseconds-how-to-make-serverless-platforms-blazing-fast-aea0e9951bd0 explains how container reuse and caching work and may be applicable for what you are seeing.

When you inspect the activation record for the invokes in your experiment, check the annotations on the activation record to determine if the request was "warm" or "cold". Warm means container was reused for a new invoke. Cold means a container was freshly allocated to service the invoke.

See this document https://github.com/apache/openwhisk/blob/master/docs/annotations.md#annotations-specific-to-activations which explains the meaning of waitTime and initTime. When the latter is decorating the annotation, the activation was "cold" meaning a fresh container was allocated.

It's possible your activation rate is not fast enough to trigger new container allocations. That is, the scheduler decided to allocate your request to an invoker where the previous invoke finished and could accept the new request. Without more details about the arrival rate or think time, it is not possible to answer your question more precisely.

Lastly, OpenWhisk is a mature serverless function platform. It has been in production since 2016 as IBM Cloud Functions, and now powers multiple public serverless offerings including Adobe I/O Runtime and Naver Lambda service among others.

-- user6062970
Source: StackOverflow

4/20/2020

The reason why very few containers were created is the fact that worker nodes do not have Docker Java runtime image and that it needs be downloaded on each of the nodes the first this environment is requested. This image weights a few hundred MBs and it needs time to be downloaded (a couple of seconds in google cluster). I don't know why Openwhisk controller decided to wait for already created pods to be available instead of downloading the image on other nodes. Anyway, once I downloaded the image manually on each of the nodes, using the same application with the same request rate, a new pod was created for each request that could not be served with an existing pod.

-- Nikola Vukovic
Source: StackOverflow