I've got a CPU-intensive serverless function deployed on a Kubernetes cluster (running on OpenFaaS). What I'd like to achieve is that whenever my function is invoked, a new node spins up in my cluster to execute the process, as multiple processes might be executed at a given time. I've so far created an HPA, set to 70% of CPU utilisation:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: 4d-as
namespace: openfaas-fn
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: process-layer
minReplicas: 1
maxReplicas: 5
metrics:
- type: Resource
resource:
name: cpu
targetAverageUtilization: 70
I've also added a podAntiAffinity
rule to the deployment make sure only a single replica is deployed on a given node:
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- process-layer
topologyKey: kubernetes.io/hostname
The deployment also has resource
configuration settings:
requests:
cpu: 400m
memory: 1000Mi
limits:
cpu: 1000m
memory: 2000Mi
So far I was able to get new nodes spun up, with a replica of my function deployed on each. The desired outcome is that each node executes the serverless function isolated from each other. For example:
Function invoked with parameter A -> Node 1 executes the function with parameter A
Function invoked with parameter B -> Node 2 executes the function with parameter B
Function invoked with parameter C -> Node 3 executes the function with parameter C
...
Instead what I'm seeing is that Node 1
executes the function with parameter A
, it then starts to execute the function with parameter B
. After CPU utilisation goes above 70%, Node 2
spins up and starts to execute my function multiple times, using parameters A
and B
.
What I'd like instead is that Node 1
executes my function using parameter A
only, Node 2
executes my function using parameter B
only, and so on.
Is there a way to achieve the above?
I think you need to look at Kubernetes job and specify podAntiAffinity. If you are using HPA, you will need to keep calibrating resource parameters to achieve what you are doing.