New pods not used after horizontal pod autoscaling in Google Kubernetes Engine

4/17/2020

I have a Deployment with a web service that I am load testing with k6. Horizontal pod autoscaling is enabled with a minimum of 1 and a maximum of 4 replicas. When I start the load test, the autoscaler scales up to 4 replicas, but only the pod that already existed before autoscaling receives traffic, even though I can see that all 4 pods are up and running.

The ClusterIP service in front of the deployment has sessionAffinity set to None. The CPU request is 0.5 and the CPU limit is 1.3. The target average CPU utilization of the HPA is 50%. During the tests, the CPU usage of the one pod serving traffic goes up to or slightly above the limit, while the other pods show no CPU utilization at all.

However, when multiple pods are already available at test start, the load is distributed between them (although not evenly). Figure 1 shows two tests run in a row, plotting the number of requests served by each pod in 5-second buckets. In the first test, the autoscaler scaled up to 4 pods, but only one pod received requests. In the second test, all 4 pods were still present from the scale-up in test 1, and all of them received requests. [figure 1]

I also observed that when traffic is increased during the load test, some of the new pods receive a few requests, as shown in figure 2, where two load peaks were added to the test. [figure 2]

Still, I want the requests to be distributed evenly across all pods as soon as they are created by the autoscaler. What could be wrong?

Here is the configuration:

deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: td-app
  namespace: crypto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: td-app
  template:
    metadata:
      labels:
        app: td-app
    spec:
      containers:
      - env:
        - name: HTPASSWDFILE
        - name: KEYCATALOGUE
          value: /home/app/keys.json
        image: eu.gcr.io/td-cluster/td:2.3.0-2020457-a92b94
        imagePullPolicy: IfNotPresent
        name: td-app
        ports:
        - containerPort: 8080
          name: main
          protocol: TCP
        resources:
          limits:
            cpu: 1.3
            memory: 200Mi
          requests:
            cpu: 0.5
            memory: 50Mi
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /home/app/masterkeys
          name: masterkeys
      volumes:
      - name: masterkeys
        secret:
          defaultMode: 420
          secretName: masterkeys

service:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: td-app
  name: td-app-service
  namespace: crypto
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: td-app
  sessionAffinity: None
  type: ClusterIP

hpa:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: td-app
  namespace: crypto
  labels:
    app: td-app
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: td-app
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50

k6.io is used for the tests. For 6 minutes, up to 12 virtual users iteratively send HTTP POST requests to the service. The test runs as a Job in the same cluster and uses the service name to reach the pods.
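
For reference, such a test can be packaged as a ConfigMap holding the k6 script plus a Job that runs it. The sketch below is a minimal, hypothetical example of that setup, not the actual test used here: the script contents, the ramp profile, the request path and body, the ConfigMap/Job names, and the k6 image tag are assumptions; only the 6-minute duration, the 12 virtual users, and the in-cluster service name are taken from the description above.

k6 test job (sketch):

apiVersion: v1
kind: ConfigMap
metadata:
  name: k6-load-test          # hypothetical name
  namespace: crypto
data:
  test.js: |
    import http from 'k6/http';
    import { sleep } from 'k6';

    // ramp up to 12 virtual users, hold, ramp down (6 minutes total)
    export let options = {
      stages: [
        { duration: '1m', target: 12 },
        { duration: '4m', target: 12 },
        { duration: '1m', target: 0 },
      ],
    };

    export default function () {
      // in-cluster service DNS name; request path and body are placeholders
      http.post('http://td-app-service.crypto.svc.cluster.local:8080/', '{}', {
        headers: { 'Content-Type': 'application/json' },
      });
      sleep(1);
    }
---
apiVersion: batch/v1
kind: Job
metadata:
  name: k6-load-test
  namespace: crypto
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: k6
        image: loadimpact/k6:0.26.2   # image tag is an assumption
        args: ["run", "/scripts/test.js"]
        volumeMounts:
        - mountPath: /scripts
          name: script
      volumes:
      - name: script
        configMap:
          name: k6-load-test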

-- T0bz
google-kubernetes-engine
kubernetes

0 Answers