Pod load distribution in Kubernetes

4/22/2020

I have a service in Kubernetes that receives HTTP requests to create users.

With only 1 pod it correctly handles up to 100 requests per minute; beyond that, latency appears. My assumption is: if 1 pod handles 100 requests per minute, shouldn't 5 pods handle 500 requests per minute?

Because even with 10 pods, when the rate exceeds 100 requests per minute the load is not distributed correctly and latency appears in the service.

As I understand it, the default load-balancing behavior is round robin; the problem is that I can see RAM increasing in only one of the pods, which suggests the load is not being distributed correctly.
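(For context: kube-proxy picks a backend pod per TCP *connection*, not per HTTP request, so clients that reuse keep-alive connections can pin all their traffic to one pod. The following is a hypothetical toy simulation of that effect, not Kubernetes code:)

```python
from collections import Counter
from itertools import cycle

def distribute(requests_total, pods, new_connection_per_request):
    """Toy model: the proxy picks a pod once per TCP connection."""
    picker = cycle(range(pods))
    counts = Counter()
    conn = next(picker)  # pod chosen for the first connection
    for _ in range(requests_total):
        if new_connection_per_request:
            conn = next(picker)  # fresh connection -> a new pod may be chosen
        counts[conn] += 1
    return counts

# One long-lived keep-alive connection: a single pod receives everything.
print(distribute(100, 5, new_connection_per_request=False))

# A new connection per request: the load spreads evenly across pods.
print(distribute(100, 5, new_connection_per_request=True))
```

If the client (or an upstream proxy) holds a small pool of persistent connections, this model matches the observed symptom: one pod's RAM grows while the others sit idle.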

These are my Deployment/Service YAML and my HPA YAML.

Deploy Yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: create-user-service
  labels:
    app: create-user-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: create-user-service
  template:
    metadata:
      labels:
        app: create-user-service
    spec:
      volumes:
        - name: key
          secret:
            secretName: my-secret-key
      containers:
        - name: create-user-service
          image: docker/user-create/create-user-service:0.0.1
          volumeMounts:
            - name: key
              mountPath: /var/secrets/key
          ports:
            - containerPort: 8080
          env:
            - name: PORT
              value: "8080"
          resources:
            limits:
              cpu: "2.5"
              memory: 6Gi
            requests:
              cpu: "1.5"
              memory: 5Gi
          livenessProbe:    ## is healthy
            failureThreshold: 3
            httpGet:
              path: /healthcheck/livenessprobe
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: create-user-service
spec:
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: http
  selector:
    app: create-user-service
  type: NodePort
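(Note that the Deployment defines only a livenessProbe. Without a readinessProbe, a freshly scaled-up pod is added to the Service endpoints as soon as its container starts, and can receive traffic before it is warm. A sketch of what could be added, assuming a hypothetical readiness endpoint:)

```yaml
          readinessProbe:   # hypothetical: keeps a new pod out of rotation until ready
            httpGet:
              path: /healthcheck/readinessprobe   # assumed endpoint, adjust to the app
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
```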

HPA Yaml

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: create-user-service
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: create-user-service
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 75
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 75
  - external:
      metricName: serviceruntime.googleapis.com|api|request_count
      metricSelector:
        matchLabels:
          resource.type: api
          resource.labels.service: create-user-service.endpoints.user-create.cloud.goog
      targetAverageValue: "3"
    type: External
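(For reference, the HPA computes a desired replica count per metric and scales to the *largest* of them. A sketch of the documented formula, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)` — illustrative numbers, not values from this cluster:)

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    # HPA scaling formula from the Kubernetes documentation:
    # desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
    return math.ceil(current_replicas * current_metric / target_metric)

# One pod running at 150% of the 75% CPU target -> scale to 2 replicas
print(desired_replicas(1, 150, 75))   # 2

# The HPA takes the max across all listed metrics: with CPU idle at 30%
# but memory at 90% of requests, memory drives the replica count.
print(max(desired_replicas(3, 30, 75), desired_replicas(3, 90, 75)))   # 4
```

Since memory utilization is measured against `requests` (5Gi here), a service that idles close to its request will keep the memory metric high regardless of traffic, which can distort scaling decisions.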

What might be happening?

Thank you all.

-- Elias Vargas
google-cloud-platform
google-kubernetes-engine
kubernetes
kubernetes-pod
scalability

0 Answers