Autoscaling a Google Cloud Endpoints backend deployment declaratively (in the YAML)?

12/4/2018

I have successfully followed the documentation here and here to deploy an API spec and GKE backend to Cloud Endpoints.

This has left me with a deployment.yaml that looks like this:

apiVersion: v1
kind: Service
metadata:
  name: esp-myproject
spec:
  ports:
  - port: 80
    targetPort: 8081
    protocol: TCP
    name: http
  selector:
    app: esp-myproject
  type: LoadBalancer
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: esp-myproject
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: esp-myproject
    spec:
      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        args: [
          "--http_port=8081",
          "--backend=127.0.0.1:8080",
          "--service=myproject1-0-0.endpoints.myproject.cloud.goog",
          "--rollout_strategy=managed",
        ]
        ports:
        - containerPort: 8081
      - name: myproject
        image: gcr.io/myproject/my-image:v0.0.1
        ports:
        - containerPort: 8080

This creates a single replica of the app on the backend. So far, so good...

I now want to update the YAML file to declaratively specify autoscaling parameters, so that multiple replicas of the app can run alongside each other when traffic to the endpoint justifies more than one.

I have read around (O'Reilly book: Kubernetes Up & Running, GCP docs, K8s docs), but there are two things on which I'm stumped:

  1. I've read a number of times about the HorizontalPodAutoscaler and it's not clear to me whether the deployment must make use of this in order to enjoy the benefits of autoscaling?
  2. If so, I have seen examples in the docs of how to define the spec for the HorizontalPodAutoscaler in yaml as shown below - but how would I combine this with my existing deployment.yaml?

HorizontalPodAutoscaler example (from the docs):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Thanks in advance to anyone who can shed some light on this for me.

-- UpstatePedro
autoscaling
google-cloud-endpoints
google-cloud-platform
kubernetes

1 Answer

12/4/2018
  1. I've read a number of times about the HorizontalPodAutoscaler and it's not clear to me whether the deployment must make use of this in order to enjoy the benefits of autoscaling?

It doesn't have to, but it's recommended and already built in. You could build your own automation that scales the deployment up and down, but there's little reason to when this is exactly what the HorizontalPodAutoscaler (HPA) already provides.
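As a side note, the same built-in autoscaler can also be attached imperatively rather than declaratively; a minimal example, assuming the deployment is named esp-myproject as in the question and a 50% CPU target:

kubectl autoscale deployment esp-myproject --cpu-percent=50 --min=1 --max=10

This creates the same kind of HorizontalPodAutoscaler object as the YAML below, so the declarative approach asked about in the question is usually preferable for keeping everything in version control.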

  2. If so, I have seen examples in the docs of how to define the spec for the HorizontalPodAutoscaler in yaml as shown below - but how would I combine this with my existing deployment.yaml?

It should be straightforward. You basically reference your deployment in the HPA definition:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-esp-project-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: esp-myproject   # <== here
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
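
One thing worth adding: with an HPA that targets CPU utilization, the target is expressed as a percentage of the CPU the pod's containers request, so the containers in the esp-myproject deployment need resources.requests set before the autoscaler can compute utilization. A minimal sketch of how that could look in the container specs, with illustrative (not recommended) request values:

      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        resources:
          requests:
            cpu: 100m   # illustrative value; utilization is measured against this request
      - name: myproject
        image: gcr.io/myproject/my-image:v0.0.1
        resources:
          requests:
            cpu: 200m   # illustrative value

After applying both files (e.g. kubectl apply -f deployment.yaml -f hpa.yaml, if the HPA is saved as hpa.yaml), kubectl get hpa shows the current vs. target utilization and the replica count the autoscaler has chosen.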
-- Rico
Source: StackOverflow