I have successfully followed the documentation here and here to deploy an API spec and GKE backend to Cloud Endpoints.
This has left me with a deployment.yaml that looks like this:
apiVersion: v1
kind: Service
metadata:
  name: esp-myproject
spec:
  ports:
  - port: 80
    targetPort: 8081
    protocol: TCP
    name: http
  selector:
    app: esp-myproject
  type: LoadBalancer
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: esp-myproject
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: esp-myproject
    spec:
      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        args: [
          "--http_port=8081",
          "--backend=127.0.0.1:8080",
          "--service=myproject1-0-0.endpoints.myproject.cloud.goog",
          "--rollout_strategy=managed",
        ]
        ports:
        - containerPort: 8081
      - name: myproject
        image: gcr.io/myproject/my-image:v0.0.1
        ports:
        - containerPort: 8080
This creates a single replica of the app on the backend. So far, so good...
I now want to update the yaml file to declaratively specify auto-scaling parameters to enable multiple replicas of the app to run alongside each other when traffic to the endpoint justifies more than one.
I have read around (O'Reilly book: Kubernetes Up & Running, GCP docs, K8s docs), but there are two things on which I'm stumped:
HorizontalPodAutoscaler example (from the docs):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Thanks in advance to anyone who can shed some light on this for me.
- I've read a number of times about the HorizontalPodAutoscaler and it's not clear to me whether the deployment must make use of this in order to enjoy the benefits of autoscaling?
It doesn't have to, but the HPA is the recommended, built-in way to autoscale a Deployment. You could build your own automation that scales the replica count up and down, but there is little reason to when the HPA already does this for you.
- If so, I have seen examples in the docs of how to define the spec for the HorizontalPodAutoscaler in yaml as shown below - but how would I combine this with my existing deployment.yaml?
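It should be straightforward. The HPA is a separate object that references your Deployment by name in scaleTargetRef. You can append it to your existing deployment.yaml after a --- separator, or keep it in its own file and apply both with kubectl apply -f: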
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-esp-project-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: esp-myproject  # <== here: must match the Deployment's metadata.name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
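One caveat worth sketching: a Utilization target is calculated as a percentage of the CPU each pod requests, so the HPA can only compute it if the containers in your Deployment declare resources.requests.cpu. Your current deployment.yaml does not set any requests. The fragment below shows roughly where they would go; the 100m and 250m values are placeholder assumptions, not recommendations, and all other container fields (args, ports) stay as they are in your file.

```yaml
# Fragment of the Deployment's pod template (spec.template.spec).
# Only the resources blocks are new; the request values are
# illustrative assumptions and should be tuned for your workload.
containers:
- name: esp
  image: gcr.io/endpoints-release/endpoints-runtime:1
  resources:
    requests:
      cpu: 100m   # assumed value
- name: myproject
  image: gcr.io/myproject/my-image:v0.0.1
  resources:
    requests:
      cpu: 250m   # assumed value
```

After applying both objects with kubectl apply -f, kubectl get hpa should show the current and target utilization once metrics are available, which is a quick way to confirm that the HPA has picked up the Deployment.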