I am moving Heroku to Google Cloud Platform and currently I'm running 3 n1-highcpu-4 nodes GKE cluster to test the setup and my application. The cluster was built using default configuration (except for preemptible nodes On). To create Ingres, LB (with NEG balancing) and use Google managed SSL certificate I closely followed these guides:
https://cloud.google.com/kubernetes-engine/docs/concepts/ingress https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing#create_service
Everything worked flawlessly out-of-box until I started load testing.
Our web application is collecting data from sensors, sent to it over HTTP(S) POST. I used loader.io for testing and this cluster was processing roughly 30k requests per second with under 200ms response time over HTTP. However, when I switched to HTTPS and repeated testing, performance dropped dramatically to less than 4k RPS with 1.7s-4s response time.
I tried to run more pods, add new nodes; send more requests to LB from loader.io; configure keep-alive / idle timeouts; a few kernel tweaks, but non of those helped to step over 4k RPS barriers. Even replacing the app with nginx container and testing its static welcome page had the same results.
Stackdriver GCLB metrics show frontend latency around 10ms and 8s backend latency. However, metrics from my application show that requests take normally between 20ms and 700ms.
ingress.yml
:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: collector-non-prod
annotations:
kubernetes.io/ingress.global-static-ip-name: non-prod-cluster-ip
# kubernetes.io/ingress.allow-http: "true"
networking.gke.io/managed-certificates: "collector-dev,collector-staging"
spec:
rules:
- host: events-dev.xxx.dev
http:
paths:
- backend:
serviceName: collector-dev
servicePort: 80
- host: events-staging.xxx.dev
http:
paths:
- backend:
serviceName: collector-staging
servicePort: 80
service.yml
apiVersion: v1
kind: Service
metadata:
name: collector-dev
labels:
environment: dev
annotations:
cloud.google.com/neg: '{"ingress": true}'
beta.cloud.google.com/backend-config: '{"ports": {"80":"backend-config-non-prod"}}'
spec:
type: NodePort
selector:
app: collector
environment: dev
ports:
- protocol: TCP
port: 80
targetPort: 8080
backend-config.yml
is simply this:
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
name: backend-config-non-prod
spec:
timeoutSec: 60
connectionDraining:
drainingTimeoutSec: 60
I would appreciate much any recommendations on improving HTTPS performance and decreasing requests latency, or helping me to investigate further. I am stuck at the moment :( Thanks!