How to improve Google Load Balancer performance with GKE over HTTPS?

10/31/2019

I am moving Heroku to Google Cloud Platform and currently I'm running 3 n1-highcpu-4 nodes GKE cluster to test the setup and my application. The cluster was built using default configuration (except for preemptible nodes On). To create Ingres, LB (with NEG balancing) and use Google managed SSL certificate I closely followed these guides:

https://cloud.google.com/kubernetes-engine/docs/concepts/ingress https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer https://cloud.google.com/kubernetes-engine/docs/how-to/container-native-load-balancing#create_service

Everything worked flawlessly out-of-box until I started load testing.

Our web application is collecting data from sensors, sent to it over HTTP(S) POST. I used loader.io for testing and this cluster was processing roughly 30k requests per second with under 200ms response time over HTTP. However, when I switched to HTTPS and repeated testing, performance dropped dramatically to less than 4k RPS with 1.7s-4s response time.

I tried to run more pods, add new nodes; send more requests to LB from loader.io; configure keep-alive / idle timeouts; a few kernel tweaks, but non of those helped to step over 4k RPS barriers. Even replacing the app with nginx container and testing its static welcome page had the same results.

Stackdriver GCLB metrics show frontend latency around 10ms and 8s backend latency. However, metrics from my application show that requests take normally between 20ms and 700ms.

ingress.yml:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: collector-non-prod
  annotations:
    kubernetes.io/ingress.global-static-ip-name: non-prod-cluster-ip
#    kubernetes.io/ingress.allow-http: "true"
    networking.gke.io/managed-certificates: "collector-dev,collector-staging"
spec:
  rules:
    - host: events-dev.xxx.dev
      http:
        paths:
          - backend:
              serviceName: collector-dev
              servicePort: 80
    - host: events-staging.xxx.dev
      http:
        paths:
          - backend:
              serviceName: collector-staging
              servicePort: 80

service.yml

apiVersion: v1
kind: Service
metadata:
  name: collector-dev
  labels:
    environment: dev
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    beta.cloud.google.com/backend-config: '{"ports": {"80":"backend-config-non-prod"}}'
spec:
  type: NodePort
  selector:
    app: collector
    environment: dev
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

backend-config.yml is simply this:

apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: backend-config-non-prod
spec:
  timeoutSec: 60
  connectionDraining:
    drainingTimeoutSec: 60

I would appreciate much any recommendations on improving HTTPS performance and decreasing requests latency, or helping me to investigate further. I am stuck at the moment :( Thanks!

-- romanko
google-cloud-load-balancer
google-cloud-platform
google-kubernetes-engine

0 Answers