Container Optimized OS performance

9/18/2017

After switching my cluster's node image from CONTAINER_VM to CONTAINER_OPTIMIZED_OS, my PHP application became up to 10 times slower. Did I miss something in my configuration, or is this a common issue? I tried machines with more CPU and memory, but that improved performance only slightly.

Terraform configuration:

resource "google_compute_address" "dev-cluster-address" {
  name   = "dev-cluster-address"
  region = "europe-west1"
}

resource "google_container_cluster" "dev-cluster" {
  name               = "dev-cluster"
  zone               = "europe-west1-d"
  initial_node_count = 2
  node_version       = "1.7.5"

  master_auth {
    username = "*********-dev"
    password = "*********"
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/servicecontrol",
      "https://www.googleapis.com/auth/service.management.readonly",
      "https://www.googleapis.com/auth/devstorage.full_control",
      "https://www.googleapis.com/auth/sqlservice.admin"
    ]
    machine_type = "n1-standard-1"
    disk_size_gb = 20
    image_type   = "COS"
  }
}

Kubernetes deployment for Symfony Application:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
    name: deployment-dev
spec:
    replicas: 2
    strategy:
        type: RollingUpdate
        rollingUpdate:
            maxSurge: 1
            maxUnavailable: 1
    template:
        metadata:
            labels:
                app: dev
        spec:
            containers:
                - name: nginx
                  image: nginx:1.13.5-alpine
                  volumeMounts:
                      - name: application
                        mountPath: /var/www/web
                      - name: nginx-config
                        mountPath: /etc/nginx/conf.d
                  ports:
                      - containerPort: 80
                  resources:
                      limits:
                          cpu: "20m"
                          memory: "64M"
                      requests:
                          cpu: "5m"
                          memory: "16M"
                - name: php
                  image: ********
                  lifecycle:
                      postStart:
                          exec:
                              command:
                                  - "bash"
                                  - "/var/www/provision/files/init_php.sh"
                  envFrom:
                      - configMapRef:
                          name: symfony-config-dev
                  volumeMounts:
                      - name: application
                        mountPath: /application
                      - name: logs
                        mountPath: /var/www/var/logs
                      - name: lexik-jwt-keys
                        mountPath: /var/www/var/jwt
                  ports:
                      - containerPort: 9000
                  resources:
                      limits:
                          cpu: "400m"
                          memory: "1536M"
                      requests:
                          cpu: "300m"
                          memory: "1024M"
                - name: cloudsql-proxy-mysql
                  image: gcr.io/cloudsql-docker/gce-proxy:1.09
                  resources:
                      limits:
                          cpu: "10m"
                          memory: "64M"
                      requests:
                          cpu: "5m"
                          memory: "16M"
                  command:
                      - "/cloud_sql_proxy"
                      - "-instances=***:europe-west1:dev1=tcp:0.0.0.0:3306"
                - name: cloudsql-proxy-analytics
                  image: gcr.io/cloudsql-docker/gce-proxy:1.09
                  resources:
                      limits:
                          cpu: "20m"
                          memory: "64M"
                      requests:
                          cpu: "10m"
                          memory: "16M"
                  command:
                      - "/cloud_sql_proxy"
                      - "-instances=***:europe-west1:analytics-dev1=tcp:0.0.0.0:3307"
                - name: sidecar-logging
                  image: alpine:3.6
                  args: [/bin/sh, -c, 'tail -n+1 -f /var/www/var/logs/prod.log']
                  volumeMounts:
                      - name: logs
                        mountPath: /var/www/var/logs
                  resources:
                      limits:
                          cpu: "5m"
                          memory: "20M"
                      requests:
                          cpu: "5m"
                          memory: "20M"
            volumes:
                - name: application
                  emptyDir: {}
                - name: logs
                  emptyDir: {}
                - name: nginx-config
                  configMap:
                      name: config-dev
                      items:
                          - key: nginx
                            path: default.conf
                - name: lexik-jwt-keys
                  configMap:
                      name: config-dev
                      items:
                          - key: lexik_jwt_private_key
                            path: private.pem
                          - key: lexik_jwt_public_key
                            path: public.pem
-- kindratmakc
google-container-os
google-kubernetes-engine
kubernetes

1 Answer

9/19/2017

One possible reason is that Kubernetes started enforcing CPU limits with Container-Optimized OS.

resources:
    limits:
        cpu: "20m"

These were not enforced on the older ContainerVM images.

Could you please try removing or relaxing the CPU limits in your pod spec and see if it helps?
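
To get a feel for how tight these limits are once they are enforced, here is a rough sketch of the arithmetic. It assumes the limits are applied through the Linux CFS bandwidth controller with its default period of 100 ms, which is an assumption about the node setup and not something confirmed in the question. The helper name cfs_quota_us is made up for illustration, and the sketch ignores any minimum quota the kubelet may apply.

```python
# Assumption: CPU limits are enforced via CFS quota with the default 100 ms period.
CFS_PERIOD_US = 100_000


def cfs_quota_us(cpu_limit_millicores, period_us=CFS_PERIOD_US):
    """Return the CPU time (in microseconds) a container may use per CFS period."""
    return cpu_limit_millicores * period_us // 1000


# CPU limits taken from the deployment in the question, in millicores.
limits = {
    "nginx": 20,
    "php": 400,
    "cloudsql-proxy-mysql": 10,
    "cloudsql-proxy-analytics": 20,
    "sidecar-logging": 5,
}

for name, millicores in limits.items():
    quota = cfs_quota_us(millicores)
    print(f"{name}: {millicores}m -> {quota} us of CPU per {CFS_PERIOD_US} us period")
```

Under these assumptions, a container that uses up its quota early in a period is throttled until the next period starts. Every request passes through nginx and a Cloud SQL proxy, so their very small limits could add latency even when the php container itself has headroom.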

-- Aditya K.
Source: StackOverflow