ImagePullBackOff on GKE with Private Google Cloud Repository

8/10/2019

I am creating a deployment in GKE with the following (standard) deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: api
  template:
    metadata:
      labels:
        component: api
    spec:
      containers:
      - name: api
        image: eu.gcr.io/xxxx-xxx/api:latest
        imagePullPolicy: Always
        resources:
          requests:
            memory: "320Mi"
            cpu: "100m"
          limits:
            memory: "450Mi"
            cpu: "150m"
        ports:
        - containerPort: 5010

However, for some reason GKE complains about a permission issue. The containers are in the Container Registry of the same project and are private, but as far as I am aware, GKE should have access to them when they are within the same GCP project. The GKE cluster is VPC-native (if that might make a difference), as that is the only difference I can think of compared to a cluster I used to run with the same containers and installers.

Events:
  Type     Reason     Age                    From                                                     Message
  ----     ------     ----                   ----                                                     -------
  Normal   Scheduled  34m                    default-scheduler                                        Successfully assigned default/api-deployment-f68977b84-fmhdx to gke-gke-dev-cluster-default-pool-6c6bb127-nw61
  Normal   Pulling    32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  pulling image "eu.gcr.io/xxxx-xxx/api:latest"
  Warning  Failed     32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Failed to pull image "eu.gcr.io/xxxx-xxx/api:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for eu.gcr.io/xxxx-xxx/api, repository does not exist or may require 'docker login'
  Warning  Failed     32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Error: ErrImagePull
  Normal   BackOff    32m (x6 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Back-off pulling image "eu.gcr.io/xxxx-xxx/api:latest"
  Warning  Failed     3m59s (x131 over 33m)  kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Error: ImagePullBackOff

Do I need to add imagePullSecrets as well for GKE clusters pulling from Google Container Registry, or might there be another problem?
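
For reference, if an image pull secret did turn out to be necessary (it normally is not for GCR in the cluster's own project), it would be referenced in the pod spec roughly like this. Here gcr-json-key is a hypothetical secret created from a service account JSON key with kubectl create secret docker-registry:

```yaml
spec:
  template:
    spec:
      # Hypothetical secret holding GCR credentials; not normally
      # required when pulling from the cluster's own project.
      imagePullSecrets:
      - name: gcr-json-key
      containers:
      - name: api
        image: eu.gcr.io/xxxx-xxx/api:latest
```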

The GKE cluster was created using Terraform with the following gke.tf for GKE:

resource "google_container_cluster" "primary" {
  name = "gke-${terraform.workspace}-cluster"
  zone = "${var.region}-b"

  additional_zones = [
    "${var.region}-c",
    "${var.region}-d",
  ]

  # minimum kubernetes version for master
  min_master_version = "${var.min_master_version}"
  # version for the nodes. Should equal min_master_version on create
  node_version       = "${var.node_version}"
  initial_node_count = "${var.gke_num_nodes[terraform.workspace]}"
  network            = "${var.vpc_name}"
  subnetwork         = "${var.subnet_name}"

  addons_config {

    http_load_balancing {
      disabled = false  # this is the default
    }

    horizontal_pod_autoscaling {
      disabled = false
    }

    kubernetes_dashboard {
      disabled = false
    }
  }

  # vpc-native network
  ip_allocation_policy {
#    use_ip_aliases = true
  }

  master_auth {
    username = "${var.gke_master_user}"
    password = "${var.gke_master_pass}"
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels = {
      env = "${var.gke_label[terraform.workspace]}"
    }

    disk_size_gb = 10
    machine_type = "${var.gke_node_machine_type}"
    tags         = ["gke-node"]
  }
}

Running gcloud container clusters describe [CLUSTER] gives:

nodePools:
- config:
    diskSizeGb: 10
    diskType: pd-standard
    imageType: COS
    labels:
      env: dev
    machineType: n1-standard-1
    metadata:
      disable-legacy-endpoints: 'true'
    oauthScopes:
    - https://www.googleapis.com/auth/monitoring
    - https://www.googleapis.com/auth/devstorage.read_only
    - https://www.googleapis.com/auth/logging.write
    - https://www.googleapis.com/auth/compute
    serviceAccount: default

So devstorage.read_only seems to be there.

-- Mike
google-cloud-platform
google-kubernetes-engine
terraform-provider-gcp

3 Answers

8/14/2019

In addition to Aleksi's comment, and based on this documentation [1], you can also retrieve the IAM policy for an individual service account with:

gcloud iam service-accounts get-iam-policy [SERVICE_ACCOUNT]
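
Note that get-iam-policy on a service account lists who may act as that account. To see which roles the node service account itself holds on the project, something like the following should work (PROJECT_ID and the service account email are placeholders for your own values):

```shell
# List project-level roles granted to the node service account.
gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --format="value(bindings.role)"
```
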
-- Bruno
Source: StackOverflow

8/12/2019

Are your GKE cluster node pools configured with the https://www.googleapis.com/auth/devstorage.read_only OAuth scope?

To check, you can run gcloud container clusters describe [CLUSTER NAME]; the scopes are listed under the oauthScopes property. Alternatively, check your node pool details in the GCP Console:

GKE node pool OAuth scopes

Storage should be enabled.
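
From the command line, the scopes can also be extracted directly with a --format projection, for example (CLUSTER_NAME and ZONE are placeholders):

```shell
# Print only the OAuth scopes of each node pool.
gcloud container clusters describe CLUSTER_NAME --zone ZONE \
  --format="value(nodePools[].config.oauthScopes)"
```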

-- Aleksi
Source: StackOverflow

8/11/2019

In order to use GCR, the nodes need to run with a service account and OAuth scopes that allow reading from Cloud Storage. There is some guidance on this topic here, for example: https://cloud.google.com/kubernetes-engine/docs/how-to/access-scopes#service_account
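
If the scope is present but pulls still fail, a common fix is to grant the node service account read access to the Cloud Storage bucket that backs Container Registry, e.g. via roles/storage.objectViewer (PROJECT_ID and SA_EMAIL are placeholders):

```shell
# Grant the node service account read access to Cloud Storage,
# which backs Container Registry image layers.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.objectViewer"
```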

-- mensi
Source: StackOverflow