Authenticating to Google Cloud Firestore from GKE with Workload Identity

6/29/2020

I'm trying to write a simple backend that accesses my Google Cloud Firestore; the backend runs in Google Kubernetes Engine (GKE). Locally, I'm using the following code to authenticate to Firestore, as detailed in the Google documentation.

if (process.env.NODE_ENV !== 'production') {
  const result = require('dotenv').config()
  if (result.error) {
    throw result.error
  }
}

This loads the GOOGLE_APPLICATION_CREDENTIALS environment variable from my .env file and points it at my google-application-credentials.json, which I generated by creating a service account with the "Cloud Datastore User" role.
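
For reference, my .env just points that variable at the key file (the path here is only an example):

GOOGLE_APPLICATION_CREDENTIALS=/path/to/google-application-credentials.json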

So, locally, my code runs fine. I can reach my Firestore and do everything I need to. However, the problem arises once I deploy to GKE.

I followed this Google documentation to set up Workload Identity for my cluster. I created a deployment and verified that the pods are all using the correct IAM service account by running:

kubectl exec -it POD_NAME -c CONTAINER_NAME -n NAMESPACE -- sh
> gcloud auth list
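
Since the client libraries ultimately get their token from the pod's metadata server, the identity the pod resolves to can also be checked directly (this is the standard metadata endpoint, not a step from the article):

curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email

When Workload Identity is working, this prints the bound Google service account instead of the Compute Engine default account.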

I was under the impression from the documentation that authentication would be handled for my service as long as the above held true. I'm really not sure why, but my Firestore() instance behaves as if it does not have the credentials it needs to access Firestore.

In case it helps, below is my declaration and use of the instance:

// Imports added for completeness (assuming the standalone apollo-server package);
// schema and UserDatasource are defined elsewhere in the project.
const { Firestore } = require('@google-cloud/firestore')
const { ApolloServer } = require('apollo-server')

const firestore = new Firestore()

const server = new ApolloServer({
  schema: schema,
  dataSources: () => {
    return {
      userDatasource: new UserDatasource(firestore)
    }
  }
})

UPDATE:

In a bout of desperation I decided to tear everything down and rebuild it. Following every step again one by one, I appear to have either encountered a bug or (more likely) done something mildly wrong the first time. I'm now able to connect to my backend service. However, I'm now getting a different error. Upon sending any request (I'm using GraphQL, but in essence it's any HTTP call) I get back a 404.

Inspecting the logs yields the following:

'Getting metadata from plugin failed with error: Could not refresh access token: A Not Found error was returned while attempting to retrieve an access token for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have any permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 404'

A cursory search for this issue doesn't seem to return anything related to what I'm trying to accomplish, and so I'm back to square one.

-- James Williams
google-cloud-firestore
google-iam
google-kubernetes-engine
kubernetes
node.js

2 Answers

7/10/2020

I think your initial assumption was correct! Workload Identity is not functioning properly if you still have to specify scopes. In the Workload Identity article you linked, scopes are not used.

I've been struggling with the same issue and have identified three ways to get authenticated credentials in the pod.


1. Workload Identity (basically the Workload Identity article above with some deployment details added)

This method is preferred because it allows each pod deployment in a cluster to be granted only the permissions it needs.

Create cluster (note: no scopes or service account defined)

gcloud beta container clusters create {cluster-name} \
  --release-channel regular \
  --identity-namespace {projectID}.svc.id.goog

Then create the k8sServiceAccount, assign roles, and annotate.

gcloud container clusters get-credentials {cluster-name}

kubectl create serviceaccount --namespace default {k8sServiceAccount}

gcloud iam service-accounts add-iam-policy-binding \
  --member serviceAccount:{projectID}.svc.id.goog[default/{k8sServiceAccount}] \
  --role roles/iam.workloadIdentityUser \
  {googleServiceAccount}

kubectl annotate serviceaccount \
  --namespace default \
  {k8sServiceAccount} \
  iam.gke.io/gcp-service-account={googleServiceAccount}
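
To double-check that the annotation landed on the Kubernetes service account:

kubectl describe serviceaccount {k8sServiceAccount} --namespace default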

Then I create my deployment and set the k8sServiceAccount. (Setting the service account was the part that I was missing.)

kubectl create deployment {deployment-name} --image={containerImageURL}
kubectl set serviceaccount deployment {deployment-name} {k8sServiceAccount}
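
To confirm the deployment picked it up:

kubectl get deployment {deployment-name} \
  -o jsonpath='{.spec.template.spec.serviceAccountName}'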

Then expose the deployment with a target port of 8080:

kubectl expose deployment {deployment-name} \
  --name={service-name} \
  --type=LoadBalancer \
  --port 80 \
  --target-port 8080
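
The external IP for the service can take a minute to provision; it can be watched with:

kubectl get service {service-name} --watch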

The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).


2. Cluster Service Account

This method is not preferred, because all VMs and pods in the cluster will have permissions based on the defined service account.

Create cluster with assigned service account

gcloud beta container clusters create {cluster-name} \
  --release-channel regular \
  --service-account {googleServiceAccount}
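
To verify which service account the nodes were created with (field name assumed from the standard cluster describe output):

gcloud container clusters describe {cluster-name} \
  --format='value(nodeConfig.serviceAccount)'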

The googleServiceAccount needs to have the appropriate IAM roles assigned (see below).

Then deploy and expose as above, but without setting the k8sServiceAccount


3. Scopes

This method is not preferred, because all VMs and pods in the cluster will have permissions based on the scopes defined.

Create cluster with assigned scopes (Firestore only requires "cloud-platform"; Realtime Database also requires "userinfo.email")

gcloud beta container clusters create {cluster-name} \
  --release-channel regular \
  --scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email
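
The node scopes can be checked the same way (again assuming the standard describe output):

gcloud container clusters describe {cluster-name} \
  --format='value(nodeConfig.oauthScopes)'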

Then deploy and expose as above, but without setting the k8sServiceAccount


The first two methods require a Google Service Account with the appropriate IAM roles assigned. Here are the roles I assigned to get a few Firebase products working, with an example grant command after the list:

  • Firestore: Cloud Datastore User (Datastore)
  • Realtime Database: Firebase Realtime Database Admin (Firebase Products)
  • Storage: Storage Object Admin (Cloud Storage)
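
As an example, here is how the first of those can be granted (assuming the role ID roles/datastore.user for Cloud Datastore User):

gcloud projects add-iam-policy-binding {projectID} \
  --member serviceAccount:{googleServiceAccount} \
  --role roles/datastore.user
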
-- CrispyDyne
Source: StackOverflow

6/30/2020

Going to close this question.

Just in case anyone stumbles onto it here's what fixed it for me.

1.) I re-followed the steps in the Google documentation linked above; this fixed the issue of my pods not launching.

2.) As for my update, I re-created my cluster and gave it the Cloud Datastore permission. I had assumed that the permissions were separate from what Workload Identity needed to function. I was wrong.

I hope this helps someone.

-- James Williams
Source: StackOverflow