cannot query kubernetes (unauthorized): endpoints is forbidden: User cannot list endpoints in the namespace

4/16/2018

I am running Kubernetes 1.9.4 on my GKE cluster.

I have two pods, gate and coolapp, both written in Elixir; gate is trying to connect to coolapp.

I am using libcluster to connect my nodes, and I get the following error:

[libcluster:app_name] cannot query kubernetes (unauthorized): endpoints is forbidden: User "system:serviceaccount:staging:default" cannot list endpoints in the namespace "staging": Unknown user "system:serviceaccount:staging:default"

Here is my config in gate under config/prod:

config :libcluster,
  topologies: [
    app_name: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [
        kubernetes_selector: "tier=backend",
        kubernetes_node_basename: System.get_env("MY_POD_NAMESPACE") || "${MY_POD_NAMESPACE}"
      ]
    ]
  ]

Here is my configuration:

vm.args

## Name of the node
-name ${MY_POD_NAMESPACE}@${MY_POD_IP}
## Cookie for distributed erlang
-setcookie ${ERLANG_COOKIE}
# Enable SMP automatically based on availability
-smp auto

Creating the secret and the config map:

kubectl create secret generic erlang-config --namespace staging --from-literal=erlang-cookie=xxxxxx
kubectl create configmap vm-config --namespace staging --from-file=vm.args

gate/deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: gate
  namespace: staging
spec:
  replicas: 1
  revisionHistoryLimit: 1
  strategy:
      type: RollingUpdate
  template:
    metadata:
      labels:
        app: gate
        tier: backend
    spec:
      securityContext:
        runAsUser: 0
        runAsNonRoot: false
      containers:
      - name: gate
        image: gcr.io/development/gate:0.1.7
        args:
          - foreground
        ports:
        - containerPort: 80
        volumeMounts:
        - name: config-volume
          mountPath: /beamconfig
        env:
        - name: MY_POD_NAMESPACE
          value: staging
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: RELEASE_CONFIG_DIR
          value: /beamconfig
        - name: ERLANG_COOKIE
          valueFrom:
            secretKeyRef:
              name: erlang-config
              key: erlang-cookie
      volumes:
      - name: config-volume
        configMap:
          name: vm-config

coolapp/deployment.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: coolapp
  namespace: staging
spec:
  replicas: 1
  revisionHistoryLimit: 1
  strategy:
      type: RollingUpdate
  template:
    metadata:
      labels:
        app: coolapp
        tier: backend
    spec:
      securityContext:
        runAsUser: 0
        runAsNonRoot: false
     # volumes
      volumes:
      - name: config-volume
        configMap:
          name: vm-config
      containers:
      - name: coolapp
        image: gcr.io/development/coolapp:1.0.3
        volumeMounts:
        - name: secrets-volume
          mountPath: /secrets
          readOnly: true
        - name: config-volume
          mountPath: /beamconfig
        ports:
        - containerPort: 80
        args:
          - "foreground"
        env:
        - name: MY_POD_NAMESPACE
          value: staging
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: REPLACE_OS_VARS
          value: "true"
        - name: RELEASE_CONFIG_DIR
          value: /beamconfig
        - name: ERLANG_COOKIE
          valueFrom:
            secretKeyRef:
              name: erlang-config
              key: erlang-cookie
        # proxy_container
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.11
        command: ["/cloud_sql_proxy", "--dir=/cloudsql",
            "-instances=staging:us-central1:com-staging=tcp:5432",
            "-credential_file=/secrets/cloudsql/credentials.json"]
        volumeMounts:
          - name: cloudsql-instance-credentials
            mountPath: /secrets/cloudsql
            readOnly: true
          - name: cloudsql
            mountPath: /cloudsql
-- dina
elixir
erlang
google-kubernetes-engine
kubernetes

2 Answers

4/16/2018

The default service account for the staging namespace (in which your Pods using libcluster are apparently running) lacks the RBAC permission to list endpoints in that namespace.

Likely your application requires a number of other permissions (that are not mentioned in the above error message) to work correctly; identifying all such permissions is out of scope for SO.

One way to resolve this is to grant superuser permissions to that service account. This is not a secure solution, only a stopgap fix.

$ kubectl create clusterrolebinding make-staging-sa-cluster-admin \
    --serviceaccount=staging:default \
    --clusterrole=cluster-admin

clusterrolebinding "make-staging-sa-cluster-admin" created

To grant only the specific permissions (get, list, and watch endpoints in the staging namespace), you would need to create a Role first:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: some-permissions
  namespace: staging
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "watch"]

And create a RoleBinding for the default service account in the staging namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: give-default-sa-some-permissions
  namespace: staging
subjects:
- kind: ServiceAccount
  name: default
  namespace: staging
roleRef:
  kind: Role
  name: some-permissions
  apiGroup: rbac.authorization.k8s.io
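
To check that the binding took effect, one option is to ask the API server while impersonating the service account. This is a minimal sketch, assuming the two manifests above were saved as role.yaml and rolebinding.yaml (hypothetical file names) and that your own user is allowed to impersonate service accounts. The helper names sa_subject and check_endpoints_access are made up for illustration.

```shell
#!/bin/sh
# Build the user name Kubernetes assigns to a service account,
# in the same form that appears in the error message.
sa_subject() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Ask the API server whether the given service account may list
# endpoints in its namespace. kubectl prints "yes" or "no".
check_endpoints_access() {
  namespace="$1"
  account="$2"
  kubectl auth can-i list endpoints \
    --namespace "$namespace" \
    --as "$(sa_subject "$namespace" "$account")"
}

# Usage (file names are assumptions):
#   kubectl apply -f role.yaml -f rolebinding.yaml
#   check_endpoints_access staging default
```

If the answer is still "no" after applying both manifests, the namespace or service account name in the RoleBinding subject is the first thing I would double-check.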
-- Janos Lenart
Source: StackOverflow

4/16/2018

I am not an Erlang/Elixir or libcluster user, but it seems libcluster is using the default service account for the namespace to query the master for a list of endpoints available in the cluster.

The readme for libcluster says as much:

If set to Cluster.Strategy.Kubernetes, it will use the Kubernetes API to query endpoints based on a basename and label selector, using the token and namespace injected into every pod; once it has a list of endpoints, it uses that list to form a cluster, and keep it up to date.

The code in libcluster's kubernetes.ex and the error you are getting confirm this.
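
To see the same failure outside of Elixir, a request of roughly this kind can be issued by hand from inside the pod. This is only a sketch of what such a query looks like, not a copy of what libcluster sends. It assumes the standard service account mount path and the in-cluster API host kubernetes.default.svc, and that curl is available in the image. The function names endpoints_url and list_endpoints are hypothetical.

```shell
#!/bin/sh
# Standard location where Kubernetes mounts service account credentials in a pod.
SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount

# Build the endpoints list URL for a namespace and a label selector.
# Only "=" and "," are percent-encoded, which is enough for simple
# selectors such as tier=backend.
endpoints_url() {
  selector=$(printf '%s' "$2" | sed -e 's/=/%3D/g' -e 's/,/%2C/g')
  printf 'https://kubernetes.default.svc/api/v1/namespaces/%s/endpoints?labelSelector=%s' \
    "$1" "$selector"
}

# Query the API server using the token and namespace injected into the pod.
list_endpoints() {
  token=$(cat "$SA_DIR/token")
  namespace=$(cat "$SA_DIR/namespace")
  curl --silent --cacert "$SA_DIR/ca.crt" \
    --header "Authorization: Bearer $token" \
    "$(endpoints_url "$namespace" "$1")"
}

# Usage, from a shell inside the pod:
#   list_endpoints tier=backend
```

Without the right RBAC rules, I would expect the response to be a Forbidden status much like the one in the question.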

You will need to set up a ClusterRole and RoleBinding for the service account in the staging namespace. This will allow libcluster to dynamically query the master to discover other Erlang nodes in the cluster/namespace.
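
As a rough sketch of that setup using kubectl alone: the function below creates a ClusterRole that can read endpoints and binds it to a service account inside a single namespace. The object names endpoints-reader and libcluster-endpoints-reader, and the function name grant_endpoints_read, are made up for illustration; I have not verified which other permissions libcluster may need.

```shell
#!/bin/sh
# Create a ClusterRole that allows reading endpoints, then bind it to a
# service account with a RoleBinding, so the grant applies only within
# the given namespace. Object names here are hypothetical.
grant_endpoints_read() {
  namespace="$1"
  account="$2"
  kubectl create clusterrole endpoints-reader \
    --verb=get,list,watch \
    --resource=endpoints
  kubectl create rolebinding libcluster-endpoints-reader \
    --namespace "$namespace" \
    --clusterrole=endpoints-reader \
    --serviceaccount="$namespace:$account"
}

# Usage:
#   grant_endpoints_read staging default
```

Because the binding is a RoleBinding rather than a ClusterRoleBinding, the service account can read endpoints in its own namespace only, even though the role itself is cluster-scoped.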

-- ffledgling
Source: StackOverflow