API auth error connecting Prometheus to Kubernetes (OpenShift Origin)

11/23/2017

I have a Kubernetes cluster (OpenShift Origin v3.6) and Prometheus (v1.8.1) running on two separate servers. I am trying to monitor Kubernetes with Prometheus.

I retrieved the default service account token and saved it to /etc/prometheus/token.

oc sa get-token default

Then I added this to the Prometheus configuration file:

...
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://my.kubernetes.master:8443'
    scheme: https
    bearer_token_file: /etc/prometheus/token
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
...

After restarting Prometheus, I see the following error logged over and over again:

Nov 23 22:43:05 prometheus prometheus[17830]: time="2017-11-23T22:43:05Z" level=error msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:183: Failed to list *v1.Pod: User "system:anonymous" cannot list all pods in the cluster" component="kube_client_runtime" source="kubernetes.go:76"

I found this here:

If no access token or certificate is presented, the authentication layer assigns the system:anonymous virtual user and the system:unauthenticated virtual group to the request. This allows the authorization layer to determine which requests, if any, an anonymous user is allowed to make.

I believe my configuration is wrong somewhere, and Prometheus is not using the token to authenticate.

So, what's wrong with my setup and how can I fix it? Thanks in advance.

-- greenLED
kubernetes
openshift
openshift-origin
prometheus

1 Answer

11/24/2017

Let's begin with authentication. You have provided Prometheus with the default service account token, which means it is authenticated normally: the API server knows who it is.

Authorization is what is causing the problem, as you can see here:

User "system:anonymous" cannot list all pods in the cluster

This means the authenticated service account does not have permission to perform this operation, so the request is rejected.

Solution to your problem

  • Check whether a suitable ClusterRole is available for Prometheus, since Prometheus needs cluster-wide permissions to do its job. If not, create one.
  • Check whether a ClusterRoleBinding exists for the default service account. If not, create one.
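
The two checks above could translate into a manifest along these lines. This is only a sketch: the object name prometheus-discovery is hypothetical, the namespace assumes the token was issued for the default service account in the default project, and the rbac.authorization.k8s.io/v1 API version is an assumption that may not match an older cluster such as Origin v3.6 (it may need v1beta1 or the cluster's own policy tooling instead).

```yaml
# Sketch only: names, namespace and API version are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-discovery          # hypothetical name
rules:
  # Read-only access to the objects Prometheus service discovery lists and watches
  - apiGroups: [""]
    resources: ["nodes", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # Allow reading the /metrics endpoint exposed by the API server
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-discovery          # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-discovery
subjects:
  - kind: ServiceAccount
    name: default                     # the service account whose token Prometheus uses
    namespace: default                # assumption: project the token was issued in
```

Saved to a file, this could be created with oc create -f by a cluster administrator. The role is deliberately read-only, since discovery only needs get, list and watch.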

I have attached a link for further reading on RBAC.
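
One more thing that may be worth checking, since the log names system:anonymous: in Prometheus the job-level bearer_token_file is used when scraping targets, while each kubernetes_sd_configs entry has its own bearer_token_file and tls_config settings for the discovery requests. A minimal sketch based on the configuration in the question, assuming the same token file is reused for both and using a hypothetical CA file path:

```yaml
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://my.kubernetes.master:8443'
        # Credentials for the discovery (list/watch) calls to the API server
        bearer_token_file: /etc/prometheus/token
        tls_config:
          ca_file: /etc/prometheus/ca.crt   # hypothetical path to the cluster CA
    scheme: https
    # Credentials for scraping the discovered targets
    bearer_token_file: /etc/prometheus/token
    tls_config:
      ca_file: /etc/prometheus/ca.crt       # hypothetical path to the cluster CA
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
```

If the discovery requests then show up under the service account's name rather than system:anonymous, any remaining "cannot list" error would point at missing permissions for that account.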

-- Suresh Vishnoi
Source: StackOverflow