Prometheus to monitor EKS (AWS) nodes

5/15/2019

I am trying to set up Prometheus to monitor Kubernetes nodes (EKS, version 1.10) with the following scrape_configs:

  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
    - api_servers:
      - 'https://kubernetes.default.svc.cluster.local'
      in_cluster: true
      role: node

    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - target_label: __scheme__
      replacement: https

node-exporter:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    name: node-exporter
spec:
  template:
    metadata:
      labels:
        name: node-exporter
      annotations:
         prometheus.io/scrape: "true"
         prometheus.io/port: "9100"
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
        - ports:
            - containerPort: 9100
              protocol: TCP
          resources:
            requests:
              cpu: 0.15
          securityContext:
            privileged: true
          image: prom/node-exporter:v0.15.2
          args:
            - --path.procfs
            - /host/proc
            - --path.sysfs
            - /host/sys
            - --collector.filesystem.ignored-mount-points
            - '^/(sys|proc|dev|host|etc)($|/)'
          name: node-exporter
          volumeMounts:
            - name: dev
              mountPath: /host/dev
            - name: proc
              mountPath: /host/proc
            - name: sys
              mountPath: /host/sys
            - name: rootfs
              mountPath: /rootfs
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: dev
          hostPath:
            path: /dev
        - name: sys
          hostPath:
            path: /sys
        - name: rootfs
          hostPath:
            path: /
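A note on the prometheus.io/* annotations in the DaemonSet above: they are only a convention, not built-in Prometheus behavior, so they have no effect unless a pod-level scrape job honors them via relabeling. A minimal sketch of such a job, modeled on the upstream Prometheus example configuration (the exact relabel rules are an assumption about how the annotations are meant to be consumed):

```yaml
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    # Keep only pods annotated with prometheus.io/scrape: "true"
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # Rewrite the target port from the prometheus.io/port annotation
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
```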

With this config, the Prometheus pod fails with:

level=error ts=2019-05-15T15:18:45.472Z caller=main.go:717 
err="error loading config from \"/etc/prometheus/prometheus.yml\": 
couldn't load configuration (--config.file=\"/etc/prometheus/prometheus.yml\"): parsing YAML file 
/etc/prometheus/prometheus.yml: yaml: unmarshal errors:\n  line 33: field api_servers not found in type kubernetes.plain\n  
line 35: field in_cluster not found in type kubernetes.plain"

Update:

Corrected the scrape_configs to:

  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
    - role: endpoints
    scheme: http
    relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https

  - job_name: 'kubernetes-nodes'
    scheme: http
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics

Node metrics are not showing up in the Prometheus UI.
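One detail worth checking in the kubernetes-nodes job above: it rewrites __address__ to kubernetes.default.svc:443, the API server's HTTPS port, while leaving scheme: http, so the scrape cannot complete a TLS handshake. The upstream Prometheus example configuration reaches that same proxy path over HTTPS using the in-cluster service-account credentials; a hedged sketch of that variant:

```yaml
  - job_name: 'kubernetes-nodes'
    scheme: https                 # must match port 443 on the API server
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics
```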

-- roy
amazon-eks
aws-eks
kubernetes
prometheus

1 Answer

5/16/2019

First, what's wrong with just omitting the API server, since that's the default behavior anyway? You're not customizing it; you're just generating error messages.

Second, what's wrong with reading the fine manual, which clearly says api_server:, not the plural (what would it even mean to have multiple of them?!)

Third, there are so many mechanisms for installing a working Prometheus; why not learn from what they have to offer, even if you don't end up using them?
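Putting the first two points together, a minimal node-discovery block under current Prometheus 2.x syntax would look something like this (a sketch, not a complete scrape job):

```yaml
    kubernetes_sd_configs:
    - role: node
      # api_server omitted: in-cluster discovery via the pod's service
      # account is the default, replacing the old api_servers/in_cluster keys
```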

-- mdaniel
Source: StackOverflow