Query on custom metrics exposed via prometheus node exporter textfile collector fails

6/9/2018

I am new to Prometheus/Alertmanager.

I have created a cron job that runs a shell script every minute. The script writes a "test.prom" file (containing a gauge metric) into the directory passed to node-exporter via the --textfile.collector.directory argument. I verified with curl http://localhost:9100/metrics that node-exporter exposes the custom metric correctly.
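For reference, a minimal sketch of what such a script could look like. The function name write_metric and the value passed in are hypothetical; the directory is whatever was given to --textfile.collector.directory. It writes to a temporary file first and then renames it, so node-exporter should never read a half-written file (the collector only picks up files ending in .prom).

```shell
#!/bin/sh
# Hypothetical helper: write one gauge to <dir>/test.prom atomically.
# $1 = textfile collector directory, $2 = metric value.
write_metric() {
    dir=$1
    value=$2
    # The temporary name does not end in .prom, so the collector ignores it.
    tmp="$dir/test.prom.$$"
    {
        printf '# HELP test_metric Example gauge written by a cron job.\n'
        printf '# TYPE test_metric gauge\n'
        printf 'test_metric %s\n' "$value"
    } > "$tmp" && mv "$tmp" "$dir/test.prom"
}
```

The cron job would then call something like write_metric /path/to/textfile/dir 42 once a minute, with the path and value replaced by real ones.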

When I query that custom metric in the Prometheus dashboard, it returns no results (it says no data found).

I cannot figure out why the query against a metric exposed via the node-exporter textfile collector fails. Any clues as to what I missed? Also, how can I check that Prometheus actually scraped my custom metric 'test_metric'?

My query in the Prometheus dashboard is test_metric != 0, which returns no results, even though test_metric is exposed via the node-exporter textfile collector.
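One way to check from the command line whether Prometheus has scraped the metric is to ask its HTTP API directly. This is only a sketch: the base URL http://localhost:9090 is an assumption (9090 is the default port), and the helper names are made up for illustration. The /api/v1/targets and /api/v1/query endpoints are part of the Prometheus HTTP API.

```shell
#!/bin/sh
# Base URL of the Prometheus server (assumption; override via the environment).
PROM_URL=${PROM_URL:-http://localhost:9090}

# Build the instant-query URL for a plain metric name (no URL encoding is done,
# so this only suits simple expressions such as a bare metric name).
query_url() {
    printf '%s/api/v1/query?query=%s\n' "$PROM_URL" "$1"
}

# List the scrape targets and their health as JSON.
show_targets() {
    curl -s "$PROM_URL/api/v1/targets"
}

# Ask Prometheus for the current value of a metric as JSON.
query_metric() {
    curl -s "$(query_url "$1")"
}
```

If show_targets does not list the node-exporter endpoint at all, Prometheus is not scraping it, and query_metric test_metric will come back with an empty result.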

Any help is appreciated!

BTW, node-exporter is running as a Docker container in a Kubernetes environment.

-- user_2011
docker
kubernetes
prometheus
prometheus-alertmanager
promql

3 Answers

6/13/2018

My bad. I had not included scrape instructions for node-exporter in the prometheus.yaml file. It worked after I added them.
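For anyone hitting the same thing, a scrape section along these lines is the kind of entry that was missing. This is a sketch, not the exact configuration used here: the job name and the target host name are assumptions, and in a Kubernetes cluster the targets are often discovered with kubernetes_sd_configs rather than listed statically.

```yaml
scrape_configs:
  - job_name: node-exporter              # hypothetical job name
    static_configs:
      - targets: ['node-exporter:9100']  # assumed host name; 9100 is the port used in the question
```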

-- user_2011
Source: StackOverflow

11/5/2018

I had a similar situation, but it was not a configuration problem.

Instead, my data included timestamps:

# HELP network_connectivity_rtt Round Trip Time to each node
# TYPE network_connectivity_rtt gauge
network_connectivity_rtt{host="home"} 53.87 1541426242
network_connectivity_rtt{host="hop_1"} 58.8 1541426242
network_connectivity_rtt{host="hop_2"} 21.93 1541426242
network_connectivity_rtt{host="hop_3"} 71.69 1541426242

The Prometheus node exporter (PNE) was picking them up without any problem once I reloaded it. As Prometheus is running under systemd, I had to check the logs like this:

journalctl --system -u prometheus.service --follow

There I read this line:

msg="Error on ingesting samples that are too old or are too far into the future"

Once I removed the timestamps, values started appearing. This led me to read about timestamps in more detail, and I found out they have to be in milliseconds. So this format is now OK:

# HELP network_connectivity_rtt Round Trip Time to each node
# TYPE network_connectivity_rtt gauge
network_connectivity_rtt{host="home"} 50.47 1541429581376
network_connectivity_rtt{host="hop_1"} 3.38 1541429581376
network_connectivity_rtt{host="hop_2"} 11.2 1541429581376
network_connectivity_rtt{host="hop_3"} 20.72 1541429581376
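If the script producing the file only has a timestamp in seconds (for example from date +%s), the conversion can be sketched like this. The helper names to_millis and rtt_sample are made up for illustration, and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# Convert a Unix timestamp in seconds to milliseconds.
to_millis() {
    echo $(( $1 * 1000 ))
}

# Print one sample line with a millisecond timestamp.
# $1 = host label, $2 = value, $3 = timestamp in seconds.
rtt_sample() {
    host=$1
    value=$2
    seconds=$3
    printf 'network_connectivity_rtt{host="%s"} %s %s\n' \
        "$host" "$value" "$(to_millis "$seconds")"
}
```

For example, rtt_sample home 53.87 "$(date +%s)" prints a line for the home host stamped with the current time in milliseconds.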

I hope it helps someone else.

-- Marcos Dione
Source: StackOverflow

2/26/2020

This issue happens because of stale metrics. Say you wrote your metric to the file at 13:00: by default, Prometheus considers the metric stale after 5 minutes, so it may have disappeared by the time you run the query.
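As a rough illustration of that window, assuming the default five-minute lookback (the --query.lookback-delta flag) has not been changed, the check below compares a sample time with a query time, both in seconds. The function name is hypothetical and the boundary handling is approximate.

```shell
#!/bin/sh
# Default lookback window for instant queries, in seconds (5 minutes).
LOOKBACK_SECONDS=300

# Succeeds (exit status 0) when a sample taken at $1 is older than the
# lookback window at query time $2, i.e. the query would likely miss it.
outside_lookback() {
    sample_ts=$1
    query_ts=$2
    [ $(( query_ts - sample_ts )) -gt "$LOOKBACK_SECONDS" ]
}
```

A sample written at 13:00:00 and queried at 13:06:00 is 360 seconds old, so outside_lookback succeeds for it, while the same sample queried at 13:03:00 is still inside the window.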

-- Dinesh Kumar
Source: StackOverflow