I have a clean AKS cluster to which I deployed the prometheus-operator chart. The Grafana pod is showing a ton of restarts. My cluster version is 1.11.3. Grafana pod logs below. Has anyone else encountered this issue?
File in configmap grafana-dashboard-k8s-node-rsrc-use.json ADDED
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 543, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 302, in _error_catcher
    yield
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 598, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 547, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/sidecar.py", line 58, in <module>
    main()
  File "/app/sidecar.py", line 54, in main
    watchForChanges(label, targetFolder)
  File "/app/sidecar.py", line 23, in watchForChanges
    for event in w.stream(v1.list_config_map_for_all_namespaces):
  File "/usr/local/lib/python3.6/site-packages/kubernetes/watch/watch.py", line 124, in stream
    for line in iter_resp_lines(resp):
  File "/usr/local/lib/python3.6/site-packages/kubernetes/watch/watch.py", line 45, in iter_resp_lines
    for seg in resp.read_chunked(decode_content=False):
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 626, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/site-packages/urllib3/response.py", line 320, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
Based on the Prometheus operator repository... the sidecar container in the Grafana pod is what's crashing here. It watches the ConfigMaps that define the dashboards and reloads/refreshes them in Grafana, and the traceback shows its watch connection to the Kubernetes API being dropped mid-stream (the IncompleteRead), which kills the container.
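To confirm which container is actually crash-looping, per-container restart counts help. Something like this works (the namespace and pod name below are placeholders for your release):

kubectl -n monitoring get pod <grafana-pod-name> \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'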
That said, the restart count could also be a symptom of the Grafana container itself failing... can you check the logs of the Grafana container inside your Grafana pod?
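For example (namespace, pod name, and container name below are placeholders/chart defaults, so adjust for your release):

# List the containers in the pod first, since names vary by chart version
kubectl -n monitoring get pod <grafana-pod-name> -o jsonpath='{.spec.containers[*].name}'
# Then pull logs from the Grafana container specifically
kubectl -n monitoring logs <grafana-pod-name> -c grafana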
This can be fixed by updating to a more recent version of the sidecar container image; it's a known bug there that has since been fixed.
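For context on the bug: the API server routinely closes long-lived watch connections, and older sidecar versions let the resulting ProtocolError propagate out of the watch loop in sidecar.py, crashing the container. Newer images essentially retry the watch instead. A minimal sketch of that pattern using the kubernetes Python client (illustrative only, not the sidecar's actual code):

import time

import urllib3
from kubernetes import client, config, watch

def watch_for_changes(label, target_folder):
    # Assumes the process runs inside the pod with a service account.
    config.load_incluster_config()
    v1 = client.CoreV1Api()
    while True:
        w = watch.Watch()
        try:
            for event in w.stream(v1.list_config_map_for_all_namespaces):
                cm = event["object"]
                if label in (cm.metadata.labels or {}):
                    # Write/delete the dashboard JSON under target_folder here.
                    pass
        except urllib3.exceptions.ProtocolError:
            # The API server closed the watch mid-stream (the IncompleteRead
            # above). Back off and re-establish the watch instead of crashing.
            time.sleep(5)

How you pin a newer sidecar image depends on your chart version; check the grafana subchart's values.yaml for its sidecar image settings rather than guessing the key.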