Setting HTTP request limits on Kubernetes Pods

12/7/2018

I'm running Apache Airflow on Kubernetes and running into a strange error when trying to pull log files.

*** Failed to fetch log file from worker. HTTPConnectionPool(host='geometrical-galaxy-7364-worker-0.geometrical-galaxy-7364-worker.astronomer-geometrical-galaxy-7364.svc.cluster.local', port=8793): Max retries exceeded with url: /log/FILE/begin/2018-12-06T00:00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7e86dab7b8>: Failed to establish a new connection: [Errno 111] Connection refused',))

It looks to me like too many requests are being made to the StatefulSet (if I jump into the pod that holds the log files, they're all there, but the UI that's trying to fetch them never gets them).

Is there somewhere that a limit on HTTP requests for a StatefulSet or a Pod gets set?

-- Viraj Parekh
airflow
google-kubernetes-engine
kubernetes
python-3.x

2 Answers

12/10/2018

There is no way to set a limit on the number of HTTP requests to a Pod at the Kubernetes level. You can review the full breakdown of the StatefulSet spec here, and you will see that there is no field for limiting these requests.

A limiting factor for new HTTP requests could instead be the container image you are using. As an example, the Apache web server's connection limits can be found here. Any such limitation is likely built into the Airflow container you are using; unfortunately, I can't find documentation that discusses this limit or how to increase it.
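
If you want to verify the first point yourself, here is a minimal sketch using the official kubernetes Python client to dump the fields of the worker StatefulSet spec; the name and namespace are guesses based on the hostname in your error, so substitute your own:

    # Minimal sketch: dump the StatefulSet spec fields with the official
    # kubernetes Python client. The name/namespace below are assumptions
    # based on the hostname in the error message.
    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    apps = client.AppsV1Api()

    sts = apps.read_namespaced_stateful_set(
        name="geometrical-galaxy-7364-worker",
        namespace="astronomer-geometrical-galaxy-7364",
    )

    # The spec only exposes replica, template, volume and update-strategy
    # fields -- nothing that throttles inbound HTTP requests.
    print(sorted(sts.spec.to_dict().keys()))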

-- Patrick W
Source: StackOverflow

12/12/2018

I'm fairly certain the error you're seeing comes from Airflow trying to fetch task logs from a worker via requests, which uses urllib3 under the hood and retries failed HTTP requests.
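
Roughly speaking, that fetch looks like the sketch below: a plain HTTP GET against the worker's log-serving port (8793 by default). This is an approximation for illustration, not Airflow's exact code; the hostname and log path are placeholders copied from your error:

    # Approximation of the webserver's log fetch: an HTTP GET against the
    # worker's log server on port 8793. Hostname and log path are
    # placeholders copied from the error message.
    import requests

    worker_host = (
        "geometrical-galaxy-7364-worker-0.geometrical-galaxy-7364-worker"
        ".astronomer-geometrical-galaxy-7364.svc.cluster.local"
    )
    url = "http://{}:8793/log/FILE/begin/2018-12-06T00:00:00/1.log".format(worker_host)

    try:
        resp = requests.get(url, timeout=5)
        resp.raise_for_status()
        print(resp.text)
    except requests.exceptions.ConnectionError as exc:
        # "Connection refused" means nothing is listening on 8793 on the
        # worker: serve-logs isn't running, or the port is blocked.
        print("Could not reach the worker log server: {}".format(exc))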

Your webserver is attempting to fetch the logs, the worker is refusing the connection, and the request errors out. Make sure you're running airflow serve-logs on all workers and that the port (8793 in your error) is open from your webserver to each of them.
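
As a quick sanity check, something like the following, run from the webserver pod, will tell you whether each worker is actually listening on that port (the worker hostname here is a placeholder, so list your own pod DNS names):

    # Quick connectivity check, run from the webserver pod, to confirm each
    # worker is listening on the log-serving port. The hostname below is a
    # placeholder -- list your own worker pod DNS names.
    import socket

    workers = [
        "geometrical-galaxy-7364-worker-0.geometrical-galaxy-7364-worker"
        ".astronomer-geometrical-galaxy-7364.svc.cluster.local",
    ]
    LOG_PORT = 8793  # Airflow's default worker log server port

    for host in workers:
        try:
            with socket.create_connection((host, LOG_PORT), timeout=3):
                print("{}:{} reachable".format(host, LOG_PORT))
        except OSError as exc:
            # Refused or timed out: serve-logs not running, or a
            # NetworkPolicy/firewall is blocking the port.
            print("{}:{} NOT reachable ({})".format(host, LOG_PORT, exc))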

-- joeb
Source: StackOverflow