I have JupyterHub 0.7.0 deployed to AWS managed Kubernetes (EKS).
I need to collect metrics for users that are downloading files from their individual user notebook servers.
Questions:
Note: By "file download" I am referring to the "Download" button that is available on the Jupyter home page (see the provided screenshot).
I was able to get the data I need from AWS ELB access logs.
This required me to change a configuration setting. Specifically, the "proxy-public" ELB listener must listen for HTTP traffic, not TCP traffic. (The "proxy-public" ELB listener is created implicitly when JupyterHub is installed with Helm.)
The ELB access logs have a "request" property. For a file download, the "request" property is formatted like so:
GET https://{DOMAIN}:443/user/{USERNAME}/files/{FILENAME}?download=1 HTTP/1.1
Where DOMAIN is the hosted domain for JupyterHub
Where USERNAME is the JupyterHub user
Where FILENAME is the file that was downloaded
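To turn that request format into per-user metrics, something like the following might work. This is only a minimal sketch: it assumes the "request" field has already been pulled out of each ELB access log line as a plain string, and the names parse_download and count_downloads, as well as the hub.example.com domain, are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, unquote, urlsplit

# Matches the overall shape "METHOD URL HTTP/x.y" of the request field.
_REQUEST_RE = re.compile(r"^(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+$")
# Matches the path layout /user/{USERNAME}/files/{FILENAME}.
_PATH_RE = re.compile(r"^/user/(?P<user>[^/]+)/files/(?P<filename>.+)$")


def parse_download(request):
    """Return (username, filename) for a file-download request, else None."""
    match = _REQUEST_RE.match(request.strip())
    if match is None or match.group("method") != "GET":
        return None
    parts = urlsplit(match.group("url"))
    # Only requests carrying ?download=1 are treated as downloads.
    if parse_qs(parts.query).get("download") != ["1"]:
        return None
    path_match = _PATH_RE.match(parts.path)
    if path_match is None:
        return None
    # Usernames and filenames may be percent-encoded in the URL.
    return unquote(path_match.group("user")), unquote(path_match.group("filename"))


def count_downloads(requests):
    """Count downloads per (username, filename) over an iterable of request strings."""
    counts = Counter()
    for request in requests:
        parsed = parse_download(request)
        if parsed is not None:
            counts[parsed] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "GET https://hub.example.com:443/user/alice/files/report.csv?download=1 HTTP/1.1",
        "GET https://hub.example.com:443/user/alice/tree HTTP/1.1",
    ]
    for (user, filename), n in count_downloads(sample).items():
        print(user, filename, n)
```

Requests that are not downloads (no download=1 query parameter, or a path outside /user/{USERNAME}/files/) are simply skipped, so the whole access log can be fed through count_downloads without pre-filtering.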
You should be able to do that by getting the logs from /var/log/jupyterhub.log and looking for the name of the file you want metrics for.
If you provide the logs I might be able to help further.
But if the needed logs are not there, I think you will need to get them directly from each Jupyter user's container:
docker logs jupyter-<user_name>