Production logging on Docker orchestrator

7/8/2019

I have developed a web app which consists of a frontend (VueJS app) and a backend (Flask Python3 app). In production, I will be using Docker with Kubernetes as the orchestrator.

I have set up Flask logging to write to two different files in the directory

/var/log/appname

and it is working fine.

The problem is: in production, if Kubernetes instantiates multiple backends at the same time, every container will produce its own log files.

Is there a way to centralize the logging process so that there is only one file?

I am sorry, but I am not an expert in orchestration, so maybe this question does not even make sense.

Thanks.

-- Massimo Lavermicocca
docker
flask
kubernetes
python-3.x

1 Answer

7/9/2019

To be honest, I have mixed feelings about centralizing logs from multiple containers into one file, since the data will be mixed together. A good solution, though, is to attach an extra volume.

It's possible to create a Docker/Kubernetes volume (or many volumes) pointed at where those log files reside within the application. By leveraging Docker templating, it's possible to suffix each volume name with the service task name, which includes the slot number (an integer) and the task ID. Placing a suffix on the volume name prevents collisions in the logging paths should multiple service tasks run on the same host. A global service then needs to be created that runs a logging agent with directory wildcard support. Finally, additional regex can be set up in the logging utility that turns the source directory of the file into an indexed value.

The example below shows how this can be accomplished using the official Tomcat image. (Note that the example uses Docker Swarm service commands, though the same idea applies to Kubernetes volumes.) The official Tomcat image writes several log files to /usr/local/tomcat/logs, much like most Java applications. In that path, files such as catalina.2017-07-06.log, host-manager.2017-07-06.log, localhost.2017-07-06.log, localhost_access_log.2017-07-06.txt, and manager.2017-07-06.log can be found.

  1. Create a global service for the logging utility that mounts /var/lib/docker/volumes:/log/volumes.
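
    As a sketch, such a global service could look like the following (log-agent and my-logging-agent:latest are placeholder names, not part of the original answer):

    docker service create \
    -d \
    --name log-agent \
    --mode global \
    --mount type=bind,src=/var/lib/docker/volumes,dst=/log/volumes \
    my-logging-agent:latest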

  2. Configure the logging agent with a rule that uses a directory wildcard, similar to this generic example: "/log/volumes/*/_data/*.log".
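
    The answer does not name a specific agent; as one hypothetical example, a Filebeat input with wildcard paths could implement this rule:

    filebeat.inputs:
      - type: log
        paths:
          - /log/volumes/*/_data/*.log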

  3. Launch the application service using Go-based templating for the volume name. When launching the service, use these parameters:

    docker service create \
    -d \
    --name prod-tomcat \
    --label version=1.5 \
    --label environment=prod \
    --mount type=volume,src="{{.Task.Name}}",dst=/usr/local/tomcat/logs \
    --replicas 3 \
    tomcat:latest
    

    If both replicas are scheduled on the same node, then two volumes containing the logs will be created on that host: prod-tomcat.1.oro7h0e5yow69p9yumaetor3l and prod-tomcat.2.ez0jpuqe2mkl6ppqnuffxqagl.
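
    To verify, the volumes on that node can be listed (the exact names will differ per deployment):

    docker volume ls --filter name=prod-tomcat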

  4. As long as the logging agent supports wildcards and handles log rotation by checking the inode (not the file name), the logs should be collected.

  5. If the application logs to multiple locations, then try to symlink those locations into a single directory, or add a descriptive name to the volume. If a descriptive name is added to the volume name, then any extraction logic will need to be updated to accommodate that change (e.g. the grok pattern below). A symlink sketch follows.
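
    As a sketch of the symlink approach (the /var/log/other path is hypothetical), a secondary log directory can be replaced inside the image with a symlink into the volume-backed directory, so that all writes land in one volume:

    # Dockerfile fragment: /var/log/other now resolves to the mounted logs directory
    RUN rm -rf /var/log/other && \
        ln -s /usr/local/tomcat/logs /var/log/other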

  6. Most loggers should collect the file path as well as the log contents. By turning the volumes where the log files reside into indexable fields, it's possible to search and aggregate information from these types of applications. Here is an example that uses a grok pattern and creates two new indexable fields, CONTAINER_NAME and FILE_NAME.

    match => { "path" => "/log/volumes/%{DATA:CONTAINER_NAME}/_data/%{GREEDYDATA:FILE_NAME}" }
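
    Assuming Logstash as the processing pipeline (the match syntax above suggests it), the pattern would sit inside a filter block like this:

    filter {
      grok {
        # split the volume path into container name and file name fields
        match => { "path" => "/log/volumes/%{DATA:CONTAINER_NAME}/_data/%{GREEDYDATA:FILE_NAME}" }
      }
    }
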
  7. The CONTAINER_NAME will match the name used in the container's stdout stream, making it easy to filter based on the container's logs.

A working example can be found in the swarm-elk repo.

More information about the logging process can be found here: logging.

-- MaggieO
Source: StackOverflow