Removing color codes from Java messages in Kubernetes with fluentbit


We are trying FluentBit because we couldn't manage to remove color codes from multiline text with FluentD.

On Kubernetes, I have an application that writes Java error traces to STDOUT/STDERR, so Kubernetes merges those logs into /var/log/containers/* in JSON format.

I can break down and create the multiline filter successfully, but I cannot remove the color codes from the output.

I followed those instructions with no success, although they are for version 1.2 and we are using 1.5, where it's recommended to remove the Format encoding. Please find the examples and my fluentbit config below:

Log entry:

{"log":"\u001b[2m2021-06-03 15:34:27.056\u001b[0;39m \u001b[31mERROR [account,9decd63637c3167b,9decd63637c3167b,true]\u001b[0;39m \u001b[35m1\u001b[0;39m \u001b[2m---\u001b[0;39m \u001b[2m[nio-8080-exec-4]\u001b[0;39m \u001b[36md.q.a.c.c.GenericControllerAdvice       \u001b[0;39m \u001b[2m:\u001b[0;39m An exception occurred: GenericResponse(status=400, code=1.2, message=Invalid object, path=/)\r\n","stream":"stdout","time":"2021-06-03T15:34:27.056731062Z"}

Log field output in Kibana:

�[0;39m �[35m1�[0;39m �[2m---�[0;39m �[2m[nio-8080-exec-4]�[0;39m �[36md.q.a.c.c.GenericControllerAdvice �[0;39m �[2m:�[0;39m An exception occurred: GenericResponse(status=400, code=1.2, message=Invalid object, path=/)

As you can see, color codes such as "[0;39m", "[35m" and "[2m" are still saved in ES, with or without the Format field set to "encode", "encode_utf8" or "json" in the Parser applied.
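To make the goal concrete: the sequences in the log entry above are ANSI SGR color codes, where the `\u001b` in the JSON is the ESC character (0x1B), followed by `[`, digits and semicolons, and a final `m`. Below is a minimal Python sketch, run outside Fluent Bit, of the output we are after. The pattern and the `strip_ansi` helper name are our own assumptions for illustration, not something Fluent Bit provides.

```python
import json
import re

# ESC (0x1B), "[", optional digits/semicolons, then "m" (SGR color codes only).
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(text: str) -> str:
    """Remove ANSI SGR color sequences from text (hypothetical helper)."""
    return ANSI_SGR.sub("", text)

# Shortened version of the log entry above, as written to /var/log/containers.
raw = (
    r'{"log":"\u001b[2m2021-06-03 15:34:27.056\u001b[0;39m '
    r'\u001b[31mERROR [account,9decd63637c3167b,9decd63637c3167b,true]\u001b[0;39m '
    r'An exception occurred\r\n","stream":"stdout",'
    r'"time":"2021-06-03T15:34:27.056731062Z"}'
)

record = json.loads(raw)  # decoding the JSON turns \u001b into the real ESC character
clean = strip_ansi(record["log"]).rstrip("\r\n")
print(clean)
```

Note that the pattern has to include the ESC character itself. Matching only the `[0;39m` part would leave a stray control character in front of every field, which is what Kibana renders as `�`.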

I also tried creating a filter with the regex "(\s\S*?)([0-9m|[0-9m|[0;0-9{2}m|[0-9{2}m|)", which should work according to regex testers, but unfortunately the logs arrive empty in Kibana if we use this filter.
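One possible explanation for the empty logs, checked here with Python's `re` module rather than Fluent Bit's own regex engine (so treat the equivalence as an assumption): the regex of the `java` parser in the config below combines a lazy group with an empty final alternative, so it can succeed without consuming anything. The sample `line` is made up for the test.

```python
import re

# Regex of the `java` parser from the config, copied as-is.
JAVA_PARSER = re.compile(
    r"([\s\S]*?)(\[[0-9]m|\[[0-9][0-9]m|\[0;[0-9]{2}m|\[[0-9]{2}m|)"
)

# Made-up sample: ESC-prefixed color codes around a log level.
line = "\x1b[31mERROR\x1b[0;39m An exception occurred"

m = JAVA_PARSER.search(line)
# The lazy group takes as little as possible and the trailing empty
# alternative always succeeds, so the match is empty at position 0.
print(repr(m.group(0)), m.span())  # prints: '' (0, 0)
```

The regex also has no named capture groups. As far as we understand, a Fluent Bit regex parser builds the resulting record from named groups such as `(?<log>...)`, so a pattern without them would have nothing to emit even if it matched more text.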

Here is the config we are currently working with:

fluent-bit.conf: |
  [SERVICE]
    Flush                     1
    Log_Level                 debug
    Daemon                    off
    Parsers_File              custom-parsers.conf

  [INPUT]
    Name                      tail
    Tag                       kube.*
    Path                      /var/log/containers/*service*.log
    Multiline                 On
    #Multiline_Flush           5
    Parser_Firstline          firstline
    Parser_1                  line_0

  [OUTPUT]
    Name                      es
    Match                     kube.*
    Host                      ${FLUENT_ELASTICSEARCH_HOST}
    Port                      ${FLUENT_ELASTICSEARCH_PORT}
    Logstash_Format           On
    Retry_Limit               False
    Logstash_Prefix           ${FLUENT_ELASTICSEARCH_INDEX}
    Time_Key                  time
    Generate_ID               On

  [FILTER]
    Name                      parser
    Match                     kube.*
    Key_Name                  log
    Parser                    java

custom-parsers.conf: |
  [PARSER]
    Name                      firstline
    Format                    regex
    Regex                     ^(\{"log":")?(\\[a-z]\d{3}[a-z]\[\dm)?(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3})(\\[a-z]\d{3}[a-z]\[[\w; ]+)? (\\[\w]+\[[\w]+)? ?([\w]\\\[+)?(?<level>WARN|INFO|ERROR) \[(?<service>[a-z]+),([\w]+)?,([\w]+)?,(true|false)?\](?<log>([\s\S]*?)(\[[0-9]m|\[[0-9][0-9]m|\[0;[0-9]{2}m|\[[0-9]{2}m|))","stream":.*$
    Time_Key                  time
    Time_Format               %Y-%m-%d %H:%M:%S.%L
    #Decode_Field_As           json         log     do_next
    Decode_Field_As           escaped_utf8 log

  [PARSER]
    Name                      line_0
    Format                    regex
    Regex                     ^(\{"log":")?(\\[a-z]\d{4})?(?<log>([\s\S]*?)(\[[0-9]m|\[[0-9][0-9]m|\[0;[0-9]{2}m|\[[0-9]{2}m|))\\r\\n","stream":.*$
    #Decode_Field_As           json         log     do_next
    Decode_Field_As           escaped_utf8 log

  [PARSER]
    Name                      java
    Format                    regex
    Regex                     ([\s\S]*?)(\[[0-9]m|\[[0-9][0-9]m|\[0;[0-9]{2}m|\[[0-9]{2}m|)
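For comparison, here is a rough Python sketch of how much simpler the `firstline` pattern could be if the color codes were already stripped before parsing. This is only an illustration under that assumption. The group names `trace`, `span` and `exported` are hypothetical labels for the unnamed groups in our regex, and Python writes named groups as `(?P<name>...)` where the Fluent Bit parser uses `(?<name>...)`.

```python
import re

# Simplified first-line pattern for text whose color codes are already removed.
FIRSTLINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>WARN|INFO|ERROR)\s+"
    r"\[(?P<service>[a-z]+),(?P<trace>\w*),(?P<span>\w*),(?P<exported>true|false)?\]"
    r"(?P<log>[\s\S]*)$"
)

# The log entry from above with the ANSI sequences removed (shortened).
clean = (
    "2021-06-03 15:34:27.056 ERROR "
    "[account,9decd63637c3167b,9decd63637c3167b,true] "
    "1 --- [nio-8080-exec-4] d.q.a.c.c.GenericControllerAdvice : "
    "An exception occurred"
)

m = FIRSTLINE.match(clean)
fields = m.groupdict() if m else {}
print(fields["time"], fields["level"], fields["service"])
```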
-- Carlos Eduardo Santos
