How to parse istio-proxy logs with fluent-bit

8/22/2019

I need help figuring out how to parse istio-proxy logs with fluent-bit, or how to make fluent-bit parsing more "debuggable" (so I can see which part is failing).

My fluent-bit config is below:

  input-kubernetes.conf: |
    [INPUT]
        Name              tail 
        Tag_Regex         (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
        Tag               k8s_containers.<namespace_name>.<container_name>.<pod_name>.<docker_id>-                                                                                                                  
        Path              /var/log/containers/*.log       
        Exclude_Path      /var/log/containers/mailhog*    
        Parser            docker        
        DB                /var/log/flb_kube.db            
        Mem_Buf_Limit     5MB  
        Skip_Long_Lines   On   
        Refresh_Interval  10   

  filter-kubernetes.conf: |    
    [FILTER]
        Name                kubernetes
        Match               k8s_containers.**
        Kube_Tag_Prefix     k8s_containers.
        Kube_URL            https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT_HTTPS}
        tls.verify          Off
        K8S-Logging.Parser  On 
        K8S-Logging.Exclude On 
        Merge_Log           On 
        Merge_Log_Key       log_processed
        Regex_Parser        k8s-custom-tag

    [FILTER]
        Name                istioproxy
        Match               k8s_containers.*.istio-proxy    
        Key_Name            log
        Parser              envoy

  output-elasticsearch.conf: |
    [OUTPUT]
        Name            es
        Match           *
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Replace_Dots    On
        Retry_Limit     False
        Time_Key        timestamp-es
        Include_Tag_Key On
        tls             On
        tls.verify      Off
        tls.crt_file    /fluent-bit/etc/certs/tls.crt
        tls.key_file    /fluent-bit/etc/certs/tls.key
        HTTP_User       elastic
        HTTP_Passwd     ${ELASTIC_PASSWORD}

<<snip>>

  parsers.conf: |
    [PARSER]
        Name    k8s-custom-tag
        Format  regex
        Regex   (?<namespace_name>[^_]+)\.(?<container_name>.+)\.(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)\.(?<docker_id>[a-z0-9]{64})-$


    [PARSER]
        Name        nginx-ingress
        Format      regex
        Regex       ^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)
        Time_Key    time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        envoy
        Format      regex
        Regex       \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)? (?<protocol>\S+)" (?<code>[^ ]*) (?<dunno1>[^ ]*) "(?<dunno2>\S+)" "(?<dunno3>\S+)" (?<maybe_bytes_sent>[^ ]*) (?<maybe_bytes_received>[^ ]*) (?<maybe_duration>[^ ]*) (?<maybe_duration2>[^ ]*) "(?<ips>[0-9\.\-,\s]*)" "(?<agent>[^\"]*)" "(?<request_id>[^\"]*)" "(?<url>[^\"]*)" "(?<destination_ip>[^\"]*)" (?<magic>[^ ]*) (?<dunno4>[^ ]*) (?<another_ip>[^ ]*) (?<another_ip2>[^ ]*) (?<magic2>[^ ]*)
        #Time_Keep   Off
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L

Otherwise I have a working setup: nginx-ingress logs are parsed correctly.

With istio-proxy injected containers I have to do the parser matching myself (there are no annotations on these).

I found some resources, like https://blog.donbowman.ca/2018/10/03/the-rabbit-hole-of-log-parsing-istio-proxy-sidecar-log-routing-and-parsing-with-fluent-bit/

My envoy regex seems to work on all random istio-proxy log checks via https://regex101.com/ or https://rubular.com/

[2019-08-23T00:39:51.546Z] "GET /api/authenticate?login_challenge=30cd90e278ff4591a07e9bef80235a7b HTTP/1.1" 200 - "-" "-" 0 1284 25 25 "25.25.25.25" "Go-http-client/1.1" "57d14491b1441a5634f0354d6425b882" "login.example.com" "127.0.0.1:80" inbound|80|http|frontend.uat.svc.cluster.local - 10.233.120.18:80 25.25.25.25:0 outbound_.80_._.frontend.uat.svc.cluster.local
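
One way to see which part fails is to test the envoy parser regex against the sample line outside Fluent Bit. The sketch below does that in Python. It assumes that rewriting the (?<name>...) groups into Python's (?P<name>...) syntax is the only translation needed, which may not hold for every Onigmo feature. If the regex matches here, the regex itself is probably not the culprit, and the filter's Match rule and the record's tag are the next things to check.

```python
import re

# Regex copied verbatim from the [PARSER] named "envoy" in parsers.conf above.
ENVOY_REGEX = r'\[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)? (?<protocol>\S+)" (?<code>[^ ]*) (?<dunno1>[^ ]*) "(?<dunno2>\S+)" "(?<dunno3>\S+)" (?<maybe_bytes_sent>[^ ]*) (?<maybe_bytes_received>[^ ]*) (?<maybe_duration>[^ ]*) (?<maybe_duration2>[^ ]*) "(?<ips>[0-9\.\-,\s]*)" "(?<agent>[^\"]*)" "(?<request_id>[^\"]*)" "(?<url>[^\"]*)" "(?<destination_ip>[^\"]*)" (?<magic>[^ ]*) (?<dunno4>[^ ]*) (?<another_ip>[^ ]*) (?<another_ip2>[^ ]*) (?<magic2>[^ ]*)'

# Python's re module spells named groups (?P<name>...), so translate them.
# Assumption: this is the only syntax difference that matters for this regex.
py_regex = re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", ENVOY_REGEX)

# Sample istio-proxy log line from the question.
SAMPLE = '[2019-08-23T00:39:51.546Z] "GET /api/authenticate?login_challenge=30cd90e278ff4591a07e9bef80235a7b HTTP/1.1" 200 - "-" "-" 0 1284 25 25 "25.25.25.25" "Go-http-client/1.1" "57d14491b1441a5634f0354d6425b882" "login.example.com" "127.0.0.1:80" inbound|80|http|frontend.uat.svc.cluster.local - 10.233.120.18:80 25.25.25.25:0 outbound_.80_._.frontend.uat.svc.cluster.local'

match = re.search(py_regex, SAMPLE)
if match is None:
    print("regex does not match the sample line")
else:
    # Print every captured field so a wrong capture is easy to spot.
    for name, value in match.groupdict().items():
        print(f"{name:22} {value}")
```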

Unfortunately, istio-proxy logs appear in Kibana, but unparsed.

Help appreciated!

-- strzelecki.maciek
fluent
fluent-bit
istio
kubernetes
regex

2 Answers

9/24/2019

I hit the same issue when testing fluent-bit:v1.2.1 on my minikube cluster. I dug around and made some modifications to the envoy filter, but nothing worked.

The problem is the match rule kube.*.istio-proxy; the routing is not obvious. I changed the match rule to kube.istio-proxy.*, changed the tag definition in the input section to kube.<container_name>.<namespace_name>.xxxx, and adjusted the parser section accordingly.

And it worked!
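
A rough way to see why the first rule never fires is to compare the two Match rules against example tags. This sketch uses Python's fnmatch as a stand-in for Fluent Bit's wildcard matching, which is an assumption and not the actual router code. The tag values ("uat", "frontend-abc", "0123") are hypothetical placeholders.

```python
from fnmatch import fnmatchcase

# Hypothetical tags; only the field order matters here.
# Namespace-first layout: kube.<namespace>.<container>.<pod>.<id>-
namespace_first_tag = "kube.uat.istio-proxy.frontend-abc.0123-"
# Container-first layout from this answer: kube.<container>.<namespace>.xxxx
container_first_tag = "kube.istio-proxy.uat.frontend-abc.0123-"

# Assumption: '*' in a Match rule behaves like a shell-style wildcard.
results = {
    ("kube.*.istio-proxy", namespace_first_tag):
        fnmatchcase(namespace_first_tag, "kube.*.istio-proxy"),
    ("kube.istio-proxy.*", container_first_tag):
        fnmatchcase(container_first_tag, "kube.istio-proxy.*"),
}

for (pattern, tag), matched in results.items():
    print(f"{pattern:22} vs {tag:42} -> {matched}")
```

Under that assumption, kube.*.istio-proxy only matches tags that end in .istio-proxy, and a tag with the pod name and id after the container name does not.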

UPDATE: After reading the parser documentation, I added Reserve_Data to my filter:

[FILTER]
        Name                parser
        Match               kube.istio-proxy.*
        Key_Name            log
        Reserve_Data        On 
        Parser              envoy
-- Chesh
Source: StackOverflow

8/25/2019

Some breaking changes were made in tag parsing. Check that your kube.*.istio-proxy is matching. My guess is you now have a very inconvenient path in the middle.

I changed this about 18 months ago so that the tag was kube.namespace.pod.container, but it was later changed to kube..namespace. My guess is that the tag is not matching.

See https://docs.fluentbit.io/manual/installation/upgrade_notes#fluent-bit-v-1-1 ... you will now need to put the original tail input path in the tag.
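
To check that guess offline, the tag can be reconstructed from a container log file name using the Tag_Regex and Tag from the question, then compared with the filter's Match rule. This is only a sketch: the file name is hypothetical, the substitution of captures into the <placeholders> is assumed to work as the config suggests, and fnmatch is an approximation of Fluent Bit's wildcard matching.

```python
import re
from fnmatch import fnmatchcase

# Tag_Regex and Tag copied verbatim from the question's [INPUT] section.
TAG_REGEX = r"(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$"
TAG_TEMPLATE = "k8s_containers.<namespace_name>.<container_name>.<pod_name>.<docker_id>-"

# Python needs (?P<name>...) for named groups.
py_tag_regex = re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", TAG_REGEX)


def expand_tag(file_name: str) -> str:
    """Fill the <placeholders> in TAG_TEMPLATE from the regex captures."""
    m = re.search(py_tag_regex, file_name)
    if m is None:
        raise ValueError(f"Tag_Regex does not match {file_name!r}")
    return re.sub(r"<(\w+)>", lambda g: m.group(g.group(1)), TAG_TEMPLATE)


# Hypothetical file name in the <pod>_<namespace>_<container>-<id>.log layout.
docker_id = "0123456789abcdef" * 4
tag = expand_tag(f"frontend-5d8f7c9b6-abcde_uat_istio-proxy-{docker_id}.log")
print(tag)

# The question's Match rule, and a variant with a trailing wildcard.
for pattern in ("k8s_containers.*.istio-proxy", "k8s_containers.*.istio-proxy.*"):
    print(f"{pattern:32} -> {fnmatchcase(tag, pattern)}")
```

If the real tags look like the one printed here, the rule ending in .istio-proxy cannot match, because the pod name and docker id follow the container name.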

-- Don Bowman
Source: StackOverflow