Implementing default Stackdriver behavior in GKE

1/2/2019

I am setting up a GKE cluster for an application that emits structured JSON logs, which work very well with Kibana. However, I want to use Stackdriver instead.

With the default cluster configuration, I see that the application's logs are available in Stackdriver and appear as jsonPayload. However, I want more flexibility and configuration, and when I deploy my own logging agent following this guide, all of the logs for the same application appear only as textPayload. Ultimately, I want my logs to continue to show up as jsonPayload while I use my own fluentd agent configuration to take advantage of the label_map.
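
For context, the kind of output configuration I am aiming for is roughly the sketch below, assuming the fluent-plugin-google-cloud output plugin in fluentd v0.12 syntax; the field and label names are only placeholders, not my actual config:

<match reform.**>
  type google_cloud
  # Illustrative only: promote selected record fields to Stackdriver labels
  label_map { "app": "app_name", "thread": "thread_name" }
</match>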

I followed the guide on removing the default logging service and deploying the fluentd agent to an existing cluster running the GKE versions below.
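
For reference, disabling the built-in logging service on the existing cluster was done with something along the lines of the command below (cluster name and zone match the describe output further down):

gcloud container clusters update test-cluster-1 --zone us-central1-a --logging-service none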

Gcloud version info:

Google Cloud SDK 228.0.0
bq 2.0.39
core 2018.12.07
gsutil 4.34

kubectl version info:

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.9-gke.5", GitCommit:"d776b4deeb3655fa4b8f4e8e7e4651d00c5f4a98", GitTreeState:"clean", BuildDate:"2018-11-08T20:33:00Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}

gcloud container clusters describe snippet:

addonsConfig:
  httpLoadBalancing: {}
  kubernetesDashboard:
    disabled: true
  networkPolicyConfig:
    disabled: true
createTime: '2018-12-24T19:31:21+00:00'
currentMasterVersion: 1.10.9-gke.5
currentNodeCount: 3
currentNodeVersion: 1.10.9-gke.5
initialClusterVersion: 1.10.9-gke.5
ipAllocationPolicy: {}
legacyAbac: {}
location: us-central1-a
locations:
- us-central1-a
loggingService: none
masterAuth:
  username: admin
masterAuthorizedNetworksConfig: {}
monitoringService: monitoring.googleapis.com
name: test-cluster-1
network: default
networkConfig:
  network: projects/test/global/networks/default
  subnetwork: projects/test/regions/us-central1/subnetworks/default
networkPolicy: {}
nodeConfig:
  diskSizeGb: 100
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  serviceAccount: default
nodeIpv4CidrSize: 24
nodePools:
- autoscaling: {}
  config:
    diskSizeGb: 100
    diskType: pd-standard
    imageType: COS
    machineType: n1-standard-1
    serviceAccount: default
  initialNodeCount: 3
  management:
    autoRepair: true
    autoUpgrade: true
  name: default-pool
  status: RUNNING
  version: 1.10.9-gke.5
status: RUNNING
subnetwork: default
zone: us-central1-a

Below is what is included in my ConfigMap for the fluentd DaemonSet:

<source>
  type tail
  format none
  time_key time
  path /var/log/containers/*.log
  pos_file /var/log/gcp-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%N%Z
  tag reform.*
  read_from_head true
</source>
<filter reform.**>
  type parser
  format json
  reserve_data true
  suppress_parse_error_log true
  key_name log
</filter>

Here is an example JSON log line from my application:

{"log":"org.test.interceptor","lvl":"INFO","thread":"main","msg":"Inbound Message\n----------------------------\nID: 44\nResponse-Code: 401\nEncoding: UTF-8\nContent-Type: application/json;charset=UTF-8\nHeaders: {Date=[Mon, 31 Dec 2018 14:43:47 GMT], }\nPayload: {\"errorType\":\"AnException\",\"details\":[\"invalid credentials\"],\"message\":\"credentials are invalid\"}\n--------------------------------------","@timestamp":"2018-12-31T14:43:47.805+00:00","app":"the-app"}
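
As far as I understand, the container runtime wraps each of these lines in Docker's JSON log format before fluentd tails it, so the entry in /var/log/containers/*.log looks roughly like this (abbreviated):

{"log":"{\"log\":\"org.test.interceptor\",\"lvl\":\"INFO\", ... ,\"app\":\"the-app\"}\n","stream":"stdout","time":"2018-12-31T14:43:47.805Z"}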

The result with the above configuration is below:

{
insertId:  "3vycfdg1drp34o"  
labels: {
 compute.googleapis.com/resource_name:  "fluentd-gcp-v2.0-nds8d"   
 container.googleapis.com/namespace_name:  "default"   
 container.googleapis.com/pod_name:  "the-app-68fb6c5c8-mq5b5"   
 container.googleapis.com/stream:  "stdout"   
}
logName:  "projects/test/logs/the-app"  
receiveTimestamp:  "2018-12-28T20:14:04.297451043Z"  
resource: {
 labels: {
  cluster_name:  "test-cluster-1"    
  container_name:  "the-app"    
  instance_id:  "234768123"    
  namespace_id:  "default"    
  pod_id:  "the-app-68fb6c5c8-mq5b5"    
  project_id:  "test"    
  zone:  "us-central1-a"    
 }
 type:  "container"   
}
severity:  "INFO"  
textPayload:  "org.test.interceptor"  
timestamp:  "2018-12-28T20:14:03Z"  
}

I have even tried wrapping the whole JSON map into a single field, since it appears that only the "log" field is being parsed. I considered explicitly writing a parser, but that seemed infeasible: the log entry is already in JSON format, the fields change from call to call, and having to anticipate which fields to parse would not be ideal.

I expected all of the fields in my log to appear under jsonPayload in the Stackdriver log entry. Ultimately, I want to mimic what happens with the default Stackdriver logging service on a cluster, where our logs at least appeared as jsonPayload.

-- akilah2010
fluentd
google-cloud-stackdriver
google-kubernetes-engine
stackdriver

1 Answer

1/8/2019

I suspect the type tail source with format none in your ConfigMap for the fluentd DaemonSet is not helping. Can you try setting the format to json or multiline, and update?

type tail
format none
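
For example, a minimal sketch of your source block with the format switched to json, keeping the rest of your settings as they are (untested):

<source>
  type tail
  format json
  time_key time
  path /var/log/containers/*.log
  pos_file /var/log/gcp-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%N%Z
  tag reform.*
  read_from_head true
</source>

With the container runtime's outer JSON parsed at the source, your existing filter parser on key_name log should then be able to expand the application's inner JSON, so the fields end up in jsonPayload instead of textPayload.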

-- Asif Tanwir
Source: StackOverflow