How to define name for s3bucket for different environment in Kafka Sink

10/29/2021

I am currently setting up my AWS S3 buckets for different environments so I can keep data separated across dev, tqa, stg, and prd. The bucket in dev is named s3.dev.kafka.sink, while in tqa it is s3.tqa.kafka.sink, each associated with its own environment. The Kafka Connect documentation doesn't explain how to set this per environment, so I tried the following, but I keep getting errors that the bucket name is not valid.

I put it in the secret yaml file

apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
   name: kafka-sink-s3-secret
   namespace: namespace
spec:
   backendType: secretManager
   data:
       - key: s3.tqa.kafka.sink
         name: bucket_name
         property: bucket_name

While in deployment file

env:
   - name: bucket_name
     valueFrom:
         secretKeyRef:
             name: kafka-sink-s3-secret
             key: bucket_name

And I reference the bucket name in the connector config: "s3.bucket.name":"'"$bucket_name"'"

But it fails to deploy. Any idea how I can specify the bucket as s3.{{ENV}}.kafka.sink so each environment picks up its own bucket in AWS?

-- engicode123
amazon-s3
apache-kafka
apache-kafka-connect
kubernetes

1 Answer

10/29/2021

Out of the box, Kafka Connect has no way to read environment variables other than those the AWS SDK itself picks up (the credential keys and profile, at least)

Sounds like you will need to use a ConfigProvider, part of the Kafka Connect API

Here's one example on GitHub, which you'd need to compile and load into your Docker images - https://github.com/giogt/kafka-env-config-provider

Inside the connector properties, use it like this

"bucket.name": "${env:ENVIRONMENT_VARIABLE_NAME}"

You should also be able to use Helm to template out the full bucket name per environment within the secret/deployment resource definitions
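As a sketch of that Helm approach - the values file and an `environment` value are hypothetical here, reusing the ExternalSecret from the question:

```yaml
# values-tqa.yaml (hypothetical; one such file per environment)
environment: tqa

# templates/external-secret.yaml (excerpt)
apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: kafka-sink-s3-secret
  namespace: namespace
spec:
  backendType: secretManager
  data:
    - key: s3.{{ .Values.environment }}.kafka.sink
      name: bucket_name
      property: bucket_name
```

Each environment then deploys with its own values file, e.g. `helm upgrade --install kafka-sink . -f values-tqa.yaml`.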

-- OneCricketeer
Source: StackOverflow