I want to deploy kafka on kubernetes.
Because I will be streaming at high bandwidth from the internet to kafka, I want to use a hostPort and advertise the host's "dnsName:hostPort" to zookeeper, so that all traffic goes directly to the kafka broker (as opposed to using a nodePort and a load balancer, where traffic hits some random node that redirects it, creating unnecessary traffic).
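(For reference, a minimal sketch of that hostPort wiring on the broker container, with 9092 as an assumed broker port:)

```yaml
ports:
- containerPort: 9092
  hostPort: 9092   # bound directly on the node; clients hit dnsName:9092
```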
I have set up my kubernetes cluster on amazon. With `kubectl describe node ${nodeId}` I get the InternalIP, ExternalIP, and the internal and external DNS names of the node.
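For scripting, the ExternalDNS address can also be read without parsing the human-readable describe output; a sketch using jsonpath against the node's status (same `${nodeId}` as above):

```sh
kubectl get node ${nodeId} \
  -o jsonpath='{.status.addresses[?(@.type=="ExternalDNS")].address}'
```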
I want to pass the ExternalDNS name to the kafka broker so that it can use it as its advertised host.
How can I pass that information to the container? Ideally I could do this from the deployment yaml, but I'm also open to other solutions.
Matthew L Daniels' answer describes the valid approach of querying the kubernetes API using the node name, which is obtained via an env var. The difficulty lies in giving the pod the proper RBAC access and setting up an init container.
Here is the kubernetes yml that implements this with an init container using the python kubernetes client:
```yaml
### This ServiceAccount gives the kafka sidecar permission to query the
### kubernetes API for node information, so that it can find out the
### advertised host (the node's public DNS name) for kafka, which uses a
### hostPort to be as efficient as possible.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-reader-service-account
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader-cluster-role
rules:
- apiGroups: [""] # "" indicates the core API group.
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes-rolebinding
subjects:
- kind: ServiceAccount # May be "User", "Group" or "ServiceAccount"
  name: node-reader-service-account
  namespace: default
roleRef:
  kind: ClusterRole
  name: node-reader-cluster-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1 # extensions/v1beta1 is deprecated for Deployments
kind: Deployment
metadata:
  labels:
    io.kompose.service: kafka
  name: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: kafka
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        io.kompose.service: kafka
    spec:
      serviceAccountName: node-reader-service-account
      containers:
      - name: kafka
        image: someImage
        resources: {}
        command: ["/bin/sh"]
        args: ["-c", "export KAFKA_ADVERTISED_LISTENERS=$(cat '/etc/sidecar-data/dnsName') && env | grep KAFKA_ADVERTISED_LISTENERS && /start-kafka.sh"]
        volumeMounts:
        - name: sidecar-data
          mountPath: /etc/sidecar-data/
      initContainers:
      - name: kafka-sidecar
        image: sidecarImage
        command: ["python"]
        args: ["/script/getHostDnsName.py", "$(KUBE_NODE_NAME)", "/etc/sidecar-data/dnsName"]
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: sidecar-data
          mountPath: /etc/sidecar-data/
      volumes:
      - name: sidecar-data
        emptyDir: {}
      restartPolicy: Always
```
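The `/script/getHostDnsName.py` referenced above is not shown; a minimal sketch of what it could look like with the official python kubernetes client (the `PLAINTEXT://...:9092` listener format written to the file is an assumption about the kafka image):

```python
# Sketch of the init-container script (assumed, not the original).
# Usage: python getHostDnsName.py <nodeName> <outputFile>
import sys

from kubernetes import client, config


def main(node_name, out_path):
    # Authenticate with the ServiceAccount token mounted into the pod.
    config.load_incluster_config()
    v1 = client.CoreV1Api()
    node = v1.read_node(name=node_name)
    # Pick the ExternalDNS address reported by the cloud provider.
    external_dns = next(
        a.address for a in node.status.addresses if a.type == "ExternalDNS"
    )
    # 9092 is the assumed hostPort; the kafka container exports this file's
    # content as KAFKA_ADVERTISED_LISTENERS verbatim.
    with open(out_path, "w") as f:
        f.write("PLAINTEXT://%s:9092" % external_dns)


if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```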
> How can I pass that information to the container? Ideally I could do this from the deployment yaml, but I'm also open to other solutions.
The first thing I would try is `env:` with `valueFrom: fieldRef:` and see if it will let you reach into the Pod's `spec:` field to grab the `nodeName`. I deeply appreciate that isn't the `ExternalDnsName` you asked about, but if `fieldRef` works, it could be a lot less typing and thus could be a good tradeoff.
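As a concrete sketch, that Downward API wiring (which the edit above now uses) looks like:

```yaml
env:
- name: KUBE_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```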
But, with "I'm also open to other solutions" in mind: don't forget that -- unless instructed otherwise -- each Pod is able to interact with the kubernetes API, and with the correct RBAC permissions it can request the very information you're seeking. You can do that either as a `command:` override, to do setup work before launching the kafka broker (as sketched below), or you can do that work in an init container, write the external address into a shared bit of filesystem (with `volumes: emptyDir: {}` or similar), and then use any glue code to slurp that value into your kafka broker.
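A sketch of that `command:` variant, assuming the RBAC from the edit above plus `KUBE_NODE_NAME` injected via the Downward API; the token and CA paths are the standard in-pod ServiceAccount mounts:

```sh
# Fetch the Node object straight from the API server before starting kafka.
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sS --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  -H "Authorization: Bearer $TOKEN" \
  "https://kubernetes.default.svc/api/v1/nodes/${KUBE_NODE_NAME}"
# ...then extract the address of type ExternalDNS from .status.addresses
# (e.g. with jq) and export it as KAFKA_ADVERTISED_LISTENERS.
```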
I am 100% certain that the `env:` plus `valueFrom: fieldRef:` construct that I mentioned earlier can acquire the `metadata.name` and `metadata.namespace` of the Pod, at which point the Pod can ask the kubernetes API for its own PodSpec, extract the `nodeName` from the aforementioned `spec:` field, then ask the kubernetes API for the Node info, and voilà, you have all the information kubernetes knows about that Node.