I cannot access Kafka brokers externally (from a public IP address).
I am using https://github.com/Yolean/kubernetes-kafka
It has a very good guide, but I believe its built-in method of exposing ports publicly does not work in my case, since I am running this cluster in a private/public VPC on AWS.
I believe that outside-access method simply exposes host ports on the nodes' private subnet addresses (is this correct?).
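This is roughly how I have been checking that, assuming the outside-* services from the repo are still NodePorts in the kafka namespace:

# The outside-* services created by the repo's outside-access manifests
kubectl get svc -n kafka | grep outside
# Node addresses: on my cluster only INTERNAL-IP (private subnet) is set
# and EXTERNAL-IP stays empty, so the NodePorts are only reachable privately
kubectl get nodes -o wide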
I know I can set up a load balancer per broker and alias a domain to each one, but then I'm incurring extra cost for every load balancer.
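For reference, the per-broker service I have in mind is roughly the following sketch; the outside-lb-0 name, the statefulset.kubernetes.io/pod-name selector and the 9094 port are my own assumptions, not something the repo ships:

apiVersion: v1
kind: Service
metadata:
  name: outside-lb-0
  namespace: kafka
spec:
  type: LoadBalancer
  # Pin the service to a single pod of the kafka StatefulSet,
  # so each load balancer fronts exactly one broker
  selector:
    statefulset.kubernetes.io/pod-name: kafka-0
  ports:
  - port: 9094        # port the aliased domain would use
    targetPort: 9094  # assumed external listener port on the broker

The domain aliased to that load balancer would then be what broker 0 advertises to external clients.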
I have been looking at Ingress resources and have successfully set up an nginx ingress controller that routes to different services based on the URL path under the host domain.
However, with nginx I receive a 503 Service Temporarily Unavailable when I curl the broker URLs (the echoserver URLs succeed). So I quickly realised that HTTP requests don't make sense here, at least not to the brokers.
I'm now stuck on learning nginx and finding a successful way of proxying the requests.
Is there a specific proxy protocol I should use?
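From what I have read, the Kafka protocol is plain TCP, so the only way I can see ingress-nginx handling it is raw TCP passthrough via its tcp-services ConfigMap (with the controller started with --tcp-services-configmap and the extra ports added to its Service), not via an Ingress resource with paths. A sketch of what I think that ConfigMap would look like; the external ports 9093-9095 are ones I picked, and the outside-* service ports are taken from my ingress below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # <port exposed on the controller>: <namespace>/<service>:<service port>
  "9093": "kafka/outside-0:31100"
  "9094": "kafka/outside-1:31101"
  "9095": "kafka/outside-2:31102"

Even then it looks like I still need one port per broker, because a consumer has to reach the specific broker that leads a partition, not just any of them.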
This could also be down to incorrect server.properties config.
When using nginx, I have the Ingress resource point at the outside-${BROKER_ID} services (I changed the first one to a ClusterIP service; the others stayed as NodePort). To me this is external DNS mapping to internal IPs, so I would think the default listeners setting in the Kafka server.properties is OK here? Or should the listener become the domain aliased to the load balancer? I had tried the domain with the URL path as an advertised listener, but that didn't make any sense to me and resulted in crash loops!
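In case it helps, this is the shape of per-broker config I now think is needed (a sketch for broker 0 only; the INTERNAL/OUTSIDE listener names, the 9094 port and the exact DNS names are my assumptions, not what the repo generates):

# server.properties for broker 0 (sketch)
listeners=INTERNAL://:9092,OUTSIDE://:9094
advertised.listeners=INTERNAL://kafka-0.broker.kafka.svc.cluster.local:9092,OUTSIDE://broker0.my-domain.com:9094
listener.security.protocol.map=INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
inter.broker.listener.name=INTERNAL

As I understand it, advertised.listeners can only be host:port pairs that clients can reach directly; there is nowhere for a URL path to go, which is presumably why my path-based attempt crash-looped.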
For anyone wanting to look at configs, I'm currently running a mostly default version (5 pzoos, no ezoos; the ezoos were always stuck as Pending) of:
https://github.com/Yolean/kubernetes-kafka
This can be set up very quickly on an existing cluster (for AWS):
git clone https://github.com/Yolean/kubernetes-kafka
cd kubernetes-kafka
(AWS) rm configure/!(aws*)
kubectl apply -f configure
kubectl apply -f 00-namespace*
kubectl apply -f rbac*
kubectl apply -f zookeeper
kubectl apply -f kafka
kubectl config set-context $(kubectl config current-context) --namespace=kafka
I am running this version of the nginx ingress controller:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/mandatory.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/provider/aws/service-l4.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/provider/aws/patch-configmap-l4.yaml
The echoserver came from:
https://github.com/kubernetes/kops/tree/master/addons/ingress-nginx
Specific lines used:
kubectl run echoheaders --image=k8s.gcr.io/echoserver:1.4 --replicas=1 --port=8080
kubectl expose deployment echoheaders --port=80 --target-port=8080 --name=echoheaders-x
kubectl expose deployment echoheaders --port=80 --target-port=8080 --name=echoheaders-y
Here is my ingress resource for nginx:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echomap
  # annotations:
  #   nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: brokers.my-domain.com
    http:
      paths:
      - path: /broker0
        backend:
          serviceName: outside-0
          servicePort: 31100
      - path: /broker1
        backend:
          serviceName: outside-1
          servicePort: 31101
      - path: /broker2
        backend:
          serviceName: outside-2
          servicePort: 31102
      - path: /bar
        backend:
          serviceName: echoheaders-y
          servicePort: 80
      - path: /foo
        backend:
          serviceName: echoheaders-x
          servicePort: 80
EDIT: I focused on getting external access through load balancers, and somewhat succeeded. Problems can be found here https://serverfault.com/questions/949367/can-connect-to-kafka-but-cannot-consume
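What I would check for that (hedging, since I'm still working through it) is which addresses the brokers actually advertise in their metadata, because that is what consumers connect to after the bootstrap connection. kafkacat can show it; the bootstrap address and port below are just my assumed alias:

# List cluster metadata through the externally reachable bootstrap address;
# the broker host:port pairs printed here are what the consumer connects to next
kafkacat -b brokers.my-domain.com:9094 -L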
Pretty sure nginx isn't going to work as an ingress here? I can't figure out how an HTTP request would become a plain TCP connection to a broker.
Moving on to internal Kafka Streams apps for now. I will come back to this when having a separate cluster for Kafka Streams becomes more necessary.