I'm running an nsq cluster in Docker containers using the following docker-compose.yaml file:
version: '2'
services:
  nsqlookupd:
    image: nsqio/nsq
    command: /nsqlookupd
    ports:
      - "4160"
      - "4161:4161"
  nsqd:
    image: nsqio/nsq
    command: /nsqd --lookupd-tcp-address=nsqlookupd:4160 --data-path=/data
    volumes:
      - data:/data
    ports:
      - "4150:4150"
      - "4151:4151"
  nsqadmin:
    image: nsqio/nsq
    command: /nsqadmin --lookupd-http-address=nsqlookupd:4161
    ports:
      - "4171:4171"
volumes:
  data:
Everything runs fine. But if I call the /nodes endpoint on the nsqlookupd server, I get this:
$ http http://localhost:4161/nodes
HTTP/1.1 200 OK
Content-Length: 238
Content-Type: application/json; charset=utf-8
Date: Tue, 24 Jan 2017 08:44:27 GMT
{
    "data": {
        "producers": [
            {
                "broadcast_address": "7dd3d550e7f8",
                "hostname": "7dd3d550e7f8",
                "http_port": 4151,
                "remote_address": "172.18.0.4:57156",
                "tcp_port": 4150,
                "tombstones": [],
                "topics": [],
                "version": "0.3.8"
            }
        ]
    },
    "status_code": 200,
    "status_txt": "OK"
}
The broadcast address looks like the container's name/hostname. I tried to ping it on port 4151 just in case, but it fails:
> http http://7dd3d550e7f8:4151/ping
http: error: ConnectionError: HTTPConnectionPool(host='7dd3d550e7f8', port=4151): Max retries exceeded with url: /ping (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x000001C397173EF0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed',)) while doing GET request to URL: http://7dd3d550e7f8:4151/ping
Same for the remote address:
> http http://172.18.0.4:4151/ping
http: error: ConnectionError: HTTPConnectionPool(host='172.18.0.4', port=4151): Max retries exceeded with url: /ping (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x000001C0D9545F28>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',)) while doing GET request to URL: http://172.18.0.4:4151/ping
Everything works if I use localhost or 127.0.0.1:
> http http://localhost:4151/ping
HTTP/1.1 200 OK
Content-Length: 2
Content-Type: text/plain; charset=utf-8
Date: Tue, 24 Jan 2017 08:51:30 GMT
OK
But that's cheating. The whole point of the nsqlookupd servers is that they keep track of the nsqd servers so clients can dynamically get a list of responsive servers.
Is it possible to get an accessible URL/IP address for the nsqd nodes from the nsqlookupd server when the nsqd nodes are running in Docker containers?
Is there some magic incantation to make it work?
Has anyone tried using Swarm or Kubernetes?
I found that GKE supports StatefulSet as of 1.5.2. This means your nsqd and nsqlookupd can be spun up as StatefulSet instances. You can then use -broadcast-address=$POD_IP, with the pod IP taken from the Downward API, and your producers will be able to publish to nsq-0.nsq-service-name, nsq-1.nsq-service-name, etc., while consumers will get the advertised nsqd IP address from nsqlookupd. That works for us; we just got it working today.
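As a rough illustration of the setup described above, a StatefulSet along these lines might work. This is only a sketch, not our exact manifest: the `app: nsq` label, the replica count, and the `nsqlookupd:4160` address (which assumes a separate Service named `nsqlookupd`) are all assumptions, and `apps/v1` is the current API group rather than the beta one that shipped with 1.5.

```yaml
# Hypothetical sketch: labels, replica count and the lookupd address are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: nsq-service-name
spec:
  clusterIP: None          # headless, so pods resolve as nsq-0.nsq-service-name, ...
  selector:
    app: nsq
  ports:
    - name: tcp
      port: 4150
    - name: http
      port: 4151
---
apiVersion: apps/v1        # StatefulSet was a beta API in Kubernetes 1.5
kind: StatefulSet
metadata:
  name: nsq                # pods are named nsq-0, nsq-1, ...
spec:
  serviceName: nsq-service-name
  replicas: 2
  selector:
    matchLabels:
      app: nsq
  template:
    metadata:
      labels:
        app: nsq
    spec:
      containers:
        - name: nsqd
          image: nsqio/nsq
          env:
            # Downward API: expose the pod's own IP as an environment variable
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          command:
            - /nsqd
            - --broadcast-address=$(POD_IP)   # advertised to nsqlookupd instead of the hostname
            - --lookupd-tcp-address=nsqlookupd:4160
          ports:
            - containerPort: 4150
            - containerPort: 4151
```

The key part is the `POD_IP` variable: nsqd otherwise registers with its hostname as the broadcast address, which is what shows up as `7dd3d550e7f8` in the /nodes output in the question. Note that a pod IP is normally only reachable from inside the cluster, so this helps clients running in the cluster, not ones on the host machine.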