I have a 3-node AKS Kubernetes cluster deployed (kubenet is the network plugin), with NGINX Ingress configured to do name-based routing to pods.
I have a number of identical applications deployed under different names on the cluster.
I can reach some applications via HTTP, but not others. On close inspection, I see that the applications I can reach are all on the same node as the ingress controller and on the same internal 172.* subnet.
All the applications are in the same namespace as the ingress controller.
The unreachable applications are all on the other two nodes, on different subnets, so this appears to be a networking configuration problem.
However, I cannot find what configuration would allow the ingress to reach all applications regardless of which node and internal subnet they're on. This, I believe, should be the default behaviour of Kubernetes.
How would I configure this desired behaviour?
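For reference, the Ingress resources are roughly of this shape (the host names, service names and ports below are illustrative, not my exact manifests):

cat <<'EOF' | kubectl apply -n ingress-nginx -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: https-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: mango.example.com          # illustrative host name
    http:
      paths:
      - backend:
          serviceName: mango         # illustrative service, one per application
          servicePort: 5679
  - host: kiwi.example.com           # illustrative host name
    http:
      paths:
      - backend:
          serviceName: kiwi
          servicePort: 5678
EOF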
Some test results:
kubectl logs https-ingress-controller-6bc79d6c69-7ljkb --namespace ingress-nginx --follow
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.23.0
Build: git-be1329b22
Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------
W0611 14:37:06.679648 6 flags.go:213] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: nginx/1.15.9
W0611 14:37:06.685012 6 client_config.go:549] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0611 14:37:06.685884 6 main.go:200] Creating API client for https://172.17.0.1:443
I0611 14:37:06.712278 6 main.go:244] Running in Kubernetes cluster version v1.14 (v1.14.0) - git (clean) commit 641856db18352033a0d96dbc99153fa3b27298e5 - platform linux/amd64
I0611 14:37:07.055688 6 nginx.go:261] Starting NGINX Ingress controller
I0611 14:37:07.066491 6 event.go:221] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"tcp-services", UID:"56d2e0c2-8c47-11e9-8911-8272a7251f4e", APIVersion:"v1", ResourceVersion:"5775", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/tcp-services
I0611 14:37:07.067855 6 event.go:221] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"nginx-configuration", UID:"56cdccf4-8c47-11e9-8911-8272a7251f4e", APIVersion:"v1", ResourceVersion:"5774", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/nginx-configuration
I0611 14:37:07.075165 6 event.go:221] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"udp-services", UID:"56d6c9e3-8c47-11e9-8911-8272a7251f4e", APIVersion:"v1", ResourceVersion:"5776", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/udp-services
I0611 14:37:08.159406 6 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"ingress-nginx", Name:"https-ingress", UID:"103260ed-8c4a-11e9-8911-8272a7251f4e", APIVersion:"extensions/v1beta1", ResourceVersion:"17054", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress ingress-nginx/https-ingress
I0611 14:37:08.160481 6 backend_ssl.go:68] Adding Secret "ingress-nginx/chachingtls" to the local store
I0611 14:37:08.256541 6 nginx.go:282] Starting NGINX process
I0611 14:37:08.256572 6 leaderelection.go:205] attempting to acquire leader lease ingress-nginx/ingress-controller-leader-nginx...
I0611 14:37:08.257345 6 controller.go:172] Configuration changes detected, backend reload required.
I0611 14:37:08.261914 6 status.go:148] new leader elected: nginx-ingress-controller-6674b5b5dc-nhjcc
I0611 14:37:08.328794 6 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"ingress-nginx", Name:"https-ingress", UID:"103260ed-8c4a-11e9-8911-8272a7251f4e", APIVersion:"extensions/v1beta1", ResourceVersion:"17059", FieldPath:""}): type: 'Normal' reason: 'UPDATE' Ingress ingress-nginx/https-ingress
I0611 14:37:08.391940 6 controller.go:190] Backend successfully reloaded.
I0611 14:37:08.392044 6 controller.go:200] Initial sync, sleeping for 1 second.
[11/Jun/2019:14:37:09 +0000] TCP 200 0 0 0.000
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
durian 1/1 Running 0 12m 172.18.0.14 aks-agentpool-82039614-0 <none> <none>
https-ingress-controller-6bc79d6c69-mg7lm 1/1 Running 0 15m 172.18.2.11 aks-agentpool-82039614-2 <none> <none>
kiwi 1/1 Running 0 12m 172.18.2.14 aks-agentpool-82039614-2 <none> <none>
mango 1/1 Running 0 13m 172.18.2.12 aks-agentpool-82039614-2 <none> <none>
mangosteen 1/1 Running 0 12m 172.18.2.13 aks-agentpool-82039614-2 <none> <none>
orange 1/1 Running 0 12m 172.18.2.15 aks-agentpool-82039614-2 <none> <none>
kubectl exec -ti https-ingress-controller-6bc79d6c69-mg7lm /bin/bash -n ingress-nginx
www-data@https-ingress-controller-6bc79d6c69-7ljkb:/etc/nginx$ curl http://172.18.1.10:5678
^C
www-data@https-ingress-controller-6bc79d6c69-7ljkb:/etc/nginx$ curl http://172.18.2.9:5679
mango
www-data@https-ingress-controller-6bc79d6c69-7ljkb:/etc/nginx$ curl http://172.18.2.5:8080
<!-- HTML for static distribution bundle build -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Swagger UI</title>
<link rel="stylesheet" type="text/css" href="./swagger-ui.css" >
<link rel="icon" type="image/png" href="./favicon-32x32.png" sizes="32x32" />
<link rel="icon" type="image/png" href="./favicon-16x16.png" sizes="16x16" />
<style>
html
www-data@https-ingress-controller-6bc79d6c69-7ljkb:/etc/nginx$ curl http://172.18.1.9:5678
^C
I've destroyed and re-deployed the cluster and applications a number of times with the exact same configuration and the behaviour is the same.
It appears that with the kubenet networking model, when using a pre-existing VNet and subnet (one not dedicated to AKS), the route table containing the UDRs for the AKS nodes is not attached to the subnet the nodes are deployed into by default, which means the pods have no way to reach each other across nodes.
The fact that UDRs need to be configured for kubenet is mentioned in the Microsoft Azure documentation; however, no instructions are provided on the actual setup of route tables and UDRs for AKS.
One either has to attach the AKS-generated route table (which already contains these routes) to the AKS subnet, or add the routes to the subnet's existing route table (if one exists).
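For example, the routes AKS generates can be inspected in its node resource group; the resource group and route table names below are illustrative (the node resource group follows the MC_<resourceGroup>_<cluster>_<region> convention):

# List the route table AKS generated in its node resource group
az network route-table list -g MC_myResourceGroup_myAKSCluster_westeurope -o table

# Show the per-node UDRs it contains (one route per node's pod CIDR)
az network route-table route list \
  -g MC_myResourceGroup_myAKSCluster_westeurope \
  --route-table-name aks-agentpool-82039614-routetable -o table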
The solution is documented here; it basically involves attaching the default route table generated by the AKS installation to the AKS subnet:
https://github.com/Azure/aks-engine/blob/master/docs/tutorials/custom-vnet.md
i.e., customise and run this script:
#!/bin/bash
# Look up the route table that the AKS deployment generated in its node resource group
rt=$(az network route-table list -g RESOURCE_GROUP_NAME_KUBE -o json | jq -r '.[].id')
# Attach that route table to the subnet the AKS nodes are deployed into
az network vnet subnet update \
  -g RESOURCE_GROUP_NAME_VNET \
  --route-table "$rt" \
  --ids "/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP_NAME_VNET/providers/Microsoft.Network/VirtualNetworks/KUBERNETES_CUSTOM_VNET/subnets/KUBERNETES_SUBNET"
I'm now able to reach pods on all nodes of the cluster via the Ingress.
NOTE: Alternatively, one could manually add the UDRs to whatever pre-existing route table was attached to the pre-created AKS subnet before the AKS deployment.
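If you go that route, the manual UDRs look roughly like this, one per node, pointing the node's pod CIDR at that node's private IP (the route table name, CIDR and IP below are illustrative):

# Route one node's pod CIDR to that node's private IP
az network route-table route create \
  -g RESOURCE_GROUP_NAME_VNET \
  --route-table-name EXISTING_ROUTE_TABLE \
  -n aks-node-0-pods \
  --address-prefix 172.18.0.0/24 \
  --next-hop-type VirtualAppliance \
  --next-hop-ip-address 10.240.0.4   # node 0's private IP (illustrative)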
With the kubenet network plugin in AKS, the pods can communicate with each other. See the description below:
With kubenet, nodes get an IP address from the Azure virtual network subnet. Pods receive an IP address from a logically different address space to the Azure virtual network subnet of the nodes. Network address translation (NAT) is then configured so that the pods can reach resources on the Azure virtual network. The source IP address of the traffic is NAT'd to the node's primary IP address.
The pods communicate with each other by going through the node with NAT, and only the nodes receive a routable IP address. You can see the routes in the portal like this:
Azure sets all of this up for you; it works well on my side. So if it does not work for you, you can check whether the routes are set up correctly.
Here is a screenshot testing communication between pods in different address spaces:
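An equivalent check from the command line (the pod name, target IP and port are purely illustrative):

# From a pod on one node, curl the IP of a pod running on a different node
kubectl exec -ti mango -n ingress-nginx -- curl http://172.18.0.14:5678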