I've been struggling with this all weekend, and I'm now on my knees hoping one of you geniuses can solve my problem.
In short: I have an ingress-nginx controller (image: nginx/nginx-ingress:1.5.8) with which I'm trying to set up self-signed mutual (client certificate) authentication.
The HTTPS part works just fine, but the problem I'm having (I think) is that the ingress controller serves the request with the default cert and validates the client against the default CA, because it can't find my CA.
So.. Help!
Steps I've gone through on this cluster-f*** of a journey (pun intended):
I've tested it in a local Minikube cluster and it all works like a charm. When I exec -it into the ingress controller pod and cat the nginx.conf for both my clusters (Minikube and Azure) I found large differences; so it turns out I've been comparing apples and pears in terms of the Minikube vs. Azure k8s nginx ingresses.
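(For reference, the comparison was just something like this on both clusters; the pod name and namespace are placeholders for whatever your controller runs as:

kubectl exec -it nginx-ingress-controller-xxxxx -n <controller-namespace> -- cat /etc/nginx/nginx.conf
)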
This is the ingress setup that worked like a charm for my Minikube cluster (the ingress I'm using is more or less a duplicate of the file you'll find in the link): https://kubernetes.github.io/ingress-nginx/examples/auth/client-certs/
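For context, the ingress from that example (and mine on Minikube) looks roughly like this; the names (ca-secret, tls-secret, http-svc, mydomain.com) are placeholders from the linked example, and the apiVersion/field names may differ slightly depending on your cluster version:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-test
  namespace: default
  annotations:
    # Enable client certificate authentication
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    # Secret containing the trusted CA certificate(s)
    nginx.ingress.kubernetes.io/auth-tls-secret: "default/ca-secret"
    # Verification depth in the client certificate chain
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "1"
    # Pass the client certificate on to the upstream service
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
spec:
  tls:
  - hosts:
    - mydomain.com
    secretName: tls-secret
  rules:
  - host: mydomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: http-svc
            port:
              number: 80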
In addition I found this, which goes a long way towards describing the problem I'm having: https://success.docker.com/article/how-to-configure-a-default-tls-certificate-for-the-kubernetes-nginx-ingress-controller According to that link the solution is simple: nuke the ingress from orbit and create a new one. Well, here's the thing: this is a production cluster, and my bosses would be anything but pleased if I did that.
Another discovery I made whilst "exec -it bash"-roaming around inside the Azure ingress controller is that there is no public root cert folder (/etc/ssl/) to be found. I don't know why, but thought I'd mention it.
I've also discovered the parameter --default-ssl-certificate=default/foo-tls, but this is only a default. As there will be other client-auth needs later, I have to be able to specify a different CA cert per ingress.
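For what it's worth, on the community kubernetes/ingress-nginx controller (the one the stable/nginx-ingress chart further down installs) that flag is just a controller argument, and with the chart it can be set through controller.extraArgs; something along these lines, where the release name, namespace and secret name are whatever you use:

helm upgrade nginx stable/nginx-ingress --namespace ingress-basic \
  --reuse-values \
  --set controller.extraArgs.default-ssl-certificate="default/foo-tls"

The per-ingress CA requirement is what the auth-tls-secret annotation above is for, so this flag only ever covers the fallback/catch-all server.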
I'll paste the nginx.conf that I think is the problem below. Hoping to hear back from some of you, because at this point I'm thoroughly lost. Hit me up if additional information is needed.
user nginx;
worker_processes auto;

daemon off;

error_log /var/log/nginx/error.log notice;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    #tcp_nopush on;

    keepalive_timeout 65s;
    keepalive_requests 100;

    #gzip on;

    server_names_hash_max_size 512;

    variables_hash_bucket_size 256;
    variables_hash_max_size 1024;

    map $http_upgrade $connection_upgrade {
        default upgrade;
        '' close;
    }

    server {
        listen 80 default_server;
        listen 443 ssl default_server;

        ssl_certificate /etc/nginx/secrets/default;
        ssl_certificate_key /etc/nginx/secrets/default;

        server_name _;
        server_tokens "on";
        access_log off;

        location / {
            return 404;
        }
    }

    # stub_status
    server {
        listen 8080;

        allow 127.0.0.1;
        deny all;

        location /stub_status {
            stub_status;
        }
    }

    server {
        listen unix:/var/run/nginx-status.sock;
        access_log off;

        location /stub_status {
            stub_status;
        }
    }

    include /etc/nginx/config-version.conf;
    include /etc/nginx/conf.d/*.conf;

    server {
        listen unix:/var/run/nginx-502-server.sock;
        access_log off;

        location / {
            return 502;
        }
    }
}

stream {
    log_format stream-main '$remote_addr [$time_local] '
                           '$protocol $status $bytes_sent $bytes_received '
                           '$session_time';

    access_log /var/log/nginx/stream-access.log stream-main;
So the problem came down to the ingress controller being old and outdated. I didn't have the original Helm chart it was deployed with, so I was naturally worried about rollback options. Anyhoo -> took a leap of faith in the middle of the night local time and nuked the namespace, recreated the namespace, and did a helm install stable/nginx-ingress.
There was minimal downtime (1 min at most), but be sure to lock down the public IP that's attached to the load balancer before going all World War 3 on your services.
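In case it helps the next person: before deleting anything, note which Azure public IP resource the old ingress service is holding, so you can hand the same address to the new controller (the address below is a placeholder):

# the EXTERNAL-IP of the old LoadBalancer service is the address you need to keep
kubectl get service --all-namespaces | grep LoadBalancer

# find the matching public IP resource; it normally lives in the AKS node resource group (MC_...)
# and should be static (or at least written down) before the service is deleted
az network public-ip list --query "[?ipAddress=='1.2.3.4'].[name,resourceGroup]" -o table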
I had to add an argument to the standard Azure helm install command to explicitly set the public IP for the resource; pasting it below in case any poor soul should find himself in the unfortunate situation of a fresh helm CLI and lost charts.
That's it; keep your services up to date and make sure to save your charts!
helm install nginx stable/nginx-ingress --namespace ingress-basic \
  --set controller.replicaCount=2 \
  --set controller.nodeSelector."beta\.kubernetes\.io/os"=linux \
  --set defaultBackend.nodeSelector."beta\.kubernetes\.io/os"=linux \
  --set controller.service.loadBalancerIP=*YourVeryPreciousPublicIP*
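Afterwards you can sanity-check that the new controller actually picked up the old address (EXTERNAL-IP should show it once Azure has wired up the load balancer); the label below is the one the stable/nginx-ingress chart applies:

kubectl get service -l app=nginx-ingress --namespace ingress-basic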