We are running Istio 1.1.3 on GKE cluster nodes at version 1.12.5-gke.10.
We use cert-manager to manage our Let's Encrypt certificates:
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: certs.ourdomain.nl
  namespace: istio-system
spec:
  secretName: certs.ourdomain.nl
  renewBefore: 360h # 15d
  commonName: operations.ourdomain.nl
  dnsNames:
  - operations.ourdomain.nl
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
  acme:
    config:
    - http01:
        ingressClass: istio
      domains:
      - operations.ourdomain.nl
Next, we see the ACME solver backend being deployed, together with its service (NodePort) and ingress. The auto-generated ingress looks like this:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: istio
  generateName: cm-acme-http-solver-
  generation: 1
  labels:
    certmanager.k8s.io/acme-http-domain: "1734084804"
    certmanager.k8s.io/acme-http-token: "1476005735"
  name: cm-acme-http-solver-69vzw
  namespace: istio-system
  ownerReferences:
  - apiVersion: certmanager.k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Certificate
    name: certs.ourdomain.nl
    uid: 751011d2-4fc8-11e9-b20e-42010aa40101
spec:
  rules:
  - host: operations.ourdomain.nl
    http:
      paths:
      - backend:
          serviceName: cm-acme-http-solver-fzk8q
          servicePort: 8089
        path: /.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck
status:
  loadBalancer: {}
However, when we try to access the URL operations.ourdomain.nl/.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck, we get a 404.
We do have a load balancer (Gateway) for Istio:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  labels:
    app: istio-ingress
    chart: gateways-1.1.0
    heritage: Tiller
    istio: ingress
    release: istio
  name: istio-ingress
  namespace: istio-system
spec:
  selector:
    app: istio-ingress
  servers:
  - hosts:
    - operations.ourdomain.nl
    #port:
    #  name: http
    #  number: 80
    #  protocol: HTTP
    #tls:
    #  httpsRedirect: true
  - hosts:
    - operations.ourdomain.nl
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      credentialName: certs.ourdomain.nl
      mode: SIMPLE
      privateKey: sds
      serverCertificate: sds
This interesting article gives good insight into how the acme-challenge is supposed to work. For testing purposes, we have removed port 80 and the redirect to HTTPS from our custom gateway (commented out above), and we have added the auto-generated k8s gateway, which listens only on port 80.
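A port-80 gateway of the kind described above would look roughly like the following. This is a minimal sketch, not a dump of the actual resource: the name `istio-autogenerated-k8s-ingress` and the wildcard host are assumptions, and the selector is simply copied from our custom gateway so that it targets the same ingress pods.

```yaml
# Sketch only: name and wildcard host are assumptions, not taken from the cluster.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-autogenerated-k8s-ingress   # assumed name
  namespace: istio-system
spec:
  selector:
    app: istio-ingress                    # same selector as our custom gateway
  servers:
  - hosts:
    - "*"                                 # assumed: accept any host on plain HTTP
    port:
      name: http
      number: 80
      protocol: HTTP
```

If the selector does not match the labels on the ingress pods, the gateway never binds to them and requests on port 80 would not be routed at all.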
Istio is supposed to create a virtualservice for the acme-challenge. This seems to be happening, because now, when we request the acme-challenge URL, we get a 503: upstream connect error or disconnect/reset before headers. I believe this means the request reaches the gateway and is matched by a virtualservice, but there is no service or healthy pod to route the traffic to.
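Written out explicitly, the route we expect Istio to derive from the solver ingress would be roughly equivalent to the VirtualService below. This is a sketch under assumptions: Istio builds this routing internally from the Ingress, so no such object needs to exist in the cluster, and both the resource name and the gateway name are hypothetical. The host, path, service and port are taken from the ingress shown earlier.

```yaml
# Sketch only: an explicit equivalent of the route derived from the solver Ingress.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: acme-challenge-sketch             # hypothetical name
  namespace: istio-system
spec:
  hosts:
  - operations.ourdomain.nl
  gateways:
  - istio-autogenerated-k8s-ingress       # assumed gateway name
  http:
  - match:
    - uri:
        exact: /.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck
    route:
    - destination:
        # Service and port from the auto-generated ingress
        host: cm-acme-http-solver-fzk8q.istio-system.svc.cluster.local
        port:
          number: 8089
```

A 404 would suggest that no such match exists on the gateway, while a 503 suggests the match exists but the connection to the destination fails.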
We do see some possibly interesting logging in the istio-pilot:
"ProxyStatus": {"endpoint_no_pod": {"cm-acme-http-solver-l5j2g.istio-system.svc.cluster.local": {"message": "10.16.57.248"}
I have double-checked, and the service mentioned above does have a pod that it exposes. So I am not sure whether this line is relevant to this issue.
The acme-challenge pods do not have an Istio sidecar. Could this be the issue? If so, why does it apparently work for others?
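If the missing sidecar is the cause, one plausible mechanism is mutual TLS: the gateway would try to open an mTLS connection to a pod that cannot terminate it, which would fit the 503. Whether that applies depends on mesh-wide mTLS being enabled, which is an assumption here. Under that assumption, a DestinationRule like the sketch below would tell the gateway to use plain HTTP towards the solver service. The resource name is hypothetical, and the host is the solver service from the ingress above; solver services get a new random suffix for each challenge, so a fixed host would only cover this one attempt.

```yaml
# Sketch only: assumes mesh-wide mTLS is on and the solver pod has no sidecar.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: acme-solver-no-mtls               # hypothetical name
  namespace: istio-system
spec:
  host: cm-acme-http-solver-fzk8q.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE                       # plain HTTP to the sidecar-less solver pod
```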