Istio: Ingress for ACME-challenge not working (503)

4/16/2019

We are running Istio 1.1.3 on GKE cluster nodes at version 1.12.5-gke.10.

We use cert-manager to manage our Let's Encrypt certificates.

apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: certs.ourdomain.nl
  namespace: istio-system
spec:
  secretName: certs.ourdomain.nl
  renewBefore: 360h # 15d
  commonName: operations.ourdomain.nl
  dnsNames:
    - operations.ourdomain.nl
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
  acme:
    config:
      - http01:
          ingressClass: istio
        domains:
        - operations.ourdomain.nl

Next, we see the ACME solver backend deployed, together with its service (NodePort) and ingress. The auto-generated ingress looks like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: istio
  generateName: cm-acme-http-solver-
  generation: 1
  labels:
    certmanager.k8s.io/acme-http-domain: "1734084804"
    certmanager.k8s.io/acme-http-token: "1476005735"
  name: cm-acme-http-solver-69vzw
  namespace: istio-system
  ownerReferences:
  - apiVersion: certmanager.k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Certificate
    name: certs.ourdomain.nl
    uid: 751011d2-4fc8-11e9-b20e-42010aa40101
spec:
  rules:
  - host: operations.ourdomain.nl
    http:
      paths:
      - backend:
          serviceName: cm-acme-http-solver-fzk8q
          servicePort: 8089
        path: /.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck
status:
  loadBalancer: {}

However, when we try to access the URL operations.ourdomain.nl/.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck we get a 404.

We do have a loadbalancer for istio:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  labels:
    app: istio-ingress
    chart: gateways-1.1.0
    heritage: Tiller
    istio: ingress
    release: istio
  name: istio-ingress
  namespace: istio-system
spec:
  selector:
    app: istio-ingress
  servers:
  - hosts:
    - operations.ourdomain.nl
    #port:
    #  name: http
    #  number: 80
    #  protocol: HTTP
    #tls:
    #  httpsRedirect: true
  - hosts:
    - operations.ourdomain.nl
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      credentialName: certs.ourdomain.nl
      mode: SIMPLE
      privateKey: sds
      serverCertificate: sds

This interesting article gives good insight into how the ACME challenge is supposed to work. For testing purposes we have removed port 80 and the redirect to HTTPS from our custom gateway, and we have added the auto-generated k8s gateway, which listens only on port 80.

Istio is supposed to create a VirtualService for the ACME challenge. This seems to be happening, because now, when we request the ACME challenge URL, we get a 503: upstream connect error or disconnect/reset before headers. I believe this means the request reaches the gateway and is matched by a VirtualService, but there is no service or healthy pod to route the traffic to.
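To make that expectation concrete, the generated Ingress should translate into routing roughly equivalent to the VirtualService below. This is only a sketch of what we expect, not something dumped from the cluster: the resource name is hypothetical, the gateway name is an assumption about what the auto-generated k8s gateway is called, and we have not verified whether Istio creates an actual object or only builds the route internally. Host, path, service and port are taken from the Ingress above.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: acme-challenge-sketch          # hypothetical name, for illustration only
  namespace: istio-system
spec:
  hosts:
  - operations.ourdomain.nl
  gateways:
  - istio-autogenerated-k8s-ingress    # assumed name of the auto-generated gateway
  http:
  - match:
    - uri:
        # challenge path copied from the generated Ingress
        exact: /.well-known/acme-challenge/dnrcr-LRRMdXhBaUefjqpHQx8ytYuk-feEfXu9gW-Ck
    route:
    - destination:
        # solver service and port copied from the generated Ingress
        host: cm-acme-http-solver-fzk8q.istio-system.svc.cluster.local
        port:
          number: 8089
```

A 404 would then point to no matching route at the gateway, while a 503 would point to a matched route whose destination cannot be reached.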

We do see some possibly interesting logging in the istio-pilot:

"ProxyStatus": {"endpoint_no_pod": {"cm-acme-http-solver-l5j2g.istio-system.svc.cluster.local": {"message": "10.16.57.248"}

I have double-checked, and the service mentioned above does have a pod behind it. So I am not sure whether this line is relevant to the issue.

The ACME challenge pods do not have an Istio sidecar. Could this be the issue? If so, why does it apparently work for others?
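One way we could test the sidecar hypothesis, assuming mutual TLS is enabled for the mesh (which we have not confirmed), is a DestinationRule that tells the gateway to use plain HTTP towards the solver service. This is an untested sketch: the resource name is hypothetical, and the host is the solver service from the Pilot log above, whose name changes with every challenge.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: acme-solver-plaintext          # hypothetical name, for illustration only
  namespace: istio-system
spec:
  # solver service as reported in the Pilot log; regenerated per challenge
  host: cm-acme-http-solver-l5j2g.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE                    # do not originate TLS to the sidecar-less pod
```

If the 503 disappears with this rule in place, the missing sidecar in combination with mutual TLS would be the likely cause; if not, the sidecar is probably unrelated.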

-- Robert van der Spek
google-cloud-platform
google-kubernetes-engine
istio
kubernetes
kubernetes-ingress
