LetsEncrypt not verifying via Kubernetes ingress and loadbalancer in AWS EKS

4/5/2020

ClusterIssuer

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
  namespace: cert-manager
spec:
  acme:
    # The ACME server URL
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: my@email.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: nginx

Ingress.yaml

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: echo-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  tls:
  - hosts:
    - echo0.site.com
    secretName: echo-tls
  rules:
    - host: echo0.site.com
      http:
        paths:
        - backend:
            serviceName: echo0
            servicePort: 80

Events

12m         Normal    IssuerNotReady         certificaterequest/echo-tls-3171246787   Referenced issuer does not have a Ready status condition
12m         Normal    GeneratedKey           certificate/echo-tls                     Generated a new private key
12m         Normal    Requested              certificate/echo-tls                     Created new CertificateRequest resource "echo-tls-3171246787"
4m29s       Warning   ErrVerifyACMEAccount   clusterissuer/letsencrypt-staging        Failed to verify ACME account: context deadline exceeded
4m29s       Warning   ErrInitIssuer          clusterissuer/letsencrypt-staging        Error initializing issuer: context deadline exceeded

kubectl describe certificate

Name:         echo-tls
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  cert-manager.io/v1alpha3
Kind:         Certificate
Metadata:
  Creation Timestamp:  2020-04-04T23:57:22Z
  Generation:          1
  Owner References:
    API Version:           extensions/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Ingress
    Name:                  echo-ingress
    UID:                   1018290f-d7bc-4f7c-9590-b8924b61c111
  Resource Version:        425968
  Self Link:               /apis/cert-manager.io/v1alpha3/namespaces/default/certificates/echo-tls
  UID:                     0775f965-22dc-4053-a6c2-a87b46b3967c
Spec:
  Dns Names:
    echo0.site.com
  Issuer Ref:
    Group:      cert-manager.io
    Kind:       ClusterIssuer
    Name:       letsencrypt-staging
  Secret Name:  echo-tls
Status:
  Conditions:
    Last Transition Time:  2020-04-04T23:57:22Z
    Message:               Waiting for CertificateRequest "echo-tls-3171246787" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:
  Type    Reason        Age   From          Message
  ----    ------        ----  ----          -------
  Normal  GeneratedKey  18m   cert-manager  Generated a new private key
  Normal  Requested     18m   cert-manager  Created new CertificateRequest resource "echo-tls-3171246787"

I have been going at this for a few days now. I have tried different domains, but end up with the same result. Am I missing a step here? It is based on this tutorial here

Any help would be appreciated.

-- teej2542
amazon-web-services
aws-eks
kubernetes
kubernetes-ingress
lets-encrypt

2 Answers

5/21/2020

This might be worth a look. I was facing a similar issue.

Change the LoadBalancer Service of ingress-nginx.

Add externalTrafficPolicy: Cluster to its spec, or change the existing value to Cluster.

The reason: the pod with the certificate issuer wound up on a different node than the load balancer did, so it couldn’t talk to itself through the ingress.

Below is the complete block, taken from https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.26.1/deploy/static/provider/cloud-generic.yaml

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  #CHANGE/ADD THIS
  externalTrafficPolicy: Cluster
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https

---
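
If the Service is already deployed, the same field can probably be changed in place instead of re-applying the whole manifest. The following is only a sketch: the function name is hypothetical, and the default namespace and Service name are assumed to be ingress-nginx, as in the block above.

```shell
# Hypothetical helper: set externalTrafficPolicy on an existing Service
# using a JSON merge patch.
#   $1: policy to set (Cluster or Local)
#   $2: namespace (assumed default: ingress-nginx, as in the manifest above)
#   $3: Service name (assumed default: ingress-nginx)
set_external_traffic_policy() {
  kubectl -n "${2:-ingress-nginx}" patch service "${3:-ingress-nginx}" \
    --type merge -p "{\"spec\":{\"externalTrafficPolicy\":\"$1\"}}"
}
```

Calling set_external_traffic_policy Cluster would then patch the Service from the manifest above. Pass the namespace and name explicitly if your install uses different ones.
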
-- deepdive
Source: StackOverflow

4/6/2020

Usually with Go applications, the error "context deadline exceeded" means the connection timed out. That suggests the cert-manager pod was not able to reach the ACME API, which can happen if your cluster has an outbound firewall, and/or its subnets do not have a NAT or Internet Gateway attached.
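
One way to test this theory is to request the ACME directory from a throwaway pod inside the cluster. This is a rough sketch rather than a definitive check: the function name and the pod name acme-egress-check are hypothetical, the curlimages/curl image is assumed to be pullable from your nodes, and the URL is the staging server from the ClusterIssuer in the question.

```shell
# ACME directory URL taken from the ClusterIssuer in the question.
ACME_DIRECTORY_URL="https://acme-staging-v02.api.letsencrypt.org/directory"

# Hypothetical helper: run a one-off curl pod and print the HTTP status code
# returned by the ACME directory endpoint.
#   $1: namespace to run the probe in (assumed default: cert-manager)
check_acme_egress() {
  kubectl run acme-egress-check --rm -i --restart=Never \
    --namespace "${1:-cert-manager}" \
    --image=curlimages/curl --command -- \
    curl -sS -m 10 -o /dev/null -w '%{http_code}\n' "$ACME_DIRECTORY_URL"
}
```

If check_acme_egress prints an HTTP status code, outbound traffic from that pod is reaching Let's Encrypt. If curl times out after 10 seconds, that would be consistent with a blocked or missing egress path. The probe pod may be scheduled on a different node or subnet than cert-manager, so a single success does not rule the problem out completely.
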

-- mdaniel
Source: StackOverflow