I'm currently testing Traefik in Kubernetes to request and manage over 2,000 Let's Encrypt certificates.
I created two test environments:
1. Traefik with acme.json for certificate storage
2. A Traefik HA cluster with a custom-built Consul (to store more than 512 KB per KV entry) as the KV store
Both setups work really well with up to 400 certificates, and I find Traefik really convenient to use instead of Certbot. Beyond 400 certificates, Traefik starts to consume a lot of CPU (100%) and often stops processing new certificates until the pods are restarted. Each new certificate request also takes up to 5 minutes to process, and during that time Traefik becomes slower (sometimes ~20 seconds to establish an SSL connection).
I'm also considering using multiple ingress classes in Kubernetes and limiting the number of certificates per class (maybe ~300 per class), but this would add some complexity to my final setup and I'd prefer to avoid it.
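To give a sense of what that split would involve, here is a rough sketch of the bookkeeping, assuming a cap of 300 certificates per class. The function name and the `traefik-shard-N` class names are hypothetical and not anything Traefik defines; the code only computes an assignment and does not touch the cluster.

```python
from collections import Counter


def assign_ingress_classes(domains, max_per_class=300, prefix="traefik-shard"):
    """Map each domain to an ingress class name, at most max_per_class per class.

    The class names (prefix-0, prefix-1, ...) are illustrative only.
    """
    ordered = sorted(set(domains))  # de-duplicate and fix a deterministic order
    return {
        domain: f"{prefix}-{index // max_per_class}"
        for index, domain in enumerate(ordered)
    }


if __name__ == "__main__":
    # Placeholder domains standing in for the ~2,000 real ones.
    sample = [f"site{i}.example.com" for i in range(2000)]
    per_class = Counter(assign_ingress_classes(sample).values())
    for name, count in sorted(per_class.items()):
        print(name, count)
```

With 2,000 domains and a cap of 300 this needs 7 classes, so 7 separate Traefik deployments. Because the assignment depends on sort order, adding a domain can shift others to a different class; a real setup would have to persist the mapping, which is part of the complexity I'd like to avoid.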
Does anyone have experience using Traefik with a large number of certificates? I would greatly appreciate any suggestion.
400 certificates is a pretty large number for a single Kubernetes cluster. I would recommend having some entity other than Traefik manage your certificates, such as cert-manager.
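As a rough illustration of what that could look like, the sketch below builds one cert-manager `Certificate` manifest per domain. It assumes the `cert-manager.io/v1` API and a ClusterIssuer that you have already set up for Let's Encrypt; the issuer name `letsencrypt-prod`, the namespace, and the naming scheme for the resource and secret are all placeholders, not anything taken from your setup.

```python
import json


def certificate_manifest(domain, issuer="letsencrypt-prod", namespace="default"):
    """Build a cert-manager Certificate manifest (as a dict) for one domain.

    The issuer name and namespace defaults are assumptions for illustration.
    """
    name = domain.replace(".", "-")  # resource names cannot contain dots here
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Secret where cert-manager stores the issued key pair.
            "secretName": f"{name}-tls",
            "dnsNames": [domain],
            "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
        },
    }


if __name__ == "__main__":
    # Print one manifest as JSON; kubectl accepts JSON as well as YAML.
    print(json.dumps(certificate_manifest("shop.example.com"), indent=2))
```

Each certificate then lives in its own Kubernetes Secret, and Traefik only has to serve it rather than request and renew it.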