THE PROBLEM
I am working on a Django deployment with Kubernetes. The application is working perfectly when run using docker-compose (including automatic TLS certificate generation).
However, when deployed with Kubernetes I run into an issue that seems to be related to Kubernetes' read-only file system.
The Django app itself is up and running and ingress on the cluster is working, so the problem is not related to the Django app. With automatic certificate generation, however, Kubernetes' read-only file system prevents acme.json from being modified, leaving me with an empty acme.json and no certificate.
The above conclusion is derived from the following: when I check the logs (kubectl logs pod/traefik-12345 -n stonecap), I find the following error (see the last line):
...
time="2022-02-01T21:11:28Z" level=info msg="Testing certificate renew..." providerName=letsencrypt.acme
time="2022-02-01T21:11:28Z" level=info msg="Starting provider *traefik.Provider {}"
time="2022-02-01T21:11:31Z" level=info msg=Register... providerName=letsencrypt.acme
time="2022-02-01T21:11:31Z" level=error msg="open /etc/traefik/acme/acme.json: read-only file system" providerName=acme
...
RELEVANT FILES
This is the Traefik controller deployment (the service behind it is irrelevant).
kind: Deployment
apiVersion: apps/v1
metadata:
  namespace: stonecap
  name: traefik
  labels:
    app: traefik
    service: traefik
spec:
  replicas: 1
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      containers:
        - name: traefik
          image: myrepo/traefik_k8:1.0
          volumeMounts:
            - mountPath: /etc/traefik
              name: all-in-one
          args:
            - --configfile=/etc/traefik/traefik.yml
            - --api
            - --kubernetes
            - --logLevel=INFO
      volumes:
        - name: all-in-one
          projected:
            sources:
              - configMap:
                  name: traefik-config
                  items:
                    - key: traefik.yml
                      path: traefik.yml
              - configMap:
                  name: traefik-acme
                  items:
                    - key: acme.json
                      path: acme/acme.json
                      mode: 0600
The configmap:
kind: ConfigMap
apiVersion: v1
metadata:
  namespace: stonecap
  name: traefik-config
data:
  traefik.yml: |-
    log:
      level: INFO
    entryPoints:
      web:
        # http
        address: ":80"
      web-secure:
        # https
        address: ":443"
    certificatesResolvers:
      letsencrypt:
        # https://docs.traefik.io/master/https/acme/#lets-encrypt
        acme:
          email: "info@mydopedomain.com"
          storage: /etc/traefik/acme/acme.json
          # https://docs.traefik.io/master/https/acme/#httpchallenge
          httpChallenge:
            entryPoint: web
    http:
      routers:
        web-router:
          rule: "Host(`mydopedomain.com`) || Host(`www.mydopedomain.com`)"
          entryPoints:
            - web
          middlewares:
            - redirect
            - csrf
          service: django
        web-secure-router:
          rule: "Host(`mydopedomain.com`) || Host(`www.mydopedomain.com`)"
          entryPoints:
            - web-secure
          middlewares:
            - csrf
          service: django
          tls:
            # https://docs.traefik.io/master/routing/routers/#certresolver
            certResolver: letsencrypt
      middlewares:
        redirect:
          # https://docs.traefik.io/master/middlewares/redirectscheme/
          redirectScheme:
            scheme: https
            permanent: true
        csrf:
          headers:
            hostsProxyHeaders: ["X-CSRFToken"]
      services:
        django:
          loadBalancer:
            servers:
              - url: http://django:5000
    providers:
      # https://docs.traefik.io/master/providers/file/
      file:
        filename: /etc/traefik/traefik.yml
        watch: true
The acme configmap:
kind: ConfigMap
apiVersion: v1
metadata:
  namespace: stonecap
  name: traefik-acme
data:
  acme.json: ""
The traefik Dockerfile contains:
FROM traefik:v2.2.11
RUN mkdir -p /etc/traefik/acme \
&& touch /etc/traefik/acme/acme.json \
&& chmod 600 /etc/traefik/acme/acme.json
COPY ./compose/production/traefik/traefik.yml /etc/traefik
FIX ATTEMPTS
I have tried different suggestions without any success.
Per this question, adding readOnly: false does not help:
volumeMounts:
  - mountPath: /etc/traefik
    name: all-in-one
    readOnly: false
Adding the following, based on this (deprecated) request, does not help either:
securityContext:
  readOnlyRootFilesystem: False
  runAsGroup: 0
  runAsUser: 0
  runAsNonRoot: false
The answer to this question is not applicable since my case concerns modifying an existing file. In this context I don't believe (feel free to correct me) this can be fixed using an init container (See issue concerning ConfigMaps).
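For illustration only, here is a minimal, untested sketch of the kind of change that would take the ConfigMap out of the picture for acme.json. It assumes the certificate store does not need to survive pod restarts. ConfigMap-backed volumes are mounted read-only, whereas an emptyDir volume is writable. The volume name acme-storage and the mount path /acme are hypothetical, and the storage setting in traefik.yml would have to point at /acme/acme.json for this to apply.

```yaml
# Fragment of the Deployment's pod spec (spec.template.spec); untested sketch.
containers:
  - name: traefik
    image: myrepo/traefik_k8:1.0
    volumeMounts:
      - mountPath: /etc/traefik
        name: all-in-one        # existing projected volume (read-only)
      - mountPath: /acme        # hypothetical writable directory for acme.json
        name: acme-storage      # hypothetical volume name
volumes:
  # The existing all-in-one projected volume stays as it is (omitted here).
  - name: acme-storage
    emptyDir: {}                # writable, but its contents are lost when the pod is deleted
```

With an emptyDir, certificates would be requested again whenever the pod is replaced, which can run into Let's Encrypt rate limits. A PersistentVolumeClaim would be the persistent variant of the same idea.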
My kubectl version is:
$ kubectl version --short
Client Version: v1.21.0
Server Version: v1.21.5-eks-bc4871b
QUESTION
To reiterate, the problem here is that I am unable to automatically generate certificates because Kubernetes is preventing me from modifying a file. Therefore, my question is:
Thank you for your time in reading this.
EDIT
The cluster was deployed on Amazon EKS. eksctl (version 0.80.0) was used to build the cluster. The command used is:
eksctl create cluster --version 1.21 --name "$clusterName" --region "$AWS_DEFAULT_REGION"
This deploys 2 nodes by default (Managed nodes – Linux).