This is now the fourth time I have set up a Kubernetes cluster. It's always the same setup: basic k8s, Traefik as reverse proxy, dashboard, Prometheus, ELK stack. But this time something about the Traefik deployment is odd...
For all the other clusters I just deployed my default setup: some RBAC entries, a ConfigMap containing the TOML file, the actual Deployment, a Service, and the web UI:
RBAC:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-ingress-controller
  namespace: infra
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller
rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-ingress-controller
    namespace: infra
ConfigMap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-toml
  labels:
    name: traefik-toml
  namespace: infra
data:
  traefik.toml: |-
    defaultEntryPoints = ["http","https"]

    [entryPoints]
      [entryPoints.http]
        address = ":80"
        [entryPoints.http.redirect]
          entryPoint = "https"
      [entryPoints.https]
        address = ":443"
        [entryPoints.https.tls]
          [[entryPoints.https.tls.certificates]]
            CertFile = "/ssl/external/<EXTERNAL_URL>.crt"
            KeyFile = "/ssl/external/<EXTERNAL_URL>.key"
          [[entryPoints.https.tls.certificates]]
            CertFile = "/ssl/internal/<INTERNAL_URL>.crt"
            KeyFile = "/ssl/internal/<INTERNAL_URL>.key"

    [accessLog]
Deployment:
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: traefik-ingress-controller
  namespace: infra
  labels:
    k8s-app: traefik-ingress-lb
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: traefik-ingress-lb
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-lb
        name: traefik-ingress-lb
    spec:
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      containers:
        - image: traefik:v1.6.5
          name: traefik-ingress-lb
          volumeMounts:
            - mountPath: /ssl/external
              name: ssl-external
            - mountPath: /ssl/internal
              name: ssl-internal
            - name: traefik-toml
              subPath: traefik.toml
              mountPath: /config/traefik.toml
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
            - name: admin
              containerPort: 8080
          args:
            - --configfile=/config/traefik.toml
            - --api
            - --kubernetes
            - --logLevel=INFO
      volumes:
        - name: ssl-external
          secret:
            secretName: <EXTERNAL_URL>.cert
        - name: ssl-internal
          secret:
            secretName: <INTERNAL_URL>.cert
        - name: traefik-toml
          configMap:
            name: traefik-toml
Service:
---
kind: Service
apiVersion: v1
metadata:
  name: traefik-ingress-service
  namespace: infra
spec:
  selector:
    k8s-app: traefik-ingress-lb
  ports:
    - protocol: TCP
      port: 80
      name: web
    - protocol: TCP
      port: 443
      name: sweb
  externalIPs:
    - <WORKER IP 1>
    - <WORKER IP 2>
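For reference, rolling this out is nothing special; assuming the manifests above are saved as separate files (the file names below are only placeholders), it is just a series of kubectl apply calls:
# Everything above lives in the "infra" namespace
kubectl create namespace infra
kubectl apply -f traefik-rbac.yaml        # ServiceAccount, ClusterRole, ClusterRoleBinding
kubectl apply -f traefik-configmap.yaml   # traefik.toml
kubectl apply -f traefik-deployment.yaml
kubectl apply -f traefik-service.yaml
# Sanity check: pod status and logs
kubectl -n infra get pods -l k8s-app=traefik-ingress-lb
kubectl -n infra logs -l k8s-app=traefik-ingress-lb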
This works nicely on the other clusters, but on the new one (where I did not set up Kubernetes myself) the following errors show up in the logs every 30 seconds (the "Error checking new version" one not that often):
E0827 14:29:49.566294 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0827 14:29:49.572633 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0827 14:29:49.592844 1 reflector.go:205] github.com/containous/traefik/vendor/k8s.io/client-go/informers/factory.go:86: Failed to list *v1beta1.Ingress: Get https://10.96.0.1:443/apis/extensions/v1beta1/ingresses?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
time="2018-08-27T14:30:00Z" level=warning msg="Error checking new version: Get https://update.traefik.io/repos/containous/traefik/releases: dial tcp: i/o timeout"
Does anyone have an idea? Is this a known issue? I could not find any known problems on this topic.
Thanks in advance!
I managed to fix the problem:
The problem was the iptables FORWARD policy, which newer Docker engines set to DROP: https://github.com/moby/moby/issues/35777
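For anyone else hitting this, the policy is easy to check directly on an affected worker node; this is just standard iptables output, nothing cluster-specific:
# The chain header shows the policy; DROP here is the problem
sudo iptables -L FORWARD -n | head -1
# Chain FORWARD (policy DROP)
# Or, in rule form:
sudo iptables -S FORWARD | head -1
# -P FORWARD DROP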
For now we have a workaround that keeps setting the policy back to ACCEPT.
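A minimal sketch of such a workaround, run on every worker node (the cron schedule is only an example):
# One-off fix on each worker node
sudo iptables -P FORWARD ACCEPT
# Keep it that way until a real fix is in place, e.g. via a root cron entry:
# */5 * * * * /sbin/iptables -P FORWARD ACCEPT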
If we come up with a real fix, I will hopefully remember to come back here and post it :)