We recently started using Istio to establish a service mesh within our Kubernetes landscape.
We now have the problem that Jobs and CronJobs do not terminate and keep running forever if we inject the istio-proxy
sidecar container into them. The sidecar does need to be injected, though, to establish proper mTLS connections to the services the job needs to talk to and to comply with our security regulations.
I am also aware of the open issues in Istio (istio/issues/6324) and Kubernetes (kubernetes/issues/25908), but neither seems likely to provide a valid solution anytime soon.
At first a preStop hook seemed suitable to solve this issue, but there is some confusion about the concept itself: kubernetes/issues/55807
lifecycle:
  preStop:
    exec:
      command:
        ...
Bottom line: those hooks will not be executed if the container completes successfully.
There are also some relatively new projects on GitHub that try to solve this with a dedicated controller (which I think is the most preferable approach), but to our team they do not yet feel mature enough to put straight into production.
In the meantime, we ended up with the following workaround, which execs into the sidecar and sends a SIGTERM signal, but only if the main container finished successfully:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: terminate-sidecar-example-service-account
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: terminate-sidecar-example-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "delete"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: terminate-sidecar-example-rolebinding
subjects:
  - kind: ServiceAccount
    name: terminate-sidecar-example-service-account
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: terminate-sidecar-example-role
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: terminate-sidecar-example-cronjob
  labels:
    app: terminate-sidecar-example
spec:
  schedule: "30 2 * * *"
  jobTemplate:
    metadata:
      labels:
        app: terminate-sidecar-example
    spec:
      template:
        metadata:
          labels:
            app: terminate-sidecar-example
          annotations:
            sidecar.istio.io/inject: "true"
        spec:
          serviceAccountName: terminate-sidecar-example-service-account
          containers:
            - name: ****
              image: ****
              command:
                - "/bin/ash"
                - "-c"
              args:
                - node index.js && kubectl exec -n ${POD_NAMESPACE} ${POD_NAME} -c istio-proxy -- bash -c "sleep 5 && /bin/kill -s TERM 1 &"
              env:
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
So, the ultimate question to all of you is: do you know of any better workaround, solution, controller, ... that would be less hacky / more suitable for terminating the istio-proxy container once the main container has finished its work?
This was not a misconfiguration; it was a bug in upstream Kubernetes. As of September 2019, Istio has addressed it by introducing a /quitquitquit endpoint on the Pilot agent.
Unfortunately, Kubernetes has not been so steadfast in solving the issue itself, so it still exists in some facets. However, the /quitquitquit endpoint in Istio should resolve the problem for this specific use case.
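With that endpoint available, the workaround above can be reduced to a plain HTTP call from the main container once its work is done, with no extra RBAC and no kubectl in the image. Below is a minimal sketch of a Job doing this; it assumes the default pilot-agent status port 15020 and that curl is available in the workload image, and the names quitquitquit-example-job, main and your-image are placeholders, so adapt it to your setup.
apiVersion: batch/v1
kind: Job
metadata:
  name: quitquitquit-example-job
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      restartPolicy: Never
      containers:
        - name: main          # placeholder name
          image: your-image   # placeholder; image must contain curl
          command: ["/bin/sh", "-c"]
          args:
            - |
              node index.js
              EXIT_CODE=$?
              # Ask the pilot-agent in the istio-proxy sidecar to shut down
              # so the Pod can reach the Completed state.
              curl -sf -X POST http://127.0.0.1:15020/quitquitquit
              exit $EXIT_CODE
Since the call stays inside the Pod, the main container's exit code is preserved and the sidecar is told to shut down regardless of whether the workload succeeded or failed.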