Terminate the Istio sidecar (istio-proxy) for a Kubernetes Job / CronJob

2/28/2019

We recently started using Istio to establish a service mesh within our Kubernetes landscape.

We now have the problem that Jobs and CronJobs do not terminate and keep running forever if we inject the Istio istio-proxy sidecar container into them. The istio-proxy should be injected, though, to establish proper mTLS connections to the services the job needs to talk to and to comply with our security regulations.

I also noticed the open issues in Istio (istio/issues/6324) and Kubernetes (kubernetes/issues/25908), but neither seems likely to provide a valid solution anytime soon.

At first, a pre-stop hook seemed suitable to solve this issue, but there is some confusion about the concept itself: kubernetes/issues/55807

lifecycle:
  preStop:
    exec:
      command: 
        ...

Bottom line: those hooks will not be executed if the container completed successfully.

There are also some relatively new projects on GitHub trying to solve this with a dedicated controller (which I think is the preferable approach), but to our team they do not feel mature enough to put into production right away:

In the meantime, we ended up with the following workaround, which execs into the sidecar and sends it a SIGTERM, but only if the main container finished successfully:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: terminate-sidecar-example-service-account
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: terminate-sidecar-example-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get","delete"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: terminate-sidecar-example-rolebinding
subjects:
  - kind: ServiceAccount
    name: terminate-sidecar-example-service-account
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: terminate-sidecar-example-role
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: terminate-sidecar-example-cronjob
  labels:
    app: terminate-sidecar-example
spec:
  schedule: "30 2 * * *"
  jobTemplate:
    metadata:
      labels:
        app: terminate-sidecar-example
    spec:
      template:
        metadata:
          labels:
            app: terminate-sidecar-example
          annotations:
            sidecar.istio.io/inject: "true"
        spec:
          serviceAccountName: terminate-sidecar-example-service-account
          containers:
          - name: ****
            image: ****
            command:
              - "/bin/ash"
              - "-c"
            args:
              - node index.js && kubectl exec -n ${POD_NAMESPACE} ${POD_NAME} -c istio-proxy -- bash -c "sleep 5 && /bin/kill -s TERM 1 &"
            env:
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace

So, the ultimate question to all of you is: Do you know of any better workaround, solution, controller, ... that would be less hacky / more suitable to terminate the istio-proxy container once the main container finished its work?

-- croeck
istio
kubernetes

1 Answer

3/28/2020

This was not a misconfiguration; it was a bug in upstream Kubernetes. As of September 2019, Istio has addressed it on its side by adding a /quitquitquit endpoint to the Pilot agent.

Unfortunately, Kubernetes has not been as prompt in solving this issue itself, so it still exists in some forms. However, the /quitquitquit endpoint in Istio should resolve the problem for this specific use case.
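
As a rough illustration of how the endpoint could replace the kubectl exec workaround from the question, here is a minimal sketch. It rests on some assumptions: the Pilot agent is reachable from the main container at 127.0.0.1:15020 (check this for your Istio version), curl is available in the job image, and the function name run_then_quit and the QUIT_URL variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: run the job's main command, then ask the sidecar to exit.
# Assumption: the Pilot agent in the istio-proxy container listens on
# 127.0.0.1:15020; override QUIT_URL if your setup differs.
QUIT_URL="${QUIT_URL:-http://127.0.0.1:15020/quitquitquit}"

run_then_quit() {
  # Run the main workload and remember its exit code.
  "$@"
  rc=$?
  # Ask the sidecar to shut down. Failures (e.g. no sidecar injected, curl
  # missing) are ignored so they do not mask the workload's own exit code.
  curl -fsS -X POST --max-time 5 "$QUIT_URL" >/dev/null 2>&1 || true
  return "$rc"
}
```

With this sourced in the container's shell, the args line from the question could become something like `run_then_quit node index.js`. Unlike the `&&` chain in the question, this sketch asks the sidecar to quit whether or not the main command succeeded, and still returns the main command's exit code so the Job is marked failed when it should be. If you only want the sidecar stopped on success, call curl conditionally instead.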

-- TJ Zimmerman
Source: StackOverflow