Running and inspecting sanity of deployments on Kubernetes

10/6/2020

TL;DR: How can I periodically check, on a production cluster, that my deployments' configuration loads correctly and that their integration with external/other services is healthy?

Long version:
A production K8s cluster naturally contains complicated deployments, and eventually pods can fail for basically two reasons:

  1. Some K8s objects were changed.
  2. Problems accessing infrastructure/other services.

Some more specific examples I can think of:

  1. Invalid mounts of secrets/configs
  2. Missing env vars
  3. Assigning a non-existent service account
  4. Network issues (within the cluster or with the "outer world")
  5. An external dependency, like a DB, being down or under heavy load
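A sanity-check command covering cases 1–3 and 5 above could be a small script baked into the image. Everything here (variable names, paths, the DB host) is hypothetical, not something from the question:

```shell
#!/bin/sh
# Hypothetical pre-flight checks; DATABASE_URL, CONFIG_FILE, DB_HOST and
# DB_PORT are placeholder names for this sketch.
sanity_check() {
  rc=0
  # Configuration loading: a required env var and a mounted config file
  [ -n "$DATABASE_URL" ] || { echo "FAIL: env var DATABASE_URL not set"; rc=1; }
  [ -r "${CONFIG_FILE:-/etc/app/config.yaml}" ] || { echo "FAIL: config mount unreadable"; rc=1; }
  # Reachability of an external dependency (nc ships with busybox/alpine images)
  if [ -n "$DB_HOST" ] && command -v nc >/dev/null 2>&1; then
    nc -z -w 5 "$DB_HOST" "${DB_PORT:-5432}" || { echo "FAIL: cannot reach $DB_HOST"; rc=1; }
  fi
  return $rc
}
```

In the pod you would run it as the container's command, e.g. `/bin/sh -c '. /checks/sanity.sh && sanity_check'`, so the pod exits non-zero when any check fails.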

Is there some simple tool (or any other way) to perform such tests from within the cluster? Such a tool should provide the following APIs:

  1. An API to tell K8s to "run this deployment periodically with some other command (otherwise the same deployment spec)"
  2. An API to inspect test results - a UI, or even just `kubectl get SOME_SIMPLE_TEST_RESULTS_CRD`

"Some other command" basically means running the system's initialization without actually serving anything, but it can be any command that verifies that configuration-related things (env vars, mounts, etc.) were loaded correctly and that the infrastructure is reachable.
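The "same deployment spec, different command" idea can be approximated with a plain CronJob whose pod template is a copy of the deployment's, overriding only the command. The sketch below is hypothetical (names, image, schedule are placeholders) and is not a complete answer - it still lacks a dedicated test-results API:

```yaml
# Hypothetical CronJob reusing the deployment's pod spec with the command
# overridden to run a sanity check; all names are placeholders.
apiVersion: batch/v1beta1   # batch/v1 on Kubernetes >= 1.21
kind: CronJob
metadata:
  name: myapp-sanity-check
spec:
  schedule: "*/30 * * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:                       # copy of the deployment's pod spec, except:
          restartPolicy: Never
          serviceAccountName: myapp
          containers:
            - name: sanity
              image: myapp:latest
              command: ["/bin/sh", "/checks/sanity-check.sh"]   # overridden
              envFrom:
                - secretRef:
                    name: myapp-secrets
              volumeMounts:
                - name: config
                  mountPath: /etc/app
          volumes:
            - name: config
              configMap:
                name: myapp-config
```

A crude result view then falls out of `kubectl get jobs` (the COMPLETIONS column) and `kubectl get events`, though that is far from the test-results CRD asked for above.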

Searching for such tools, the closest I got were CronJobs, Helm tests, and init containers. However, they all either lack proper visibility or only provide testing before the deployment is applied, but nothing more.

Additionally, looking for similar questions, I found this one to be the closest, and yet it seems quite different: https://stackoverflow.com/questions/58536038/running-integration-e2e-tests-on-top-of-a-kubernetes-stack

-- user14402384
integration-testing
kubernetes
testing

1 Answer

10/7/2020

Kubernetes does not provide this by default. To accomplish it, you will need external tools that ensure the desired configuration is applied, monitoring and enforcing it.

That is a wide discussion, and everything will depend on how your cluster is configured, how you allow access to your cluster, etc.

But it is common to use a CI/CD pipeline to re-apply the configuration from time to time, or when you need to alter something.
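One common shape for that, sketched here as a hypothetical GitLab CI job (any CI system with scheduled pipelines works the same way; the image and paths are assumptions):

```yaml
# Hypothetical scheduled CI job that detects drift and re-applies manifests.
reapply-manifests:
  image: bitnami/kubectl:latest
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    # kubectl diff exits non-zero when drift is found; log it but continue
    - kubectl diff -f manifests/ || true
    - kubectl apply -f manifests/
```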

You can also use Kubernetes auditing to verify what happened in your cluster, and perhaps create alerts or actions that trigger a re-apply of your resources.
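A minimal sketch of an audit policy recording the kinds of changes relevant here (who touched Deployments, Secrets, ConfigMaps, ServiceAccounts), assuming you can pass `--audit-policy-file` to the API server:

```yaml
# Hypothetical audit policy: log writes to the objects whose changes
# typically break deployments, so alerting can react to them.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "apps"
        resources: ["deployments"]
      - group: ""
        resources: ["secrets", "configmaps", "serviceaccounts"]
```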

-- Mr.KoopaKiller
Source: StackOverflow