I have been learning Kubernetes for a few weeks now, but I am struggling with some design concepts, and some basic questions have come up.
I first tried docker-compose, then building Docker images via Dockerfiles, stumbled over Helm and kubectl, and now I have come across building pods and building deployments. I know many different pieces now, but a real-life example or some best-practice knowledge would be appreciated. Google is great, but it seems there is not just one way.
I understand that pods should be easy to replace / destroy / recreate...
Is it better to have a pod configuration like
- nginx container
- php container
- mysql container
- redis container

Edit: as I just read, containers in a pod share an IP address, so it would make no sense to include mysql or redis here, right?
or better one pod with a
- mysql container

and one pod with containers for
- nginx
- php

and another with a
- redis container
The content of the local webroot comes from a git repo.
I can create a YAML file for defining the containers inside my pod (kind: Pod). But I can also define a deployment.yaml (kind: Deployment).
Do I have to reference my pod.yaml inside my deployment.yaml, or does the deployment include all the pod configuration and replace the pod.yaml?
Regarding pods: you can create one pod with everything you need, but that will be a very fat pod. Remember, a pod runs on exactly one node; it is not possible to run one pod partially on one node and partially on another. That means that, from a scalability standpoint, many small pods are better than one big one. Many small pods also generally give a more uniform resource and load distribution across nodes.
Also, when you update one container in a pod, the whole pod gets restarted. So if you have the application and the database in the same pod and you update the app code, the database will also be restarted. Not cool, eh?
But in some cases running several containers in one pod can be reasonable. Remember, all containers in a pod share the same network address and localhost, so containers within a pod have very low network latency.
Containers within a pod can also share volumes with each other, which is important in some cases.
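To make that concrete, here is a minimal sketch of a pod whose containers share an emptyDir volume (the names, images, and mount paths are illustrative, not a recommendation):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  volumes:
    - name: webroot
      emptyDir: {}          # scratch volume, shared by both containers
  containers:
    - name: nginx
      image: nginx:1.25
      volumeMounts:
        - name: webroot
          mountPath: /usr/share/nginx/html
    - name: php
      image: php:8.2-fpm
      volumeMounts:
        - name: webroot
          mountPath: /var/www/html
```

Because both containers share localhost, nginx could also reach php-fpm on 127.0.0.1:9000 without any Service in between.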
Persistent volumes

You cannot mount a Git repo into a pod. Well, at least that's not what you should do. You should pack your webroot into a Docker image and run that in Kubernetes. This can be done by Jenkins, which can build on commit.
Alternatively, you can place your files onto a shared persistent volume if you want to share files between deployment replicas. That is also possible, but you must find a so-called ReadWriteMany volume, such as NFS or GlusterFS, that can be mounted by multiple pods at once.
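A ReadWriteMany claim might look like the following sketch; the storage class name is hypothetical and depends entirely on what your cluster provides:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: webroot-shared
spec:
  accessModes:
    - ReadWriteMany        # needs a backend that supports it (NFS, GlusterFS, CephFS, ...)
  storageClassName: nfs-client   # hypothetical storage class name
  resources:
    requests:
      storage: 5Gi
```

Each replica then mounts the claim by name; if the underlying storage only supports ReadWriteOnce, scheduling a second pod on another node will fail.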
Deployment via config file (eg. stage.yaml, live.yaml) etc.
I've found Helm to work well for this. A Helm "chart" can be deployed with a corresponding set of "values" in a YAML file, and these can be used to configure various parts of the overall deployment.
One useful part of Helm is that there is a standard library of charts. Where you say you need MySQL, you can `helm install stable/mysql` and get a pre-packaged installation without worrying about the specific details of stateful sets, persistent volumes, etc.
You'd package everything you suggest here into a single chart, which would have multiple (templated) YAML files for the different Kubernetes parts.
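As an illustration of the per-environment values files mentioned above, a stage.yaml override for the stable/mysql chart could look roughly like this (the key names vary by chart version, so check `helm inspect values stable/mysql` rather than trusting these):

```yaml
# stage-values.yaml -- hypothetical overrides for the stable/mysql chart
mysqlDatabase: myapp
mysqlUser: myapp
persistence:
  enabled: true
  size: 8Gi
```

You would then deploy with something like `helm install -f stage-values.yaml stable/mysql` (Helm 2 syntax; Helm 3 requires a release name, e.g. `helm install my-mysql -f stage-values.yaml stable/mysql`), and a live.yaml with different values for production.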
Handling of type: pod vs. type: deployment
A deployment will create some (configurable) number of identical copies of a pod. The pod spec inside the deployment spec contains all of the details it needs. The deployment YAML replaces an existing pod YAML.
You generally don't directly create pods. The upgrade lifecycle in particular can be a little tricky to do by hand, and deployments do all the hard work for you.
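A minimal deployment manifest makes this concrete (the names and image are hypothetical). Note that the pod spec is embedded under spec.template, so there is no separate pod.yaml to reference:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:                # this is the pod spec; it replaces a standalone pod.yaml
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: php
          image: myregistry/webapp:1.0.0   # hypothetical image
          ports:
            - containerPort: 9000
```

To roll out a new version you change the image tag (e.g. to `myregistry/webapp:1.0.1`) and re-apply; the deployment handles the rolling upgrade for you.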
Is it better to have a POD configuration like...
Remember that the general operation of things is that a deployment will create some number of copies of a pod. When you have an updated version of the software, you'd push it to a Docker image repository and change the image tag in the deployment spec. Kubernetes will launch additional copies of the pod with the new pod spec, then destroy the old ones.
The two fundamental rules here:
If the components' lifecycles are different, they need to be in different deployments. For example, you don't want to destroy the database when you update code, so these need to be in separate deployments.
If the number of replicas is different, they need to be in different deployments. Your main service might need 3 or 5 replicas, depending on load; nginx just routes HTTP requests around and might only need 1 or 3; the databases can't be replicated and can only use 1.
In the setup you show, I'd have four separate deployments, one each for MySQL, Redis, the nginx proxy, and the main application.
The content of the webroot comes from a git repo.
The easiest way is to build it into an image, probably the nginx image.
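A sketch of what that image build might look like (paths are illustrative; in practice a CI job would check out the repo and run `docker build`):

```dockerfile
# Hypothetical Dockerfile: bake the checked-out webroot into the nginx image
FROM nginx:1.25
COPY ./webroot/ /usr/share/nginx/html/
```

Each commit then produces a new tagged image, and deploying new content is just a normal deployment image update.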
If it's "large" (gigabytes in size) you might find it more useful to just host this static content somewhere outside Kubernetes entirely. Anything that has static file hosting will work fine.
There's not to my knowledge a straightforward way to copy arbitrary content into a persistent volume without writing a container to do it.
Your question doesn't mention Kubernetes services at all. These are core, and you should read up on them. In particular where your application talks to the two data stores, it would refer to the service and not the MySQL pod directly.
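For example, a Service in front of the MySQL deployment gives the application a stable DNS name, regardless of which pod is actually running (names and labels here are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql            # the app connects to "mysql:3306", never to a pod IP
spec:
  selector:
    app: mysql           # must match the labels on the MySQL deployment's pods
  ports:
    - port: 3306
      targetPort: 3306
```

Pod IPs change every time a pod is recreated; the Service name does not, which is exactly the decoupling the "easily replaced / destroyed / recreated" principle requires.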
Depending on your environment, also consider the possibility of hosting the databases outside of Kubernetes. Their lifecycle is very different from your application containers: you never want to stop the database, and you really don't want the database's managed storage to be accidentally deleted. You may find it easier and safer to use a bare-metal database setup, or to use a hosted database setup. (My personal experience is mostly with AWS, where you could use RDS for a MySQL instance, ElastiCache for Redis, and S3 for the static file hosting discussed above.)