Kubernetes: customizing Pod scheduling and Volume scheduling

11/11/2018

I'm trying to use Kubernetes to manage a scenario where I need to run several instances of an application (that is, several Pods). These are my requirements:

  1. When I need to scale up my application, I want to deploy one single Pod on a specific Node (not a random one).
  2. When I need to scale down my application, I want to remove a specific Pod from a specific Node (not a random one).
  3. When a new Pod is deployed, I want it to mount a specific PersistentVolume (not a random one) that I have manually provisioned.
  4. After a Pod has been deleted, I want its PersistentVolume to be re-usable by a different Pod.

So far, I have used this naive solution to do all of the above: every time I need a new instance of my application, I create one new Deployment (with exactly one replica) and one PersistentVolumeClaim. For example, if I need five instances of my application, I need five Deployments. However, this solution is not very scalable, and it doesn't exploit the full potential of Kubernetes.

I think it would be a lot smarter to use one single template for all the Pods, but I'm not sure whether I should use a Deployment or a StatefulSet.

I've been experimenting with Labels and Node Affinity, and I found out that I can satisfy requirement 1, but I cannot satisfy requirement 2 this way. In order to satisfy requirement 2, would it be possible to delete a specific Pod by writing my own custom scheduler?

I don't understand how Kubernetes decides to tie a specific PersistentVolume to a specific PersistentVolumeClaim. Is there a sort of volume scheduler? Can I customize it somehow? This way, every time a new Pod is created, I'd be able to tie it to a specific volume.

-- MikiTesi
kubernetes
persistent
scheduling
volume

1 Answer

11/12/2018

There may be a good reason for these requirements so I'm not going to try to convince you that it may not be a good idea to use Kubernetes for this...

Yes - with nodeSelector using labels, node affinity, and anti-affinity rules, pods can be scheduled on "appropriate" nodes.
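
To make that concrete, here is a minimal sketch of a Pod pinned to one node with a required node affinity rule. The Pod name, the node name "node-1" and the image are placeholders I made up; `kubernetes.io/hostname` is the standard label that nodes normally carry, but check what `kubectl get nodes --show-labels` reports in your cluster.

```yaml
# Sketch only: "app-instance-1", "node-1" and "example/app:1.0" are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: app-instance-1
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the Pod stays Pending if no node matches.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - node-1
  containers:
    - name: app
      image: example/app:1.0
```

For a plain "this label must equal this value" match, the shorter `nodeSelector` field (for example `kubernetes.io/hostname: node-1` under `spec.nodeSelector`) should behave the same way.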

Static Pods may be something close to what you are looking for. I've never used static pods/bare pods on Kubernetes...they kind of don't (to quote something from the question) "...exploit the full potential of Kubernetes" ;-)

Otherwise, here is what I think will work with out-of-the-box constructs for the four requirements:

Use Deployments like you have (one per instance) - this will give you requirements #1 and #2. I don't believe requirement #2 (nor #1, actually) can be satisfied with a StatefulSet, nor with a ReplicaSet.
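
As a rough sketch of what one such per-instance Deployment could look like: it has exactly one replica, is pinned to a node, and mounts one claim. All names here (the Deployment, the `instance` label, "node-1", the image, the mount path and the claim "pvc-instance-1") are assumptions for illustration, and the claim is assumed to exist already.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-instance-1
spec:
  replicas: 1                 # one Pod per Deployment
  selector:
    matchLabels:
      instance: "1"
  template:
    metadata:
      labels:
        instance: "1"
    spec:
      nodeSelector:
        kubernetes.io/hostname: node-1   # requirement #1: a specific node
      containers:
        - name: app
          image: example/app:1.0
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: pvc-instance-1    # claim assumed to exist already
```

Scaling up then means creating one more such Deployment, and scaling down means deleting the Deployment for the instance you want gone (for example `kubectl delete deployment app-instance-1`), which removes exactly that Pod.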

Use statically provisioned PVs and selector(s) to (quote) "...tie a specific PersistentVolume to a specific PersistentVolumeClaim" for requirement #3.
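
A minimal sketch of that pairing, assuming a label `instance: "1"` that I invented for the example; the names, the size and the `hostPath` backend are placeholders too, so swap in whatever volume type you actually provisioned.

```yaml
# Sketch only: names, label, size and hostPath are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-instance-1
  labels:
    instance: "1"             # the label the claim selects on
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""        # no storage class
  hostPath:
    path: /data/instance-1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-instance-1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""        # empty string: no dynamic provisioning
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      instance: "1"           # only PVs carrying this label are candidates
```

The claim can only bind to a PV that matches the selector and also satisfies the requested size and access mode. Setting `spec.volumeName` on the claim to the PV's name is another way to point a claim at one particular volume.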

Then requirement #4 will be possible - just make sure the PVs use the proper reclaim policy.
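
For example, a sketch of a PV with the reclaim policy spelled out (the name, size and `hostPath` are placeholders). `Retain` is what manually created PVs get by default, but stating it makes the intent explicit.

```yaml
# Sketch only: name, size and hostPath are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-instance-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  # Keep the volume and its data when the bound claim is deleted.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/instance-1
```

One caveat: with `Retain`, deleting the claim leaves the PV in the `Released` phase, and it will not bind to a new claim until someone clears its `spec.claimRef` (or deletes and recreates the PV object). If you delete only the Pod and keep the claim, the PV stays bound and a new Pod can simply reference the same claim.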

-- apisim
Source: StackOverflow