What is the purpose of a kubernetes deployment pod selector?

5/12/2018

I fail to see why kubernetes need a pod selector in a deployment statement that can only contain one pod template? Feel free to educate me why kubernetes engineers introduced a selector statement inside a deployment definition instead of automatically select the pod from the template?

---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service

spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment

spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0

  selector:
    matchLabels:
      app: grpc-test

  template:
    metadata:
      labels:
        app: grpc-test

    spec:
      containers:
      ...

Why not simply define something like this?

---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service

spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment

spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0

  template:
    metadata:
      labels:
        app: grpc-test

    spec:
      containers:
      ...
-- Gert Cuykens
kubernetes

3 Answers

5/12/2018

Ah! Funny enough, I have once tried wrapping my head around the concept of label selectors as well before. So, here it goes...

First of all, what the hell are these labels used for? Labels within kubernetes are the core means of identifying objects. A controller controls pods based on their label instead of their name. In this particular case they are for example meant to identify the pods belonging to the deployment’s replica set.

You actually didn’t have to implicitly define .spec.selector when using the v1beta1 extensions. It would in that case default from .spec.template.labels. However, if you don’t, you can run into problems with kubectl apply once one or more of the labels that are used for selecting change because kubeclt apply will look at kubectl.kubernetes.io/last-applied-configuration when comparing changes and that annotation will only contain the user input when he created the resource and none of the defaulted fields. You’ll get an error because it cannot calculate the diff like:

spec.template.metadata.labels: Invalid value: {"app":"nginx"}: `selector` does not match template `labels`

As you can see, this is a pretty big shortcoming since it means you can not change any of the labels that are being used as a selector label or it would completely break your deployment flow. It was “fixed” in apps/v1beta2 by requiring selectors to be explicitly defined, disallowing mutation on those fields.

So in your example, you actually don’t have to define them! The creation will work and will use your .spec.template.labels by default. But yeah, in the near future when you have to use v1beta2, the field will be mandatory. I hope this kind of answers your question and I didn’t make it any more confusing ;)

-- Toon Lamberigts
Source: StackOverflow

5/12/2018

As far as I know, the selector in the deployment is an optional property.

The template is the only required field of spec.

So, you don't need the use the label selector in the deployment, and in you're example I don't see why you couldn't use the latter part?

-- Tomislav Mikulin
Source: StackOverflow

11/26/2019

However, if you don’t, you can run into problems with kubectl apply once one or more of the labels that are used for selecting change because kubeclt apply will look at kubectl.kubernetes.io/last-applied-configuration when comparing changes and that annotation will only contain the user input when he created the resource and none of the defaulted fields.

Quoting from Toon's answer.

My interpretation is it's not logically necessary at all. It's only due to the limitation of the current implementation of Kubernetes, that it has some weird "behavior" in that the functionality it uses to "compare" two deployments / objects does not take into account "default values".

-- Chris Stryczynski
Source: StackOverflow