How to enable the RequestedToCapacityRatio feature for a Kubernetes cluster (version 1.18)

10/9/2021

We have a Kubernetes cluster (version 1.18.x) running on Ubuntu 18.04, and we mainly use this cluster to run AI jobs.

We want the cluster to schedule jobs using a bin-packing policy (with NVIDIA GPU resources given the highest weight), and I configured this by following this article. But after making all the changes, pods can no longer be scheduled; they are always stuck in Pending.

Our command to run the scheduler is as below:

/opt/kube/bin/kube-scheduler --address=127.0.0.1 --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig --leader-elect=true --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 --tls-min-version=VersionTLS12 --v=2 --config=/path/to/my_policy_file.yaml 

Command to restart kube-scheduler:

systemctl daemon-reload
systemctl stop kube-scheduler
systemctl start kube-scheduler

My Policy file:

apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
clientConnection:
  kubeconfig: /etc/kubernetes/kube-scheduler.kubeconfig
profiles:
  - schedulerName: kube-scheduler
    plugins:
      score:
        enabled:
        - name: RequestedToCapacityRatio
          weight: 100
    pluginConfig:
    - name: RequestedToCapacityRatio
      args:
        shape:
        - utilization: 0
          score: 0
        - utilization: 100
          score: 10
        resources:
        - name: cpu
          weight: 1
        - name: nvidia.com/gpu
          weight: 100

But after I apply this file to the default scheduler, it can no longer schedule pods. Pods always get stuck in Pending. Here is the YAML file I use to test:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - command: ["/bin/bash", "-c", "sleep 3600"]
    image: ubuntu:18.04
    name: test

So how do I correctly turn on the bin-packing feature? Why can't the job be scheduled?

-- Wallace
bin-packing
kube-scheduler
kubernetes
scheduling

1 Answer

10/10/2021

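I solved this issue: I had typed the wrong scheduler name. The name of the default Kubernetes scheduler is default-scheduler, not kube-scheduler. Pods that do not set spec.schedulerName are assigned to default-scheduler, so when the only profile is named kube-scheduler, no profile picks them up and they stay Pending.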
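
For reference, here is a minimal sketch of the corrected profiles section. It assumes that everything else in the policy file from the question stays unchanged; only schedulerName differs.

```yaml
profiles:
  # The profile name must match the pod's spec.schedulerName,
  # which defaults to default-scheduler when the pod does not set it.
  - schedulerName: default-scheduler
    plugins:
      score:
        enabled:
        - name: RequestedToCapacityRatio
          weight: 100
    pluginConfig:
    - name: RequestedToCapacityRatio
      args:
        # Higher utilization gets a higher score, which favors bin packing.
        shape:
        - utilization: 0
          score: 0
        - utilization: 100
          score: 10
        resources:
        - name: cpu
          weight: 1
        - name: nvidia.com/gpu
          weight: 100
```

Assuming kubectl access to the cluster, running kubectl get pod test -o jsonpath='{.spec.schedulerName}' prints the scheduler name the test pod expects, which should match the schedulerName of the profile.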

-- Wallace
Source: StackOverflow