Force Google Cloud Run containers to be scheduled on nodes with a GPU

1/21/2020

Is there a way to force services deployed using Google Cloud Run for Anthos (hosted on GKE) to be scheduled to node pools that have a GPU?

I created a Kubernetes cluster by going to Kubernetes Engine -> Create Cluster -> GPU Accelerated Computing. This created a cluster with a gpu-pool-1 node pool, containing nodes with a GPU, and a standard-pool-1 node pool, containing nodes without a GPU.

Is there a way I can deploy Cloud Run containers to nodes having a GPU? Maybe by configuring a custom namespace or something?


Note that there is a similar question from close to a year ago, but I do not think that the accepted answer ("Cloud Run on Kubernetes does not support GPUs") is entirely correct.

-- ostrokach
google-cloud-run
google-kubernetes-engine
knative-serving

2 Answers

1/22/2020

There does seem to be a way to get this to work, at least in a hacky way, as described here.

The Knative Service configuration does seem to accept and respect a nvidia.com/gpu: 1 entry under resources.limits. While the Cloud Run interface does not allow us to set this field ourselves, we can use the kubectl CLI to manually deploy Knative services defined by a YAML file that contains it.

First, we need to create a GKE cluster with a CPU node pool, a GPU node pool, and Cloud Run for Anthos enabled. This can be accomplished by going to Kubernetes Engine -> Create Cluster -> selecting "GPU Accelerated Computing" in the cluster templates bar on the left -> checking the "Enable Cloud Run for Anthos" checkbox. Once the cluster is created, we can click the "Connect" button and start a Cloud Shell session. There, we can create a service.yaml file which defines our Knative service; for example, we can adapt the service.yaml file from the Knative documentation, but specify that this service requires a GPU.
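As an aside, it should also be possible to create a similar cluster from the command line instead of the console. The commands below are only a sketch: the cluster name, zone, machine type, accelerator type, and node counts are placeholders to adjust, and a manually created GPU node pool may additionally need NVIDIA's driver-installer DaemonSet, which the console template appears to handle for you.

gcloud container clusters create cloud-run-gpu \
    --zone us-central1-a \
    --machine-type n1-standard-4 \
    --num-nodes 1 \
    --enable-stackdriver-kubernetes \
    --addons HttpLoadBalancing,CloudRun

gcloud container node-pools create gpu-pool-1 \
    --cluster cloud-run-gpu \
    --zone us-central1-a \
    --machine-type n1-standard-4 \
    --num-nodes 1 \
    --accelerator type=nvidia-tesla-k80,count=1

Either way, the adapted service.yaml could look like this: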

# service.yaml
apiVersion: serving.knative.dev/v1 # Current version of Knative
kind: Service
metadata:
  name: helloworld-go # The name of the app
  namespace: default # The namespace the app will use
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go # The URL to the image of the app
          env:
            - name: TARGET # The environment variable printed out by the sample app
              value: "Go Sample v1"
          resources:
            limits:
              nvidia.com/gpu: 1 # The service must be run on a machine with at least one GPU

We can deploy this service using:

kubectl apply -f service.yaml

and check its status using:

kubectl get ksvc helloworld-go

The helloworld-go service should only be scheduled on nodes containing a GPU. The service should show up on the Cloud Run dashboard just like the other Cloud Run for Anthos services.
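To double-check that the pods really landed on a GPU node, we can look at which node each pod was scheduled to and confirm that that node advertises the nvidia.com/gpu resource. A quick sketch (the node name is whatever kubectl reports for your cluster):

# Show which node each helloworld-go pod ended up on
kubectl get pods -o wide | grep helloworld-go

# Confirm the node exposes the GPU as an allocatable resource
kubectl describe node <gpu-node-name> | grep nvidia.com/gpu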

-- ostrokach
Source: StackOverflow

1/22/2020

This is a hot topic in Knative Serving development.

For now, it is not possible to set a node selector or tolerations on the pods spawned by Knative Serving, but the team is working on a solution.
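For context, on a plain Kubernetes Deployment those fields would look roughly like the snippet below; the cloud.google.com/gke-accelerator label value and the nvidia.com/gpu taint key are assumptions based on how GKE typically labels and taints GPU node pools, and the Knative Service pod template does not accept these fields yet:

    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-k80   # hypothetical GKE GPU node label
      tolerations:
        - key: nvidia.com/gpu                                # assumed GKE GPU node taint
          operator: Exists
          effect: NoSchedule

In practice, the workaround in the other answer sidesteps this limitation: requesting nvidia.com/gpu as a resource limit is enough for the scheduler to place the pod on a GPU node, and GKE appears to add the matching toleration automatically.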

-- guillaume blaquiere
Source: StackOverflow