Is there a way to force services deployed using Google Cloud Run for Anthos (hosted on GKE) to be scheduled to node pools that have a GPU?
I created a Kubernetes cluster by going to Kubernetes -> Create Cluster -> GPU Accelerated Computing. This created a Kubernetes cluster with a gpu-pool-1 node pool, containing nodes with a GPU, and a standard-pool-1 node pool, containing nodes without a GPU.
Is there a way I can deploy Cloud Run containers to nodes having a GPU? Maybe by configuring a custom namespace or something?
Note that there is a similar question from close to a year ago, but I do not think that the accepted answer ("Cloud Run on Kubernetes does not support GPUs") is entirely correct.
There does seem to be a way to get this to work, at least in a hacky way, as described here.
The knative Service configuration file does seem to accept and respect the limits: nvidia.com/gpu: 1 argument. While the Cloud Run interface does not allow us to specify this argument ourselves, we can use the kubectl CLI to manually deploy knative services defined by a yaml file which contains this argument.
First, we need to create a GKE cluster with a CPU node pool, a GPU node pool, and Cloud Run for Anthos enabled. This can be accomplished by going to Kubernetes Engine -> Create Cluster -> selecting "GPU Accelerated Computing" in the cluster templates bar on the left -> checking the "Enable Cloud Run for Anthos" box. Once the cluster is created, we can click the "connect" button and start a Cloud Shell session. There, we can create a service.yaml file which defines our knative service. For example, we can adapt the service.yaml file from the knative documentation but specify that this service requires a GPU:
# service.yaml
apiVersion: serving.knative.dev/v1 # Current version of Knative
kind: Service
metadata:
  name: helloworld-go # The name of the app
  namespace: default # The namespace the app will use
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go # The URL to the image of the app
          env:
            - name: TARGET # The environment variable printed out by the sample app
              value: "Go Sample v1"
          resources:
            limits:
              nvidia.com/gpu: 1 # The service must be run on a machine with at least one GPU
We can deploy this service using:
kubectl apply -f service.yaml
and check its status using:
kubectl get ksvc helloworld-go
The helloworld-go service should only be scheduled on nodes containing a GPU. The service should show up on the Cloud Run dashboard just like the other Cloud Run for Anthos services.
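To check where the pods actually landed, something like the following sketch may help. It assumes kubectl is already pointed at the cluster (for example from Cloud Shell), that Knative labels the service's pods with serving.knative.dev/service, and that GKE labels nodes with cloud.google.com/gke-nodepool. The function names are made up for this example.

```shell
# Sketch only: the helper names below are hypothetical, not part of any tool.

# Print the node name of every pod that belongs to a Knative service.
# $1 = service name, $2 = namespace (defaults to "default").
service_pod_nodes() {
  kubectl get pods -n "${2:-default}" \
    -l "serving.knative.dev/service=$1" \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}'
}

# Print the names of the nodes in a GKE node pool.
# $1 = node pool name, e.g. gpu-pool-1.
pool_nodes() {
  kubectl get nodes -l "cloud.google.com/gke-nodepool=$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}

# Succeed only if every scheduled pod of the service runs on a node of the pool.
# Fails when no pods are scheduled yet or the pool has no nodes.
# $1 = service name, $2 = node pool name, $3 = namespace (defaults to "default").
service_on_pool() {
  nodes="$(pool_nodes "$2")"
  pods="$(service_pod_nodes "$1" "${3:-default}")"
  [ -n "$nodes" ] && [ -n "$pods" ] || return 1
  for node in $pods; do
    # -x: match the whole line, -F: treat the node name as a fixed string
    printf '%s\n' "$nodes" | grep -qxF -- "$node" || return 1
  done
}
```

For example, `service_on_pool helloworld-go gpu-pool-1 && echo "running on the GPU pool"`. Since Knative can scale an idle service down to zero pods, it may be necessary to send a request to the service first so that there is a pod to inspect.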
This is a hot topic in Knative Serving development.
For now, it is not possible to set a node selector or tolerations on pods spawned by Knative Serving, but the team is working on a solution.