I have deployed a TF model on both TF-Serving and NVIDIA TensorRT Inference Server, each exposing a gRPC endpoint on GKE. Both are running fine.
I want to load-test both deployments through their gRPC endpoints. Are there any load-testing options for gRPC, similar to what JMeter and Locust offer for HTTP?
I also found a tool called 'ghz', but it needs a .proto file describing the payload. How can I write a .proto file for requests that carry numpy images?
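In case it helps, here is a rough sketch of the direction I am considering: since ghz can take the call data as JSON (its `-d` flag), maybe I can point it at TF-Serving's existing `predict.proto` and render the numpy image as the JSON form of a `PredictRequest`, instead of writing a new .proto from scratch. The model name "mymodel" and the input tensor name "images" below are placeholders for my actual SavedModel signature:

```python
import json
import numpy as np

def predict_request_json(image: np.ndarray,
                         model: str = "mymodel",
                         signature: str = "serving_default") -> str:
    """Render a float image as the JSON body of a TF-Serving PredictRequest.

    The field names mirror tensorflow_serving/apis/predict.proto:
    model_spec, inputs (a map of tensor name -> TensorProto).
    """
    image = image.astype(np.float32)
    body = {
        "model_spec": {"name": model, "signature_name": signature},
        "inputs": {
            # "images" is a placeholder; it must match the SavedModel's input name.
            "images": {
                "dtype": "DT_FLOAT",
                "tensor_shape": {
                    "dim": [{"size": int(s)} for s in image.shape]
                },
                # TensorProto stores float data flattened in float_val.
                "float_val": image.ravel().tolist(),
            }
        },
    }
    return json.dumps(body)

# Example: a dummy 28x28 grayscale image with a batch dimension.
payload = predict_request_json(np.zeros((1, 28, 28, 1)))
```

The idea would then be to pass this JSON to ghz as the call data and the TF-Serving protos as the schema. Is this a sensible approach, or is there a more direct way to feed numpy images to ghz?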