How to get GRPC's round_robin load balancing to work using grpc-java in a Kubernetes cluster?

1/8/2021

I have been attempting to get GRPC's load balancing working in my Java application deployed to a Kubernetes cluster but I have not been having too much success. There does not seem to be too much documentation around this, but from examples online I can see that I should now be able to use '.defaultLoadBalancingPolicy("round_robin")' when setting up the ManagedChannel (in later versions of GRPC Java lib).

To be more specific, I am using version 1.34.1 of the GRPC Java libraries. I have created two Spring Boot (v2.3.4) applications, one called grpc-sender and one called grpc-receiver.

grpc-sender acts as a GRPC client and defines a (Netty) ManagedChannel as:

@Bean
public ManagedChannel greetingServiceManagedChannel() {
  String host = "grpc-receiver";
  int port = 6565;
  return NettyChannelBuilder.forAddress(host, port)
      .defaultLoadBalancingPolicy("round_robin")
      .usePlaintext().build();
}

Then grpc-receiver acts as the GRPC server:

Server server = ServerBuilder.forPort(6565)
        .addService(new GreetingServiceImpl()).build();

I am deploying these applications to a Kubernetes cluster (running locally in minikube for the time being), and I have created a Service for the grpc-receiver application as a headless service:

kind: Service
apiVersion: v1
metadata:
  name: grpc-receiver
spec:
  clusterIP: None
  selector:
    app: grpc-receiver
  ports:
    - name: 'grpc'
      port: 6565
      protocol: 'TCP'
      targetPort: 6565

However when I attempt to send a message from grpc-sender to grpc-receiver I just see this exception in the grpc-sender logs:

2021-01-08 17:46:24.494 ERROR 1 --- [ault-executor-0] io.grpc.internal.ManagedChannelImpl      : [Channel<1>: (grpc-receiver:6565)] Uncaught exception in the SynchronizationContext. Panic!
java.lang.NoSuchFieldError: NAME_RESOLVER_SERVICE_CONFIG
	at io.grpc.services.HealthCheckingLoadBalancerFactory$HealthCheckingLoadBalancer.handleResolvedAddresses(HealthCheckingLoadBalancerFactory.java:186) ~[grpc-services-1.25.0.jar!/:1.25.0]
	at io.grpc.internal.AutoConfiguredLoadBalancerFactory$AutoConfiguredLoadBalancer.tryHandleResolvedAddresses(AutoConfiguredLoadBalancerFactory.java:154) ~[grpc-core-1.34.1.jar!/:1.34.1]
	at io.grpc.internal.ManagedChannelImpl$NameResolverListener$1NamesResolved.run(ManagedChannelImpl.java:1668) ~[grpc-core-1.34.1.jar!/:1.34.1]
	at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95) ~[grpc-api-1.34.1.jar!/:1.34.1]
	at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127) ~[grpc-api-1.34.1.jar!/:1.34.1]
	at io.grpc.internal.ManagedChannelImpl$NameResolverListener.onResult(ManagedChannelImpl.java:1682) ~[grpc-core-1.34.1.jar!/:1.34.1]
	at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:333) ~[grpc-core-1.34.1.jar!/:1.34.1]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

Could anyone give me any pointers as to what I am doing wrong or what I am missing?

By the way, I am aware of alternative approaches to load balancing with GRPC, such as a service mesh like Linkerd or Istio, or just using an Envoy proxy, but I am keen to get something working using GRPC's out-of-the-box load balancing features as a point of comparison between the different approaches.

Thanks very much!

-- Daniel Western
grpc
grpc-java
java
kubernetes
load-balancing

1 Answer

1/10/2021

I found two issues with your code:

1. In the client, you should use forTarget rather than forAddress, with a target like dns:///grpc-receiver:6565, because a Kubernetes service is routed by DNS. Give this name resolver a try, and make sure your server has multiple instances.
2. The exception NoSuchFieldError: NAME_RESOLVER_SERVICE_CONFIG is caused by mismatched libraries: the grpc-services version is 1.25.0, but the grpc-core version is 1.34.1. Put them on the same version and it should work correctly.
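
To make the version mismatch in point 2 easier to spot, here is a minimal sketch that groups gRPC jars by version based on their file names, the same names that appear in the stack trace. The class GrpcVersionCheck, its methods, and the artifact list are hypothetical helpers written for illustration, not part of grpc-java, and the artifact list is only partial.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrpcVersionCheck {
    // Assumption: a partial list of io.grpc artifact names; extend it for your build.
    private static final Set<String> GRPC_ARTIFACTS = new HashSet<>(Arrays.asList(
            "grpc-api", "grpc-context", "grpc-core", "grpc-netty", "grpc-netty-shaded",
            "grpc-protobuf", "grpc-protobuf-lite", "grpc-services", "grpc-stub"));

    // Matches "<artifact>-<numeric version>.jar", e.g. "grpc-core-1.34.1.jar".
    private static final Pattern JAR_NAME =
            Pattern.compile("(grpc-[a-z-]+?)-(\\d+(?:\\.\\d+)+)\\.jar");

    // Maps each recognised gRPC artifact to the version found in its jar file name.
    static Map<String, String> grpcVersions(List<String> jarNames) {
        Map<String, String> versions = new TreeMap<>();
        for (String jarName : jarNames) {
            Matcher m = JAR_NAME.matcher(jarName);
            if (m.find() && GRPC_ARTIFACTS.contains(m.group(1))) {
                versions.put(m.group(1), m.group(2));
            }
        }
        return versions;
    }

    // True when the recognised artifacts do not all share one version.
    static boolean hasMixedVersions(Map<String, String> versions) {
        return new HashSet<>(versions.values()).size() > 1;
    }

    public static void main(String[] args) {
        // Jar names taken from the stack trace in the question.
        List<String> jars = Arrays.asList(
                "grpc-services-1.25.0.jar", "grpc-core-1.34.1.jar", "grpc-api-1.34.1.jar");
        Map<String, String> versions = grpcVersions(jars);
        System.out.println(versions);
        System.out.println("mixed versions: " + hasMixedVersions(versions));
    }
}
```

You could feed it the jar names from your packaged application's lib directory. Once every gRPC artifact reports the same version, the NoSuchFieldError should no longer have a cause.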
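
For point 1, round_robin can only spread calls across the addresses that name resolution returns, and a headless service (clusterIP: None) should return one address per ready pod. The following is a rough sketch, using only the JDK, for checking from inside the client pod how many addresses the service name resolves to. The class name HeadlessServiceDnsCheck and its resolveAll method are hypothetical, and the default host name is simply the service name from the question.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class HeadlessServiceDnsCheck {
    // Returns every IP address the host name resolves to,
    // or an empty list if the name cannot be resolved.
    static List<String> resolveAll(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unresolvable name: report it as "no addresses".
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Assumption: run inside the cluster, where "grpc-receiver" is resolvable.
        String host = args.length > 0 ? args[0] : "grpc-receiver";
        List<String> addresses = resolveAll(host);
        System.out.println(host + " resolves to " + addresses.size()
                + " address(es): " + addresses);
    }
}
```

If this reports a single address while several grpc-receiver pods are running, the problem is more likely in the Service definition or pod readiness than in the channel configuration.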

-- HelloWood
Source: StackOverflow