Let's take a pretty common setup:
Kubernetes with Istio,
an ingress proxy (Envoy or nginx),
and a service A whose pods each run an Envoy sidecar.
Now let's take an example where an external client wants to communicate with service A. The client's protocol doesn't really matter here (HTTP REST, gRPC, whatever). While at the protocol level we could pick something reasonably efficient between the client and the proxy, how can we make the communication between the proxy and service A's pod-level Envoy sidecars more efficient?
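For concreteness, the knob I've been staring at is Istio's DestinationRule connection pool, which (if I read the docs right) is supposed to govern exactly this hop. A minimal sketch of what I mean; the name, host, and numbers are all made up:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: service-a                    # hypothetical name
spec:
  host: service-a.default.svc.cluster.local   # made-up host
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100          # cap the pool of long-lived TCP connections
      http:
        h2UpgradePolicy: UPGRADE     # upgrade HTTP/1.1 requests to HTTP/2 on this hop
        http2MaxRequests: 1000       # max concurrent streams multiplexed over the pool
        idleTimeout: 300s            # how long idle pooled connections stay open
```

But I can't tell from the docs whether this actually keeps the ingress-to-sidecar connections alive across requests or only shapes limits on them, which is basically my question below.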
The general paradigm for a req/rep model is to open and close a socket per request. Each request pays a TCP handshake plus an HTTP connection setup/teardown, which, done at scale, causes a non-trivial CPU spike for something that should be mundane. Internally, between services, we can use something like gRPC, which runs over HTTP/2 and keeps that layer of the connection open, reducing the CPU cost of req/rep.

In the scenario I'm describing, the flow is client -> router/proxy -> front-facing service. The router/proxy terminates TLS and handles the encrypt/decrypt before round-robining to a pod in service A (via its Envoy sidecar). Since the proxy terminates the connection, how do we make the connection between the ingress router/proxy and service A's pods more scalable? Is there some mode where it can function like gRPC?

Maybe I read the Istio and Envoy documentation wrong, but if we were to use something like gRPC-Web, that would still terminate at the proxy level, and the handoff from the ingress proxy to service A's Envoy sidecar would fall back to a more traditional req/rep-per-request model (set up and tear down a connection per request)? Did I miss something, or is there a solution to this problem?
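For reference, the closest raw-Envoy setting I found is the cluster-level http2_protocol_options (newer Envoy versions seem to have moved this under typed_extension_protocol_options, if I'm reading the docs right). A rough sketch, with made-up names and addresses:

```yaml
clusters:
- name: service_a                    # hypothetical cluster for service A's sidecars
  connect_timeout: 1s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  http2_protocol_options: {}         # speak HTTP/2 to the upstream, so requests become
                                     # streams multiplexed over persistent connections
  load_assignment:
    cluster_name: service_a
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: service-a.default.svc.cluster.local   # made-up address
              port_value: 8080
```

If the ingress proxy's upstream cluster looks like that, my understanding is that requests get multiplexed as streams over a small pool of long-lived connections instead of paying a handshake per request. But I'd love confirmation that Istio's gateway-to-sidecar path actually behaves this way, and that terminating gRPC-Web at the proxy doesn't break it.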