I have configured a cluster on Google Cloud Platform using kops. I used the "TCP LoadBalancer" option to expose my services publicly. Currently, if a pod running in the cluster reaches its maximum number of requests, further requests are redirected to another pod in the same cluster. My question: is it possible to manage or restrict the number of requests handled by a pod, so that I can define a threshold for the requests each pod receives?
This is not possible. GCP does not provide this kind of metric on its load balancers, so you cannot set a per-pod request threshold there. The closest thing to what you want is described in the GCP documentation:
Backend services
Backend services direct incoming traffic to one or more attached backends. Each backend is composed of an instance group and additional serving capacity metadata. Backend serving capacity can be based on CPU or requests per second (RPS).
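To make the idea of RPS-based serving capacity concrete, here is a toy sketch in Python. It is not GCP's actual balancing algorithm and does not call any GCP API; the `Backend` class, the `max_rps` field, and the `assign` function are all hypothetical names invented for illustration. It only shows the general principle: each backend has a capacity threshold, and requests beyond it spill over to the next backend.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    max_rps: int          # hypothetical capacity threshold for this backend
    current_rps: int = 0  # requests assigned in the current one-second window


def assign(backends, incoming):
    """Fill each backend up to max_rps, in order, with `incoming` requests.

    Returns the number of requests that no backend could absorb.
    """
    remaining = incoming
    for backend in backends:
        free = max(backend.max_rps - backend.current_rps, 0)
        taken = min(free, remaining)
        backend.current_rps += taken
        remaining -= taken
        if remaining == 0:
            break
    return remaining


if __name__ == "__main__":
    pool = [Backend("group-a", 100), Backend("group-b", 50)]
    overflow = assign(pool, 130)
    for backend in pool:
        print(backend.name, backend.current_rps)
    print("overflow", overflow)
```

With the example pool, 130 incoming requests fill `group-a` to its limit of 100 and the remaining 30 go to `group-b`. Note that the threshold here applies to a whole backend (an instance group in the quoted documentation), not to an individual pod.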