GKE ESP Failed to fetch service account token

7/17/2019

I got gRPC service in GKE with 3 pods. Each pod got attached ESP container with rollout_strategy=managed and backend pointing to gRPC.

Sometimes when I hit http endpoint I got

{
 "code": 13,
 "message": "Failed to fetch service account token",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "internal"
  }
 ]
}

Every third request failed. When I deleted every pod, it started to works. I assume it will fail after one hour when the token will be expired.

Deployment ESP container

 containers:
      - args:
        - --http_port=80
        - --backend=grpc://127.0.0.1:6000
        - --service=service.endpoints.cluster.cloud.goog
        - --rollout_strategy=managed
        image: gcr.io/endpoints-release/endpoints-runtime:1
        imagePullPolicy: IfNotPresent
        name: esp
        ports:
        - containerPort: 80
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File

In container logs there are errors like this:

2019/07/17 07:51:05[error]10#10: Failed to call https://servicecontrol.googleapis.com/v1/services/service.endpoints.cluster.cloud.goog:report, Error: UNAUTHORIZED: server response status code: 401, Response body: �Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
[libprotobuf ERROR external/servicecontrol_client_git/src/service_control_client_impl.cc:182] Failed in Report call: Service control request failed with HTTP response code 401

I found the same issue here https://groups.google.com/forum/#!topic/google-cloud-endpoints/0EEY88X8oIY But it's just workaround by providing explicitly service account.

Edit: I tried service account, but there is still problem when pod is trying to refresh token

2019-07-19T11:45:48.031715293Z INFO:Refreshing access_token
 E 
2019-07-19T11:45:48.071914451Z Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 365, in execute
  File ".bootstrap/_pex/pex.py", line 293, in _wrap_coverage
  File ".bootstrap/_pex/pex.py", line 325, in _wrap_profiling
  File ".bootstrap/_pex/pex.py", line 408, in _execute
  File ".bootstrap/_pex/pex.py", line 466, in execute_entry
  File ".bootstrap/_pex/pex.py", line 471, in execute_module
  File "/usr/lib/python2.7/runpy.py", line 192, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "start_esp/start_esp.py", line 1061, in <module>
  File "start_esp/start_esp.py", line 343, in fetch_service_config
  File "start_esp/fetch_service_config.py", line 151, in make_access_token
  File "/root/.pex/install/oauth2client-4.1.3-py2.py3-none-any.whl.289a6efc12c558f2f0ae1841479ec3ee7a783c1b/oauth2client-4.1.3-py2.py3-none-any.whl/oauth2client/client.py", line 663, in get_access_token
 E 
2019-07-19T11:45:48.074420362Z     self.refresh(http)
 E 
2019-07-19T11:45:48.074424955Z   File "/root/.pex/install/oauth2client-4.1.3-py2.py3-none-any.whl.289a6efc12c558f2f0ae1841479ec3ee7a783c1b/oauth2client-4.1.3-py2.py3-none-any.whl/oauth2client/client.py", line 545, in refresh
 E 
2019-07-19T11:45:48.074430139Z     self._refresh(http)
 E 
2019-07-19T11:45:48.074434982Z   File "/root/.pex/install/oauth2client-4.1.3-py2.py3-none-any.whl.289a6efc12c558f2f0ae1841479ec3ee7a783c1b/oauth2client-4.1.3-py2.py3-none-any.whl/oauth2client/client.py", line 749, in _refresh
 E 
2019-07-19T11:45:48.074440214Z     self._do_refresh_request(http)
 E 
2019-07-19T11:45:48.074444687Z   File "/root/.pex/install/oauth2client-4.1.3-py2.py3-none-any.whl.289a6efc12c558f2f0ae1841479ec3ee7a783c1b/oauth2client-4.1.3-py2.py3-none-any.whl/oauth2client/client.py", line 780, in _do_refresh_request
 E 
2019-07-19T11:45:48.074450391Z     body=body, headers=headers)
 E 
2019-07-19T11:45:48.074455280Z   File "/root/.pex/install/oauth2client-4.1.3-py2.py3-none-any.whl.289a6efc12c558f2f0ae1841479ec3ee7a783c1b/oauth2client-4.1.3-py2.py3-none-any.whl/oauth2client/transport.py", line 282, in request
 E 
2019-07-19T11:45:48.074460859Z     connection_type=connection_type)
 E 
2019-07-19T11:45:48.074465988Z   File "/root/.pex/install/httplib2-0.13.0-py2-none-any.whl.f7b2ff9b447b5741fa19aab52032ab1c7db17f35/httplib2-0.13.0-py2-none-any.whl/httplib2/__init__.py", line 2135, in request
 E 
2019-07-19T11:45:48.074482470Z     cachekey,
 E 
2019-07-19T11:45:48.074487590Z   File "/root/.pex/install/httplib2-0.13.0-py2-none-any.whl.f7b2ff9b447b5741fa19aab52032ab1c7db17f35/httplib2-0.13.0-py2-none-any.whl/httplib2/__init__.py", line 1796, in _request
 E 
2019-07-19T11:45:48.074492698Z     conn, request_uri, method, body, headers
 E 
2019-07-19T11:45:48.074497648Z   File "/root/.pex/install/httplib2-0.13.0-py2-none-any.whl.f7b2ff9b447b5741fa19aab52032ab1c7db17f35/httplib2-0.13.0-py2-none-any.whl/httplib2/__init__.py", line 1737, in _conn_request
 E 
2019-07-19T11:45:48.074502516Z     response = conn.getresponse()
 E 
2019-07-19T11:45:48.074506952Z   File "/usr/lib/python2.7/httplib.py", line 1108, in getresponse
 E 
2019-07-19T11:45:48.074511855Z     raise ResponseNotReady()
 E 
2019-07-19T11:45:48.074516567Z httplib.ResponseNotReady
 E 

Afterwards pod is restarted and everything works for 1 hour.

-- Dasio
google-cloud-endpoints
google-kubernetes-engine
grpc

1 Answer

7/17/2019

GCP has a Metadata server for ESP to get the access token. IF it is overloaded, ESP may fail since it needs the access token to talk to Google services. If this problem happens often, your should file a bug for GKE team.

Other users have run into similar situations with heavy traffic. I have suggested them the same workaround as https://groups.google.com/forum/#!topic/google-cloud-endpoints/0EEY88X8oIY

-- Wayne Zhang
Source: StackOverflow