Cloud Run on GKE - Restart issue

5/30/2019

I'm trying to implement long-running tasks that are scheduled on a GKE cluster and executed by Cloud Run on GKE. The problem I'm facing is that, partway through a request, my container re-runs the handler code on the same pod while the original invocation is still in progress. This is shown below with code and sample output.

POST request used to kick off the Cloud Run service from the command line:

curl --header "Content-Type: application/json" --request POST --data '{"[requiredData]":"[data]"}' -v -H "Host: testcrlaa.default.example.com" [istio-ingressgateway/80/tasks]
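
The square-bracketed target is the external address of the Istio ingress gateway. For completeness, a typical way to look it up (a sketch, assuming Istio was installed into the default istio-system namespace, as in a standard Cloud Run on GKE setup):

# Look up the external IP of the Istio ingress gateway used as the curl target above.
# Assumes the gateway Service is named istio-ingressgateway in the istio-system namespace.
kubectl get svc istio-ingressgateway --namespace istio-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'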

Code run on Cloud Run for testing purposes:

from flask import Flask, request, abort
import datetime
import time

app = Flask(__name__)

@app.route('/tasks', methods=['POST'])
def create_task():
    # Reject anything that is not JSON or is missing the required field
    if not request.json or '[requiredData]' not in request.json:
        abort(400)
    # Print a timestamp, then sleep for 2 minutes, five times over (10 minutes total)
    print("Slept1 @ " + str(datetime.datetime.now()))
    time.sleep(120)
    print("Slept2 @ " + str(datetime.datetime.now()))
    time.sleep(120)
    print("Slept3 @ " + str(datetime.datetime.now()))
    time.sleep(120)
    print("Slept4 @ " + str(datetime.datetime.now()))
    time.sleep(120)
    print("Slept5 @ " + str(datetime.datetime.now()))
    time.sleep(120)
    print("The time now is: {}".format(datetime.datetime.now()))
    # Return a response so Flask does not raise "view did not return a valid response"
    return 'OK', 200

Output:

Slept1 @ 2019-05-29 14:40:47.781966
Slept2 @ 2019-05-29 14:42:47.880512
Slept3 @ 2019-05-29 14:44:47.976342
Slept1 @ 2019-05-29 14:45:47.793551 <-- 1
Slept4 @ 2019-05-29 14:46:48.069593
Slept2 @ 2019-05-29 14:47:47.892927 <-- 2
Slept5 @ 2019-05-29 14:48:48.169817
Slept3 @ 2019-05-29 14:49:47.987202 <-- 3
...

Five minutes in, the handler was re-run from the beginning (the lines marked <-- 1, 2, 3) while the original invocation was still running.
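
For reference, the output above can be reproduced by tailing the user container's logs, for example (a sketch, using the revision pod name from the queue-proxy log below and Knative's default application container name, user-container):

# Follow the application's stdout in the revision pod.
# Pod name is taken from the SERVING_POD entry in the queue-proxy log below;
# "user-container" is the name Knative Serving gives the application container.
kubectl logs -f cr-laa-x68qg-deployment-f969878f9-mxc9v -c user-container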

Logs from queue-proxy/istio-proxy:

{"level":"info","ts":"2019-06-14T09:06:45.814Z","caller":"logging/config.go:96","msg":"Successfully created the logger.","knative.dev/jsonconfig":"{\n  \"level\": \"info\",\n  \"development\": false,\n  \"outputPaths\": [\"stdout\"],\n  \"errorOutputPaths\": [\"stderr\"],\n  \"encoding\": \"json\",\n  \"encoderConfig\": {\n    \"timeKey\": \"ts\",\n    \"levelKe
y\": \"level\",\n    \"nameKey\": \"logger\",\n    \"callerKey\": \"caller\",\n    \"messageKey\": \"msg\",\n    \"stacktraceKey\": \"stacktrace\",\n    \"lineEnding\": \"\",\n    \"levelEncoder\": \"\",\n    \"timeEncoder\": \"iso8601\",\n    \"durationEncoder\": \"\",\n    \"callerEncoder\": \"\"\n  }\n}"}
{"level":"info","ts":"2019-06-14T09:06:45.814Z","caller":"logging/config.go:97","msg":"Logging level set to info"}
{"level":"warn","ts":"2019-06-14T09:06:45.815Z","caller":"logging/config.go:65","msg":"Fetch GitHub commit ID from kodata failed: \"ref: refs/heads/release-0.5\" is not a valid GitHub commit ID"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_CONFIGURATION=cr-laa"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_NAMESPACE=default"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_REVISION=cr-laa-x68qg"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_AUTOSCALER=autoscaler"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_POD_IP=10.32.1.58"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_POD=cr-laa-x68qg-deployment-f969878f9-mxc9v"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SYSTEM_NAMESPACE=knative-serving"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"SERVING_AUTOSCALER_PORT=8080"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:43","msg":"SERVING_AUTOSCALER_PORT=8080"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"CONTAINER_CONCURRENCY=1"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:43","msg":"CONTAINER_CONCURRENCY=1"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"REVISION_TIMEOUT_SECONDS=300"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:43","msg":"REVISION_TIMEOUT_SECONDS=300"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:33","msg":"USER_PORT=8080"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"util/env.go:43","msg":"USER_PORT=8080"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"queue/main.go:260","msg":"Queue container is starting with queue.BreakerParams{QueueDepth:10, MaxConcurrency:1, InitialCapacity:1}","knative.dev/key":"default/cr-laa-x68qg","knative.dev/pod":"cr-laa-x68qg-deployment-f969878f9-mxc9v"}
{"level":"info","ts":"2019-06-14T09:06:45.815Z","logger":"queueproxy","caller":"queue/main.go:358","msg":"SERVING_REQUEST_METRICS_BACKEND=stackdriver","knative.dev/key":"default/cr-laa-x68qg","knative.dev/pod":"cr-laa-x68qg-deployment-f969878f9-mxc9v"}
{"level":"info","ts":"2019-06-14T09:06:45.824Z","logger":"queueproxy","caller":"metrics/stackdriver_exporter.go:72","msg":"Created Opencensus Stackdriver exporter with config &{knative.dev/serving revision stackdriver 60000000000 0  false true knative.dev/serving/revision custom.googleapis.com/knative.dev/revision}","knative.dev/key":"default/cr-laa-x68qg","knat
ive.dev/pod":"cr-laa-x68qg-deployment-f969878f9-mxc9v"}
{"level":"info","ts":"2019-06-14T09:06:45.824Z","logger":"queueproxy","caller":"metrics/config.go:236","msg":"Successfully updated the metrics exporter; old config: <nil>; new config &{knative.dev/serving revision stackdriver 60000000000 0  false true knative.dev/serving/revision custom.googleapis.com/knative.dev/revision}","knative.dev/key":"default/cr-laa-x68q
g","knative.dev/pod":"cr-laa-x68qg-deployment-f969878f9-mxc9v"}
{"level":"info","ts":"2019-06-14T09:06:46.448Z","logger":"queueproxy","caller":"queue/main.go:155","msg":"User-container successfully probed.","knative.dev/key":"default/cr-laa-x68qg","knative.dev/pod":"cr-laa-x68qg-deployment-f969878f9-mxc9v"}
2019/06/14 09:12:41 http: proxy error: context canceled
{"httpRequest": {"requestMethod": "POST", "requestUrl": "/tasks", "requestSize": "26", "status": 503, "responseSize": "15", "userAgent": "curl/7.54.0", "remoteIp": "10.32.1.14:50004", "serverIp": "10.32.1.58", "referer": "", "latency": "300.000387667s", "protocol": "HTTP/1.1"}, "logging.googleapis.com/trace": "[20f44aaedaf762bc]"}
2019/06/14 09:12:42 http: proxy error: context canceled
{"httpRequest": {"requestMethod": "POST", "requestUrl": "/tasks", "requestSize": "26", "status": 502, "responseSize": "0", "userAgent": "curl/7.54.0", "remoteIp": "10.32.1.14:50624", "serverIp": "10.32.1.58", "referer": "", "latency": "249.840220688s", "protocol": "HTTP/1.1"}, "logging.googleapis.com/trace": "[5f04bd98dcab5014]"}

The problem shows up at "2019/06/14 09:12:42 http: proxy error: context canceled". The first request is cut off after exactly 300 seconds (the 300.000387667s latency on the 503 matches REVISION_TIMEOUT_SECONDS=300 in the queue-proxy startup log), which also lines up with the handler being re-run roughly five minutes in.
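
A way to double-check the configured timeout (a sketch, assuming the Knative Service is named cr-laa as suggested by SERVING_CONFIGURATION above; the exact spec field location varies between Knative Serving API versions, and this cluster reports release-0.5, so grepping the whole object is safer than a fixed jsonpath):

# Inspect the revision timeout configured on the Knative Service behind the deployment.
# "cr-laa" and the default namespace are taken from SERVING_CONFIGURATION / SERVING_NAMESPACE above.
kubectl get ksvc cr-laa -n default -o yaml | grep -i timeout

# The cluster-wide default (revision-timeout-seconds) lives in the config-defaults ConfigMap.
kubectl get configmap config-defaults -n knative-serving -o yaml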

-- DuDoff
google-cloud-platform
google-cloud-run
google-kubernetes-engine

0 Answers