Kubernetes pods crashing every 10 minutes in a Node app: node::http2::Http2Session::ConsumeHTTP2Data() assertion failure

3/12/2020

I have Kubernetes pods for a single Node app, and each pod is crashing every 10 minutes or so. I'd like to understand why and stabilize the app.

the pods: $ k get po | grep app

app-655fd5fcc9-4mtjr                                 0/1     CrashLoopBackOff   53         7h35m
app-655fd5fcc9-6kf82                                 1/1     Running            106        16h
app-655fd5fcc9-9tfbp                                 1/1     Running            87         16h
app-655fd5fcc9-g8x7q                                 1/1     Running            53         7h35m
app-655fd5fcc9-nvcc8                                 1/1     Running            102        16h

the logs right before crashing: $ k logs -p app-655fd5fcc9-4mtjr

node[25]: ../src/node_http2.cc:893:ssize_t node::http2::Http2Session::ConsumeHTTP2Data(): Assertion `(flags_ & SESSION_STATE_READING_STOPPED) != (0)' failed.
 1: 0x8fa0c0 node::Abort() [node]
 2: 0x8fa195  [node]
 3: 0x959e02 node::http2::Http2Session::ConsumeHTTP2Data() [node]
 4: 0x959f4f node::http2::Http2Session::OnStreamRead(long, uv_buf_t const&) [node]
 5: 0xa2aad1 node::TLSWrap::ClearOut() [node]
 6: 0xa2b343 node::TLSWrap::OnStreamRead(long, uv_buf_t const&) [node]
 7: 0x9cf801  [node]
 8: 0xa7ae09  [node]
 9: 0xa7b430  [node]
10: 0xa80dd8  [node]
11: 0xa6fe6b uv_run [node]
12: 0x904725 node::Start(v8::Isolate*, node::IsolateData*, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::string, std::allocator<std::string> > const&) [node]
13: 0x90297f node::Start(int, char**) [node]
14: 0x7f1a8cbd02e1 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
15: 0x8bbe85  [node]
Aborted (core dumped)
npm ERR! code ELIFECYCLE
npm ERR! errno 134
npm ERR! app@1.0.1 start: `node --harmony ./entry-point.js "--max-old-space-size=7168"`
npm ERR! Exit status 134
npm ERR!
npm ERR! Failed at the app@1.0.1 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2020-03-12T00_45_17_556Z-debug.log

I read through the output of $ k describe pods app-655fd5fcc9-4mtjr, but there didn't seem to be any helpful info at a glance. I think the issue is with the app anyway.

Where do I start debugging and solving this?

  • Run node entry-point.js directly on my machine for a while? It's production code, but sometimes you have to run stuff locally.
  • Is there something else from stderr I might be missing?
  • Is there an easy way to catch this unhandled error and upload or send the entire log from /root/.npm/_logs/2020-03-12T00_45_17_556Z-debug.log? (See the wrapper sketch after this list.)
  • Is each pod running out of memory or bound by CPU? I kept an eye on a pod with $ k exec -it app-655fd5fcc9-6kf82 -- top as it went into the CrashLoopBackOff state, and the resource usage seemed fine.
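
Since the abort comes from a failed C++ assertion in node_http2.cc, the process dies with SIGABRT, so an in-process handler like process.on('uncaughtException') never fires; any log capture has to live in a parent process. A minimal sketch of a wrapper entrypoint, assuming the container's start command can be swapped for it (shipLog is a hypothetical placeholder for whatever uploader you have):

    // wrapper.js -- run the real start script as a child and capture crash logs.
    // The C++ assertion aborts the child with SIGABRT, so the capture must live
    // in a parent process; an in-process 'uncaughtException' handler never fires.
    const { spawn } = require('child_process');
    const fs = require('fs');
    const path = require('path');

    const LOG_DIR = '/root/.npm/_logs'; // where npm writes its debug logs

    const child = spawn('npm', ['start'], { stdio: 'inherit' });

    child.on('exit', (code, signal) => {
      if (code !== 0 || signal) {
        // npm log filenames are timestamped, so a lexicographic sort is chronological.
        const logs = fs.readdirSync(LOG_DIR).filter((f) => f.endsWith('-debug.log')).sort();
        const latest = logs[logs.length - 1];
        if (latest) shipLog(path.join(LOG_DIR, latest));
      }
      process.exit(code === null ? 1 : code);
    });

    function shipLog(file) {
      // Hypothetical: POST the file to a log collector, or copy it to a mounted volume.
      console.error('would upload crash log:', file);
    }

With something like this, the pod's start command becomes node wrapper.js instead of npm start.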

My app isn't using the Node stdlib http2 module directly. It might be pulled in by an npm module, such as one of the @google-cloud packages or one of the HTTP request clients. $ ack http2 --js # no results
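
If ack over your own .js files finds nothing, the require('http2') call is likely buried in node_modules; running ack http2 node_modules may turn it up. Another option is a small preload script that wraps the CommonJS require and logs which file loads http2. A minimal sketch under that assumption (trace-http2.js is a hypothetical filename, and it only catches JS-level requires):

    // trace-http2.js -- preload with: node -r ./trace-http2.js ./entry-point.js
    // Logs the path of every module that requires the built-in 'http2'.
    const Module = require('module');
    const originalRequire = Module.prototype.require;

    Module.prototype.require = function (id) {
      if (id === 'http2') {
        console.error('http2 loaded by:', this.filename || '<unknown>');
      }
      return originalRequire.apply(this, arguments);
    };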

-- James T.
kubernetes
node.js

2 Answers

4/28/2020

Don't know if it helps someone, but I was on Node v10.16.3 and facing a similar issue; after moving to v12.14.1 it stopped popping up.

Not sure what exactly the cause was, but my application runs a loop over a very large array, so I had been manually running the garbage collector after processing a few chunks. The above error was popping up right after my first collection pass.
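
For reference, manually triggering the collector like this is only possible when Node is started with --expose-gc, which defines global.gc. A minimal sketch of the pattern described above, with handleChunk as a hypothetical stand-in for the per-chunk work:

    // Run with: node --expose-gc app.js
    // Process a huge array in chunks, forcing a GC pass between chunks.
    async function processInChunks(items, chunkSize) {
      for (let i = 0; i < items.length; i += chunkSize) {
        await handleChunk(items.slice(i, i + chunkSize)); // hypothetical per-chunk work
        if (global.gc) {
          global.gc(); // only defined when --expose-gc is passed
        }
      }
    }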

-- Black Mamba
Source: StackOverflow

3/25/2020

The issue was with the app after all. We had old legacy code that ran this function through deeply nested callbacks driven by polling. It's been refactored to make the function async and do all the work in parallel with limited throughput, with the controller now just awaiting each call; a sketch follows below.
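
A minimal sketch of the shape of that refactor, with hypothetical names (processItem stands in for the actual unit of work):

    // After: an async function doing the work in parallel with a concurrency cap,
    // replacing deeply nested callbacks driven by polling.
    async function doWork(items, limit) {
      const results = [];
      let next = 0;

      // Worker-pool limiter: `limit` workers pull from a shared index.
      async function worker() {
        while (next < items.length) {
          const i = next++;
          results[i] = await processItem(items[i]); // hypothetical unit of work
        }
      }

      await Promise.all(Array.from({ length: limit }, worker));
      return results;
    }

    // The controller now simply awaits each call instead of polling.
    async function controller(batches) {
      for (const batch of batches) {
        await doWork(batch, 5);
      }
    }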

The pods now crash every 1-3 hours instead of every 10 minutes, so there's probably another issue with the app.

-- James T.
Source: StackOverflow