Multithread request handling in node.js (deployed in Kubernetes and behind Nginx)

10/11/2020

I have a single-threaded, web-based, CPU-intensive workload implemented as a Node.js (Express) server and deployed on Kubernetes with no CPU requests/limits (best effort). This workload takes, on average, ~700-800 ms to execute on a quad-core physical machine. The server sits behind an Nginx load balancer (all with default configuration). In short, the workload is as follows:

for (let $i = 0; $i < 100; $i++) {
    const prime_length = 100;
    const diffHell = crypto.createDiffieHellman(prime_length);
    const key = diffHell.generateKeys('base64');
    checksum(key);
}

I have a handler in my Express app.js that logs a timestamp to the console when it receives a request or sends a response, as follows:

app.use((req, res, next) => {
    const start = process.hrtime();
    console.log(`Received ${req.method} ${req.originalUrl} from ${req.headers['referer']} at {${now()}} [RECEIVED]`)

    res.on('close', () => {
        const durationInMilliseconds = helper.getDurationInMilliseconds(start);
        console.log(`Closed received ${req.method} ${req.originalUrl} from ${req.headers['referer']} {${now()}} [CLOSED] ${durationInMilliseconds.toLocaleString()} ms`)
    });
    next();
})

I'm sending 3 parallel requests to this service from 3 different physical machines at the same time. All these servers, plus all Kubernetes nodes, have NTP enabled and their local clocks are synchronized.

To generate the traffic, I SSH into these 3 servers in separate screen sessions (using Linux's screen), prepare the curl command on each command line, and then send an Enter to all of them with the following command so the requests fire at the same time:

screen -S 12818.3 -X stuff "
" & screen -S 12783.2 -X stuff "
" & screen -S 12713.1 -X stuff "
"

From the logs, I can see that all 3 requests are sent at the same time: at 17:26:37.888

Interestingly, the server receives each request right after finishing the previous one:

  • Request 1 is received at 17:26:37.922040382 and takes 740.128ms to process
  • Request 2 is received at 17:26:38.663390107 and takes 724.524ms to process
  • Request 3 is received at 17:26:39.388508923 and takes 695.894ms to process

Here are the logs generated in the container (extracted using kubectl logs -l name=s1 --tail=9999999999 --timestamps):

2020-10-11T17:26:37.922040382Z Received GET /s1/cpu/100 from undefined at {1602429997921393902} [RECEIVED]
2020-10-11T17:26:38.662193765Z Closed received GET /s1/cpu/100 from undefined {1602429998661523611} [CLOSED] 740.128 ms
2020-10-11T17:26:38.663390107Z Received GET /s1/cpu/100 from undefined at {1602429998662810195} [RECEIVED]
2020-10-11T17:26:39.387987847Z Closed received GET /s1/cpu/100 from undefined {1602429999387339320} [CLOSED] 724.524 ms
2020-10-11T17:26:39.388508923Z Received GET /s1/cpu/100 from undefined at {1602429999387912718} [RECEIVED]
2020-10-11T17:26:40.084479697Z Closed received GET /s1/cpu/100 from undefined {1602430000083806321} [CLOSED] 695.894 ms

I checked the CPU usage using both htop and pidstat, and, strangely, only 1 core is utilized the whole time...

I was expecting that the node.js server receives all requests at the same time and handles them in different threads (by generating new threads), but it seems it's not the case. How can I make node.js handle requests in parallel, and utilize all the cores it has?

Here is my full code:

var express = require('express');
const crypto = require('crypto');
const now = require('nano-time');

var app = express();
function checksum(str, algorithm, encoding) {
  return crypto
      .createHash(algorithm || 'md5')
      .update(str, 'utf8')
      .digest(encoding || 'hex');
}

function getDurationInMilliseconds (start) {
  const NS_PER_SEC = 1e9;
  const NS_TO_MS = 1e6;
  const diff = process.hrtime(start);

  return (diff[0] * NS_PER_SEC + diff[1]) / NS_TO_MS;
}

app.use((req, res, next) => {
  const start = process.hrtime();
  console.log(`Received ${req.method} ${req.originalUrl} from ${req.headers['referer']} at {${now()}} [RECEIVED]`)

  res.on('close', () => {
    const durationInMilliseconds = getDurationInMilliseconds(start);
    console.log(`Closed received ${req.method} ${req.originalUrl} from ${req.headers['referer']} {${now()}} [CLOSED] ${durationInMilliseconds.toLocaleString()} ms`)
  });
  next();
})

app.all('*/cpu', (req, res) => {
  for (let $i = 0; $i < 100; $i++) {
    const prime_length = 100;
    const diffHell = crypto.createDiffieHellman(prime_length);
    const key = diffHell.generateKeys('base64');
    checksum(key);
  }
  res.send("Executed 100 Diffie-Hellman checksums in 1 thread(s)!");
});

module.exports = app;

app.listen(9121)
-- Michel Gokan Khan
express
kubernetes
multithreading
nginx
node.js

2 Answers

10/11/2020

I was expecting that the node.js server receives all requests at the same time and handles them in different threads (by generating new threads), but it seems it's not the case. How can I make node.js handle requests in parallel, and utilize all the cores it has?

By design, node.js runs your Javascript in a single thread. It may use other threads for certain things, such as built-in crypto operations with an asynchronous calling interface or disk I/O, but anything you've written in pure Javascript runs in just one thread. For CPU-intensive Javascript code, you will need to change the design of your code specifically so that it can use multiple CPUs for that CPU-intensive work.
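You can see this blocking behavior directly with a minimal sketch (not from the post): a 0 ms timer cannot fire while synchronous CPU-bound work holds the event loop, just as the three requests above could not be accepted while one was being processed.

```javascript
// Synchronous CPU-bound work blocks the event loop, so even a 0 ms timer
// cannot fire until the loop is free again.
const begin = Date.now();
let timerDelay = null;

setTimeout(() => {
  timerDelay = Date.now() - begin; // runs only once the event loop is free
}, 0);

// Simulate ~200 ms of CPU-bound work on the main thread (busy-wait).
while (Date.now() - begin < 200) { /* spinning */ }

// At this point the callback still has not run: the event loop never got a turn.
console.log('after busy loop, timerDelay =', timerDelay);
```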

Your options are to use child processes or Worker Threads. Probably what you want to do is set up a pool of workers (probably one per CPU core) and then create a queue for jobs that should be processed by a worker. Then, as a job is inserted into the queue, you check whether there is an available worker and, if so, immediately send the job off to it. If not, you wait until a worker notifies you that it is finished and available for the next job.

In node.js, Worker Threads are entirely separate instances of the V8 Javascript engine, and you communicate between worker threads and your main process by passing messages.

-- jfriend00
Source: StackOverflow

10/11/2020
-- Michel Gokan Khan
Source: StackOverflow