Node.js - Child Process & Worker Threads

Node.js is very efficient at handling data-intensive workloads that involve many I/O calls and minimal CPU processing. However, because it runs JavaScript on a single thread, any CPU-intensive task can block that thread for a long time, delaying the responses to other concurrent tasks.

Examples of such CPU-intensive work include complex mathematical calculations, parsing a long JSON or script file, image or video processing, and running batch jobs. Multi-threaded applications have a clear advantage in such scenarios: while one thread is busy with a heavy computation, the others can still proceed with other requests.

Node.js Options for CPU-Intensive Tasks

Originally, the only way to run something in parallel in Node.js was to create a child process, that is, a new Node.js instance. Although it runs independently with its own thread and event loop, creating a new instance is expensive in terms of both time and memory. In fact, each child process is a whole new instance of V8 with its own pid and I/O streams (stdin, stdout, stderr).

The only other way to handle an expensive CPU-bound operation within Node.js was to carefully break the task into smaller parts and run them across multiple iterations of the event loop. This is, of course, neither a generic nor a straightforward approach.
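The chunking approach described above can be sketched roughly as follows. This is only an illustration under simple assumptions: sumInChunks and chunkSize are hypothetical names, and setImmediate() is used to hand control back to the event loop between slices.

```javascript
// Sum a large array in slices, yielding to the event loop between slices
// so that other callbacks get a chance to run in between.
// sumInChunks and chunkSize are illustrative names, not a Node.js API.
function sumInChunks(numbers, chunkSize = 10000) {
  return new Promise((resolve) => {
    let index = 0;
    let total = 0;

    function processChunk() {
      const end = Math.min(index + chunkSize, numbers.length);
      for (; index < end; index++) {
        total += numbers[index];
      }
      if (index < numbers.length) {
        // Yield: the next slice runs on a later event-loop iteration.
        setImmediate(processChunk);
      } else {
        resolve(total);
      }
    }

    processChunk();
  });
}

const data = Array.from({ length: 100000 }, (_, i) => i + 1);
sumInChunks(data).then((total) => console.log(`Total: ${total}`));
```

Even in this small case the task has to be restructured around the slicing, which is why the approach does not generalize easily.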

Hence, in version 10.5.0, Node.js introduced worker threads to make independent, parallel processing much lighter. This is similar to the multi-threaded model: new threads are created within the same Node.js instance. Importantly, each thread has its own event loop, so it can run independently of the parent thread.

The diagram below summarizes the differences between worker threads and child processes, as discussed above.

Parallel processing options in Node.js – Worker Thread vs Child Process

As we will see below, the implementations of the two options are very similar. Child processes are useful for long-running parallel work such as running scripts and batch jobs, whereas the lighter worker threads are useful for smaller computational tasks within a request, for example parsing long JSON files or carrying out complex calculations.

However, since both are expensive to create, we should plan to reuse them through a pool when they are needed frequently.

Below is sample code for a child process and a worker thread, showing how each runs independently of the main thread.
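To make the idea of a pool more concrete, here is a minimal sketch for worker threads. WorkerPool, run(), dispatch() and close() are hypothetical names rather than a Node.js API, and the worker body is passed inline with the eval option only to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Inline worker body (eval: true) so the sketch is self-contained;
// a real project would more likely point to a separate worker file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i;
    parentPort.postMessage(sum);
  });
`;

// WorkerPool is a hypothetical helper, not part of Node.js.
class WorkerPool {
  constructor(size) {
    this.workers = [];
    this.idle = [];
    this.queue = [];
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; the promise resolves with the worker's reply.
  run(input) {
    return new Promise((resolve, reject) => {
      this.queue.push({ input, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers, one task per worker at a time.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { input, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is reused, not recreated
        resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        // A failed worker is not returned to the idle list in this sketch.
        worker.off('message', onMessage);
        reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(input);
    }
  }

  // Terminate all workers so that the process can exit.
  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}

const pool = new WorkerPool(2);
Promise.all([pool.run(10), pool.run(100), pool.run(1000)])
  .then((results) => console.log(results))
  .finally(() => pool.close());
```

Three tasks are served by two workers here, so at least one worker handles more than one task without being recreated. A production pool would also need to replace crashed workers and apply timeouts, which this sketch leaves out.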

Demos on using worker threads and child processes

1. Running Programs using Child Process – fork()

The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. The returned ChildProcess has an additional IPC channel that allows messages to be passed between the parent and the child.

In the following example, child-process.js contains a mock CPU-intensive task with a configurable response time. The parent program, main-process.js, creates the child process using child_process.fork() and passes the input using the send() method.

// main-process.js
const { fork } = require('child_process');

function callChildProcess(){
  const childProcess = fork('child-process.js');

  //Set an on-message handler to receive the outcome
  childProcess.on('message', (result) => {
        console.log(`Outcome : ${result}`);
        childProcess.kill('SIGTERM');
  });
  childProcess.on('close', (code) => {
        console.log(`child process exited with code ${code}`);
  });

  //Call the childProcess - for a cpu intensive task taking 5000ms.
  childProcess.send({"command" : "SLEEP", "responseTime" : 5000});
}

callChildProcess();
setTimeout(() => {
  console.log("\nTest Parent Event-Loop :cpuIntensiveTask in child process does not block this event in parent process!");
}, 1000);

// child-process.js
function cpuIntensiveTask(responseTime){
  console.log("cpuIntensiveTask started......")
  const startTime = Date.now();
  //Busy-wait to simulate cpu-bound work; this blocks the event loop of the current process.
  while(Date.now()-startTime<responseTime){}
  const elapsed = Date.now()-startTime;
  console.log("cpuIntensiveTask completed in :" + elapsed +" ms.")
  return elapsed;
}

process.on('message', (message) => {
  console.log(message);
  if(message.command === 'SLEEP'){
    setTimeout(() => {
      console.log("\nTest Child process Event-Loop :cpuIntensiveTask in child process blocks this event in child thread!");
    }, 1000);

    const result = cpuIntensiveTask(message.responseTime);
    process.send("Completed in :"+ result +" ms.");
  }
});

 

Looking at the output, even though the task in the child process takes 5 seconds, it does not block events in the main process.

The child process runs separately without blocking events in the parent process. However, it does block a similar event scheduled to fire at the same time inside the child process itself.

F:\nodejs\samples\childprocess>node main-process.js
{ command: 'SLEEP', responseTime: 5000 }
cpuIntensiveTask started......

Test Parent Event-Loop :cpuIntensiveTask in child process does not block this event in parent process!
cpuIntensiveTask completed in :5000 ms.
Outcome : Completed in :5000 ms.

Test Child process Event-Loop :cpuIntensiveTask in child process blocks this event in child thread!
child process exited with code null

 

2. Running Programs using Worker Threads

As discussed, whereas fork() in child_process spawns a separate Node.js process, worker_threads creates a new thread with its own event loop inside the existing Node.js process. Worker threads are therefore independent, but lighter than child processes.

The structure of the interaction is very similar to that of the child process. The parent and the worker thread both send messages using postMessage() and receive them using an on-message event handler, as shown:
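One way to see that a worker lives inside the same process is to compare process.pid and threadId on both sides. The sketch below assumes an inline worker body (passed with the eval option) purely to stay self-contained; describeWorker is a hypothetical helper name.

```javascript
const { Worker, threadId } = require('worker_threads');

// Inline worker body: report the process id and thread id seen by the worker.
const workerSource = `
  const { parentPort, threadId } = require('worker_threads');
  parentPort.postMessage({ pid: process.pid, threadId });
`;

// describeWorker is an illustrative helper, not a Node.js API.
function describeWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

describeWorker().then((info) => {
  console.log(`main   pid=${process.pid} threadId=${threadId}`);
  console.log(`worker pid=${info.pid} threadId=${info.threadId}`);
  console.log(`same process: ${info.pid === process.pid}`);
});
```

The pid reported by the worker matches that of the main thread, while the threadId differs. A process created with fork() would instead report its own pid.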

// 20-parent-thread.js
const { Worker } = require('worker_threads');

function callWorkerThread(){
  const worker = new Worker('./worker-thread.js');

  //Set worker thread event handlers
  worker.on('message', (result) => {
        console.log(`Outcome in Parent Thread : ${result}`);

        //Delay terminating the worker so that the timer event set inside it can fire.
        setTimeout(()=>{worker.terminate();},500);
  });
  
  worker.on('exit', (code) => {
        console.log(`worker exited with code ${code}`);
  });

  //Post message to the worker thread.
  worker.postMessage({"command" : "SLEEP", "responseTime" : 5000});
}

callWorkerThread();
setTimeout(() => {
  console.log("\nTest Parent Event-Loop :cpuIntensiveTask in child thread does not block this event in parent thread!");
}, 1000);

// worker-thread.js
const { parentPort } = require('worker_threads');

function cpuIntensiveTask(responseTime){
  console.log("cpuIntensiveTask started......")
  const startTime = Date.now();
  //Busy-wait to simulate cpu-bound work; this blocks the event loop of the worker thread.
  while(Date.now()-startTime<responseTime){}
  const elapsed = Date.now()-startTime;
  console.log("cpuIntensiveTask completed in :" + elapsed +" ms.")
  return elapsed;
}


parentPort.on('message', (message) => {
  console.log(message);
  if(message.command === 'SLEEP'){
    setTimeout(() => {
      console.log("\nTest Child Event-Loop :cpuIntensiveTask in child thread blocks this event in child thread!");
    }, 1000);
    const result = cpuIntensiveTask(message.responseTime);
    parentPort.postMessage("Completed in :"+ result +" ms.");
  }
});

 

Below is the output when we run the parent script. As in the child process test above, the two threads and their event loops are independent. However, line 6 of the output shows that the console output from the worker thread was delayed. This is because the worker thread does not have its own I/O streams: its console output goes through the stdout stream of the Node.js process, which it shares with the main thread.

F:\nodejs\samples\childprocess>node 20-parent-thread.js
{ command: 'SLEEP', responseTime: 5000 }

Test Parent Event-Loop :cpuIntensiveTask in child thread does not block this event in parent thread!
Outcome in Parent Thread : Completed in :5000 ms.
cpuIntensiveTask started......
cpuIntensiveTask completed in :5000 ms.

Test Child Event-Loop :cpuIntensiveTask in child thread blocks this event in child thread!
worker exited with code 1

 

3. Running Scripts using Child Process – exec()

child_process.exec(), which builds on child_process.spawn(), spawns a new shell and executes the specified command in it. This is useful for automating scripts from Node.js programs.

The sample programs show the callback and the promise/async versions running a simple command. We can replace it with other OS-specific commands or with the scripts we want to automate.

The stdout and stderr arguments passed to the callback contain the stdout and stderr output of the child process.

const { exec } = require('child_process');

let child = exec('node -v', (error, stdout, stderr) => {
  if (error) {
    console.error(`exec error: ${error}`);
    return;
  }
  console.log(`stdout: ${stdout}`);
  console.error(`stderr: ${stderr}`);
});

child.on('exit', (code) => {
  console.log(`Child exited with code ${code}`);
});


//Output
F:\nodejs\samples\childprocess>node exec-command.js
Child exited with code 0
stdout: v12.18.3

stderr:

//Promise version using util.promisify()
const util = require('util');
const exec = util.promisify(require('child_process').exec);

async function promiseExec() {
  const { stdout, stderr } = await exec('node --version');
  console.log('stdout:', stdout);
  console.error('stderr:', stderr);
}
promiseExec();
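The promise version above does not handle failures. When the command exits with a non-zero code, the promisified exec() rejects, and the error object carries the exit code along with the captured stdout and stderr. A minimal sketch of handling this is shown below; runCommand is a hypothetical wrapper name, and the example assumes that node is available on the PATH, as in the samples above.

```javascript
const util = require('util');
const exec = util.promisify(require('child_process').exec);

// runCommand is an illustrative wrapper that always resolves,
// reporting the exit code instead of throwing.
async function runCommand(command) {
  try {
    const { stdout, stderr } = await exec(command);
    return { code: 0, stdout, stderr };
  } catch (error) {
    // On failure, the rejected error carries code, stdout and stderr.
    return { code: error.code, stdout: error.stdout, stderr: error.stderr };
  }
}

runCommand('node -e "process.exit(3)"').then((result) => {
  console.log(`Command exited with code ${result.code}`);
});
```

Returning the exit code instead of throwing keeps the calling code simple when a non-zero exit is an expected outcome of the script being automated.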