NodeJS Event loop – The TODO Lists

NodeJS uses the Event-Driven Architecture for processing its requests where the event loop and the worker pool are two key components.

Event Loop – The TODO List

As we saw in the previous article (Single Vs Multi Thread), the event loop acts as a TODO list for the main thread. Such a list becomes essential when the thread works on multiple requests concurrently, dealing with each request in parts.

These events include everything that needs attention from the main thread: incoming requests, callbacks from the worker pool, other system-generated events, callbacks generated by our own program, and so on.

Hence, it’s important to understand how this TODO list is structured and the rules behind picking these events for execution.

This TODO list, in fact, consists of a set of queues, each serving a specific purpose. Since the main thread keeps picking up events from one queue after another in a loop, we call it the event loop.

In this article, we will look at those various queues, their order of execution and their significance in more detail.

 

How does the Event-loop work ?  

As shown in the diagram below, the NodeJS event loop has different phases, and each phase has a FIFO queue. The main thread keeps processing the events registered in these queues in a loop.

Processing of each event is called a Tick. After processing an event, the event execution loop picks the next event as shown.

Event Loop : How it works along with the worker pool & the Job Queue

Besides the event loop, there is another high-priority queue, the JobQueue. It sits right in front of the event loop in the event processing mechanism: only when this queue is empty does the process continue inside the event loop from where it left off.

We register events in this fast-track queue using process.nextTick() and promises. Within the JobQueue, callbacks registered with process.nextTick() have a higher priority than those registered through promises.

Phases in Event-loop  

  1. Timers : Functions registered using setTimeout or setInterval fall into this phase.
    • When starting from the main module, NodeJS initializes the loop and enters it here.
    • Importantly, such events come into the queue only after the specified time interval is over, not when we set them.
  2. Pending IO Callback : NodeJS uses this internally to execute callbacks for some system operations, such as certain types of TCP errors.
  3. Idle and Prepare : NodeJS uses this phase only for internal purposes.
  4. Poll : This is one of the most important phases to understand, as it executes most of our request-specific events.
    • The phase waits for and executes the asynchronous IO related callbacks (e.g. callbacks from fs.read() or from network sockets).
    • It also receives the incoming connections or requests into its queue.
    • If the queue is not empty:
      • It keeps processing its events until the queue is empty or the system-dependent hard limit is reached.
    • If the queue is empty (i.e. the phase is idle):
      • a. If any other phase has an event ready, the loop moves on, starting with the Check (setImmediate) phase, and goes once around all the phases to execute their pending events.
      • b. If no events are ready, it keeps waiting for the callbacks from the ongoing asynchronous IO calls.
      • c. If no events are ready and we are not waiting on any IO call, the process proceeds to shut down.
  5. Check: This phase executes the callbacks registered using setImmediate.
  6. Close Callbacks: This phase executes the callbacks associated with the closing events like socket.on('close', fn)
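
The note under the Timers phase, that a timer reaches the queue only after its interval is over, also means the delay is a minimum rather than an exact schedule. The sketch below illustrates this under one assumption: we deliberately block the main thread with a busy loop (something to avoid in real code). The names `start` and `firedAfterMs` are illustrative only.

```javascript
const start = Date.now();
let firedAfterMs = null;

// Ask for a callback after 10 ms.
setTimeout(() => {
  firedAfterMs = Date.now() - start;
  console.log(`timer fired after about ${firedAfterMs} ms`);
}, 10);

// Keep the main thread busy for about 50 ms.
// The loop cannot enter the timers phase until this synchronous code ends.
while (Date.now() - start < 50) {
  // busy wait, for demonstration only
}
```

The callback reports roughly 50 ms or more instead of 10 ms, because the expired timer has to wait until the main thread is free to visit the timers phase.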

A Layman’s View :

Having seen the various queues and the execution rules, what does it all point to? Why does the loop prefer waiting in the poll phase, going for a round trip only when needed?

Well, perhaps the “customer is king” rule applies here too 🙂

The main thread gives the highest priority to user requests and to the callbacks from their associated IO calls. In fact, most real-world requests consist of a chain of synchronous code and asynchronous IO calls, with more synchronous code in the callbacks. Since both run in the poll phase, this phase executes most of our code.

There are rarer cases where we want to do something intermittently with setTimeout or setImmediate. The loop addresses them when the poll phase is idle. We also have the option of using a Promise or process.nextTick() to handle such asynchronous pieces on a priority basis, without leaving the poll phase.

 

Sample Programs to Understand the Event Loop

 

Case -1 : Starting from a Main Module

Let’s register a printMessage function into various phases of the event loop as shown to test their order of execution.

function printMessage(msg){
  console.log(msg);
}

function testEventLoopSequence(){
  //Functions registered in different phases
  setTimeout(()=>{printMessage("TimerPhase : setTimeout - 1")},0);
  setImmediate(()=>printMessage("Check Phase : setImmediate - 2"));
  Promise.resolve('JobQueue : promise - 3').then((msg)=>printMessage(msg));
  process.nextTick(()=>printMessage('JobQueue : nextStep 4'));


  printMessage('Main Module - This completes before any of the events.\n')
}
testEventLoopSequence();

F:\nodejs\examples\even-loop>node test-execution-sequence-01.js
Main Module - This completes before any of the events.

JobQueue : nextStep 4
JobQueue : promise - 3
TimerPhase : setTimeout - 1
Check Phase : setImmediate - 2

As we can see in the output, the main module completes first, before NodeJS picks up any event from the event loop.

Secondly, when we start a program, NodeJS starts the process, initializes the event loop, and begins picking events from the timers phase. Hence, the message from the timer phase is printed before the one from the check phase. Note that this ordering is typical rather than guaranteed: when both are called from the main module, a setTimeout with a 0 ms delay may not have expired yet on the first pass, in which case the setImmediate callback runs first.

Third, even though they were registered last, the messages from the JobQueue appear first, because the process always ensures this queue is empty before it looks for any events in the event loop.
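
The same rule applies between events, not only at start-up. The following sketch assumes Node.js 11 or later (older versions drained the JobQueue only after a whole phase had finished); the `order` array is an illustrative helper.

```javascript
const order = [];

setTimeout(() => {
  order.push('timeout 1');
  // Registered while the timers phase is already running.
  process.nextTick(() => order.push('nextTick from timeout 1'));
}, 0);

setTimeout(() => {
  order.push('timeout 2');
  console.log(order.join(' -> '));
}, 0);
// On Node.js 11+: timeout 1 -> nextTick from timeout 1 -> timeout 2
```

The nextTick callback slips in between the two timer callbacks, since the JobQueue is emptied before the loop picks up its next event.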

Now let us register the same set of things as an IO callback.

 

Case -2 : Sequence also Depends on Your Starting Point.

Now we repeat our test from case-1 with one change: instead of calling testEventLoopSequence directly, we pass it as a callback to fs.readFile. The registration order stays the same, with setTimeout registered before setImmediate.

const fs = require('fs');

function printMessage(msg){
  console.log(msg);
}

function testEventLoopSequence(){
  //Functions registered in different phases
  setTimeout(()=>{printMessage("TimerPhase : setTimeout - 1")},0);
  setImmediate(()=>printMessage("Check Phase : setImmediate - 2"));
  Promise.resolve('JobQueue : promise - 3').then((msg)=>printMessage(msg));
  process.nextTick(()=>printMessage('JobQueue : nextStep 4'));

  printMessage('Main Module - This completes before any of the events.\n')
}

//Call testEventLoopSequence as an IO callback
fs.readFile('test-execution-sequence-02.js', testEventLoopSequence);

F:\nodejs\examples\even-loop>node test-execution-sequence-02.js
Main Module - This completes before any of the events.

JobQueue : nextStep 4
JobQueue : promise - 3
Check Phase : setImmediate - 2
TimerPhase : setTimeout - 1

This is a case where the asynchronous IO call registers its callback in the poll phase.

When we look at the output for the sequence of execution, everything remains the same as in case-1 except the last two lines.

Contrary to case-1, here the check phase comes before the timer phase. It’s because we registered these events while executing a callback in the poll phase. Hence, when the loop moves on from the poll phase, the check phase comes before the timer phase.

In short, setImmediate callbacks run before setTimeout callbacks whenever both are scheduled from inside an IO callback, which is where we usually set them.

 

Case -3 : Using process.nextTick for key follow up events.

Because the process.nextTick queue has high priority, it is quite useful for giving the caller a chance to add event handlers after an object is created but before any I/O has occurred.

For example, let’s say we are creating a server application and we have a listener on its 'running' event. The inline comments below mark three options for emitting that event from the constructor:

const EventEmitter = require('events');
const util = require('util');

function MySpecialServer() {
  EventEmitter.call(this);

   
  //this.emit('running');                      //Case -1
   
  //setImmediate(()=>this.emit('running'));    //Case-2 

  process.nextTick(()=>this.emit('running'));  //Case -3 
}
util.inherits(MySpecialServer, EventEmitter);

const mySpecialServer = new MySpecialServer();
mySpecialServer.on('running', () => {
  console.log('My Special Server has started!');
});

Case-1 : Emitting the event synchronously inside the constructor won’t work, as our handler is not registered until after the constructor returns.

Case-2 : Using setImmediate will work, but the event is emitted only after any IO callbacks that are already waiting in the poll queue right after the server starts.

It’s because the event loop moves to the check phase only when the poll queue is idle.

Case-3 : Using process.nextTick is the best option, as it emits the event right after the object is created and before we handle any event from the event loop.
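
To compare the three options side by side, here is a minimal sketch. `DemoServer`, `strategies` and `received` are hypothetical names introduced only for this illustration, and the class syntax is used instead of util.inherits purely for brevity.

```javascript
const EventEmitter = require('events');

// Hypothetical server whose constructor emits 'running' through the given scheduler.
class DemoServer extends EventEmitter {
  constructor(schedule) {
    super();
    schedule(() => this.emit('running'));
  }
}

const received = [];
const strategies = {
  sync: (fn) => fn(),                      // Case-1: emit immediately
  setImmediate: (fn) => setImmediate(fn),  // Case-2: emit in the check phase
  nextTick: (fn) => process.nextTick(fn),  // Case-3: emit from the JobQueue
};

for (const [name, schedule] of Object.entries(strategies)) {
  // The handler is attached only after the constructor has returned.
  new DemoServer(schedule).on('running', () => received.push(name));
}

// Runs after the emitting setImmediate above, since immediates run in order.
setImmediate(() => console.log(received));
// prints [ 'nextTick', 'setImmediate' ]
```

The synchronous emit is lost because no handler exists yet, the nextTick emit arrives first, and the setImmediate emit arrives only once the loop reaches the check phase.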

 

Case -4 : Using process.nextTick for fixing semi-asynchronous functions

A function should be either always synchronous or always asynchronous; anything in between can produce unexpected results.

Take the following function, for example, which calls its callback synchronously or asynchronously depending on a condition. Since it has an asynchronous signature, the client sets the contact in the calling program only after making the call, which is unfortunately too late when the callback runs synchronously.

const fs = require('fs');
const isAsync = Math.random() > 0.5;
let contact;


function myAsyncFunction(isAsync, callback) {
  console.log("isAsync value :"+isAsync);
  if(isAsync){
    fs.writeFile('test.txt',"Something",callback);
  }else{
    callback(); //Case-1 :Sometimes callback runs synchronously
    //process.nextTick(callback); //Case-2: The right approach
  }
}

myAsyncFunction(isAsync, ()=>{
  console.log("Sending status to: "+ contact);
});

//Expecting contact to be available when myAsyncFunction completes!
contact="Ron@xyz.com";

In such a case the output may vary as shown below, depending on which path invokes the callback. Hence, a better approach is to use process.nextTick, shown as Case-2 in the code, to ensure the callback always runs asynchronously.

F:\nodejs\examples\event-loop>node test-semiAsyncFunction.js
isAsync value :true
Sending status to: Ron@xyz.com

F:\nodejs\examples\event-loop>node test-semiAsyncFunction.js
isAsync value :false
Sending status to: undefined
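
With the process.nextTick fix in place, both paths see the contact. The sketch below exercises both branches in one run. It is only an approximation of the program above: `alwaysAsync` and `seen` are hypothetical names, and setImmediate stands in for the real fs.writeFile call so that the example touches no files.

```javascript
let contact;
const seen = [];

// Hypothetical variant of myAsyncFunction: the callback is deferred on both paths.
function alwaysAsync(useIo, callback) {
  if (useIo) {
    setImmediate(callback);      // stand-in for a real IO call such as fs.writeFile
  } else {
    process.nextTick(callback);  // never call back synchronously
  }
}

// Exercise both branches instead of relying on Math.random().
for (const useIo of [true, false]) {
  alwaysAsync(useIo, () => seen.push(contact));
}

// Assigned after the calls, as in the original program.
contact = 'Ron@xyz.com';

setImmediate(() => console.log(seen));
// prints [ 'Ron@xyz.com', 'Ron@xyz.com' ]
```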