
Node in a Nutshell

Posted by manning_pubs on February 18, 2013 at 8:15 AM PST




by Alex Young and Marc Harter, authors of Node.js in Practice

We live in a world of highly connected multicore servers, where web applications are expected to scale from dozens of users to millions. New demands are being placed on developers by the real-time nature of the modern web. Developers are looking for fresh solutions to solve scalability issues—whether it’s to take advantage of multiple CPU cores and high I/O demands or to adapt programs to run on clusters of servers. This article, based on chapter 1 of Node.js in Practice, shows how Node fills a gap in the market by attacking the scalability problem head on.

By using an event-based model with a non-blocking core, Node is perfectly suited to the unpredictable nature of scaling I/O-bound applications.

Node has rapidly become a major platform for developing web applications and even UNIX and Windows programs. With support from cloud hosting providers and giants like Microsoft, Node’s future looks bright. The recent release of Node 0.8 has cemented several core features and improved support for Windows.

In this article, you’ll learn what Node is and some of its history. You’ll also learn about Node’s release cycle, what sets Node apart, and who’s using it. This should give you enough knowledge to decide whether Node is right for your projects.

Next we’ll take a brief look at Node, its runtime engine, and its main features.

What is Node?

Node is a platform for developing network applications. It’s built on V8, Google’s JavaScript runtime engine. V8 is unlike traditional interpreters and virtual machines because it eschews bytecode in favor of direct machine code generation. Virtual machines like the Java virtual machine (JVM) use an intermediate language called bytecode; in this sense, a JVM is an environment in which bytecode can be executed. Technically, other languages can compile to Java bytecode, which is how the Rhino project executes JavaScript.

The key innovation V8 introduced was to generate machine code directly from JavaScript. V8’s author, Lars Bak, was a researcher involved with high-performance virtual machines. He joined Google in 2004 to work on the Chrome browser, and this resulted in the development of V8.

Node isn’t just V8, however. An important part of the Node platform is its core library. This encompasses everything from TCP servers to asynchronous file management. Like most JavaScript environments, Node has several global objects that are always available. Some of them mirror what web browsers provide: console and the timer methods, such as setTimeout and setInterval, are both present.

Node’s module system is based on CommonJS Modules. Files can be loaded with require, and specific methods or objects can be exported using the exports object. Modules can be managed and shared by using npm, which is distributed alongside Node. A command-line tool is included which can be used to install, remove, upgrade, and search for modules. There’s also a website for npm that allows modules to be searched.
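As a rough illustration of require and exports, here is a minimal sketch. So that it runs as a single file, it first writes a tiny module to a temporary directory; the module name greeter.js and its greet function are made up for this example and aren’t part of Node.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a tiny module to a temporary directory so the sketch is self-contained.
// "greeter.js" is a hypothetical module name used only for illustration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'node-nutshell-'));
const modulePath = path.join(dir, 'greeter.js');
fs.writeFileSync(modulePath, [
  '// Anything attached to exports becomes part of the public API.',
  'exports.greet = function (name) {',
  "  return 'Hello, ' + name + '!';",
  '};',
  ''
].join('\n'));

// require loads the file and returns its exports object.
const greeter = require(modulePath);
console.log(greeter.greet('Node')); // prints "Hello, Node!"
```

In a real project the module would simply live next to the file that requires it and be loaded with a relative path such as require('./greeter').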

Node is released in a stable/unstable cycle: even-numbered releases are stable and odd-numbered releases are unstable. The latest stable version of Node is the 0.8 series, and the latest unstable release is 0.9. API changes between major versions of Node are relatively minimal.

Documentation on API changes can be found in the Node changelog, and on the project’s wiki.

Why Node?

When Node was initially released, it appealed to two types of developers: JavaScript enthusiasts looking for a server-side platform and other developers who wanted an alternative way to write scalable projects without using traditional parallelization techniques like threads.

Node’s core developers embraced asynchronous I/O as a way to improve performance in certain types of applications. JavaScript’s traditional event-based implementation means it has a relatively convenient and well-understood syntax that suits asynchronous programming.

In a typical programming language, an I/O operation blocks execution until it completes. Node’s asynchronous file and network APIs mean processing can still occur while these relatively slow I/O operations finish. For example, a network game server that broadcasts the game’s state to various players over TCP/IP sockets can perform background tasks, perhaps maintaining the game world, while it sends data to the players.
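To make the difference concrete, here is a small sketch of a non-blocking file read. The file name world-state.json and its contents are invented for the example; the point is only that the line after fs.readFile runs before the read’s callback does.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small file to read. The name and contents are hypothetical.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'node-nutshell-'));
const file = path.join(dir, 'world-state.json');
fs.writeFileSync(file, JSON.stringify({ players: 2 }));

const order = [];

// Start the read. The callback runs later, once the data is available.
fs.readFile(file, 'utf8', function (err, contents) {
  if (err) throw err;
  order.push('read finished');
  // prints: other work done -> read finished | {"players":2}
  console.log(order.join(' -> '), '|', contents);
});

// This runs immediately, while the read is still in progress.
order.push('other work done');
```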

Because JavaScript is single-threaded, there are cases where this doesn’t perform as well as a traditional threaded program. For example, intense calculations in a for-loop would cause a JavaScript interpreter to block other operations from occurring until the loop is finished. Figure 1 illustrates how multiple callbacks can be blocked by activity in a loop. The rectangles represent the callbacks that should be able to run asynchronously, and the large arrow shows the main path of execution.

Figure 1 A complex loop locks the interpreter. Callbacks that would otherwise run asynchronously are completely blocked by the for-loop, even though they’re ready to run.

In figure 1, several callbacks are waiting to run due to events that have been triggered, but they can’t because a for-loop is madly iterating away. To get around this, programs must be designed to break up computationally intensive operations into smaller units of work that can be scheduled alongside other callbacks. Compare this to figure 2, which represents a refactored, event-based program.

Figure 2 A refactored, event-based program that has no for-loop blocking execution. Here callbacks are free to run when the resources they’re waiting for are ready.
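One way to picture the refactoring in figure 2 is the sketch below, which splits a long summation into slices and hands control back to the event loop between them with setImmediate. The function name sumInChunks and the chunk size are assumptions made for illustration, not something taken from the book.

```javascript
// Sum the integers in [0, limit) in slices of chunkSize (which must be > 0),
// yielding to the event loop between slices so other callbacks can run.
function sumInChunks(limit, chunkSize, done) {
  let total = 0;
  let next = 0;

  function runChunk() {
    const end = Math.min(next + chunkSize, limit);
    while (next < end) {
      total += next;
      next += 1;
    }
    if (next < limit) {
      setImmediate(runChunk); // schedule the next slice, then return
    } else {
      done(total);
    }
  }

  setImmediate(runChunk);
}

// Another callback that is free to run between slices.
setTimeout(function () {
  console.log('timer callback ran');
}, 0);

sumInChunks(1e6, 1e4, function (total) {
  console.log('total:', total); // total: 499999500000
});
```

The total is the same as a plain for-loop would produce; what changes is that timers and I/O callbacks get a chance to run before the whole calculation has finished.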

Node and events are almost synonymous, and there’s a reason for that: events are integral to a well-designed Node program. In figure 2, the for-loop has been replaced with smaller chunks of work that can be scheduled alongside other event-based code. Since I/O should be non-blocking, event handlers can run while other code is waiting for I/O results. In general this means designing classes around small servers and EventEmitter, but the Node community has also been quick to adopt other approaches like the publish-subscribe pattern, which are supported by powerful backends of their own.
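As a sketch of what designing a class around EventEmitter can look like, consider the following. The Scoreboard class and its score event are hypothetical names chosen for this example.

```javascript
const EventEmitter = require('events').EventEmitter;

// A hypothetical class that announces state changes as events
// instead of calling its consumers directly.
class Scoreboard extends EventEmitter {
  constructor() {
    super();
    this.scores = new Map();
  }

  add(player, points) {
    const total = (this.scores.get(player) || 0) + points;
    this.scores.set(player, total);
    this.emit('score', player, total);
  }
}

const board = new Scoreboard();
board.on('score', function (player, total) {
  console.log(player + ' now has ' + total + ' points');
});

board.add('alice', 10); // prints "alice now has 10 points"
board.add('alice', 5);  // prints "alice now has 15 points"
```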

Another way Node compensates for this disadvantage is by providing a clustering module that allows separate Node processes to work together. A program designed this way is well positioned to take advantage of multicore processors.
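A minimal sketch of the clustering idea follows, assuming one worker per CPU core and an HTTP server on port 8000. The port, the serve argument, and the helper names are assumptions for this example. Newer Node releases expose cluster.isPrimary where older ones use cluster.isMaster, so the sketch checks for both.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Newer Node releases call the coordinating process "primary";
// older releases call it "master".
function isCoordinator() {
  return cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
}

function startCluster(port, workerCount) {
  if (isCoordinator()) {
    // The coordinating process only forks and supervises workers.
    for (let i = 0; i < workerCount; i++) {
      cluster.fork();
    }
    cluster.on('exit', function (worker) {
      console.log('worker ' + worker.process.pid + ' exited; starting a replacement');
      cluster.fork();
    });
  } else {
    // Each worker runs its own server; the workers share the listening port.
    http.createServer(function (req, res) {
      res.end('handled by process ' + process.pid + '\n');
    }).listen(port);
  }
}

// Hypothetical convention: only start serving when run as `node server.js serve`.
if (process.argv[2] === 'serve') {
  startCluster(8000, os.cpus().length);
}
```

Forked workers receive the same command-line arguments as the coordinating process, so they reach the same startCluster call and take the worker branch.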

What sets Node apart?

Node uses JavaScript, which is widely used and relatively easy to learn due to its C-like syntax. This means it has attracted, and continues to attract, a wide range of developers from various backgrounds. The major difference between Node and other scripting languages is that its core is based on asynchronous I/O: rather than needing a separate library to optionally write asynchronous code, the runtime system itself is asynchronous.

This core is libuv, which was created specifically for Node to better support IOCP on Windows and libev on UNIX. IOCP and libev provide asynchronous I/O that operates at a low level, and they are used as the basis for Node’s high-performance event loop. The libuv library has the following main features:

  • Non-blocking TCP sockets and named pipes.
  • Timers.
  • Child process management.
  • Asynchronous DNS and file system APIs.
  • High-resolution time.
  • Thread pool scheduling.
  • File system events.
  • IPC and socket sharing.

Node’s core library provides the high-level JavaScript APIs to this low-level functionality. This is one of the reasons why Node is important: we get low-level asynchronous performance from a high-level API in a widely known language.
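To give a feel for how a few of these low-level features surface in JavaScript, here is a short sketch that touches timers, high-resolution time, asynchronous DNS, and child process management. The 50 ms delay and the choice of looking up localhost are arbitrary.

```javascript
const childProcess = require('child_process');
const dns = require('dns');

// High-resolution time: process.hrtime returns [seconds, nanoseconds].
const started = process.hrtime();

// Timers
setTimeout(function () {
  const elapsed = process.hrtime(started);
  const ms = elapsed[0] * 1e3 + elapsed[1] / 1e6;
  console.log('timer fired after about ' + ms.toFixed(1) + ' ms');
}, 50);

// Asynchronous DNS
dns.lookup('localhost', function (err, address) {
  if (err) {
    console.error('lookup failed:', err.code);
    return;
  }
  console.log('localhost resolves to', address);
});

// Child process management: run a second Node process and collect its output.
childProcess.execFile(process.execPath, ['--version'], function (err, stdout) {
  if (err) throw err;
  console.log('child process reported', stdout.trim());
});
```

None of these calls blocks: all three operations are started one after another, and their callbacks run as each result becomes available.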

This places Node in an interesting position in the technological landscape: it can help write scalable, network-oriented programs that make the best use of available I/O throughput, while appealing to a wide range of developers due to JavaScript’s inherent popularity.

Who’s using it?

Node is used by large and small businesses alike, but the early adopters were initially enthusiastic open source developers.

LinkedIn Mobile has been using Node to power key parts of its mobile stack. Community favorite GitHub uses it to efficiently manage download requests, alongside its existing Python and Ruby-based architecture. Microsoft has embraced Node and supports it as part of the Azure platform. There are also companies that have built their success on Node and heavily contributed back to the community. One such example is LearnBoost.

By looking at these users, it’s clear Node solves certain classes of problem very well. GitHub uses Node to compress files on demand—the reason for this is clear once Node’s stream API is understood. Node can read files asynchronously while outputting a compressed file at the same time. Not only is this efficient in terms of I/O, but also simple to write and understand thanks to a combination of JavaScript’s syntax and Node’s underlying technological cocktail.
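The following sketch shows the general shape of stream-based compression. It is not GitHub’s code. It assumes a Node version that provides stream.pipeline, and the function name gzipFile and the file notes.txt are made up for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const { pipeline } = require('stream');

// Compress source into destination chunk by chunk, so the whole file
// never has to be held in memory at once.
function gzipFile(source, destination, done) {
  pipeline(
    fs.createReadStream(source),
    zlib.createGzip(),
    fs.createWriteStream(destination),
    done // called with an error if any stage fails
  );
}

// Demo input: a hypothetical text file in a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'node-nutshell-'));
const source = path.join(dir, 'notes.txt');
fs.writeFileSync(source, 'Node streams compress data chunk by chunk.\n'.repeat(1000));

gzipFile(source, source + '.gz', function (err) {
  if (err) throw err;
  console.log(
    'compressed', fs.statSync(source).size,
    'bytes to', fs.statSync(source + '.gz').size, 'bytes'
  );
});
```

In a web server, the write stream could be replaced by the HTTP response, which is also a writable stream, so that compressed data is sent to the client as it is produced.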

As another example, consider a web application that sits in front of another API server. A traditional web application would implement this by receiving input from users, then making a synchronous request to the API server, then responding to the user by rendering a web page. Table 1 shows the states that such a web application will go through as a request is made.

Table 1 The typical states that exist in a blocking web application

Step | Component | Web app state
Make request | User’s browser | Waiting
Receive request | Web app | Blocking
Make internal request | Web app | Blocking
Receive internal request | Internal API server | Blocking
Respond to internal request | Internal API server | Blocking
Receive internal response | Web app | Blocking
Respond to user | Web app | Blocking
Receive response | User’s browser | Waiting

If the internal API uses HTTP, then this could be extremely slow. A well-written Node replacement would not block at any step of this process. That means multiple requests can be handled by the same process. Node also comes with tools for running several processes, which means the Node solution could easily scale to take advantage of multiple CPU cores.
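Here is a sketch of how such a non-blocking front end might look. Both servers, the JSON payload, and the helper names are invented for illustration. The demo starts them on ephemeral local ports, makes a single request, and shuts everything down again.

```javascript
const http = require('http');

// Hypothetical internal API: answers every request with a small JSON document.
function createApiServer() {
  return http.createServer(function (req, res) {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ user: 'alex', unread: 3 }));
  });
}

// Front-end web app: asks the internal API for data, then renders a page.
// While it waits for the API, the process is free to accept other requests.
function createWebApp(apiPort) {
  return http.createServer(function (req, res) {
    const options = { host: '127.0.0.1', port: apiPort, path: '/', agent: false };
    const upstream = http.get(options, function (apiRes) {
      let body = '';
      apiRes.setEncoding('utf8');
      apiRes.on('data', function (chunk) { body += chunk; });
      apiRes.on('end', function () {
        const data = JSON.parse(body);
        res.writeHead(200, { 'Content-Type': 'text/html' });
        res.end('<p>' + data.user + ' has ' + data.unread + ' unread messages</p>');
      });
    });
    upstream.on('error', function () {
      res.writeHead(502);
      res.end('Upstream API unavailable');
    });
  });
}

// Demo: start both servers, fetch one page, then close the servers.
function fetchPage(done) {
  const api = createApiServer();
  api.listen(0, '127.0.0.1', function () {
    const app = createWebApp(api.address().port);
    app.listen(0, '127.0.0.1', function () {
      const options = { host: '127.0.0.1', port: app.address().port, path: '/', agent: false };
      http.get(options, function (res) {
        let page = '';
        res.setEncoding('utf8');
        res.on('data', function (chunk) { page += chunk; });
        res.on('end', function () {
          app.close();
          api.close();
          done(null, page);
        });
      }).on('error', done);
    });
  });
}

fetchPage(function (err, page) {
  if (err) throw err;
  console.log(page); // prints "<p>alex has 3 unread messages</p>"
});
```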

Node is an excellent choice for developing real-time proxy applications like this. That leads to other classes of applications that fit similar patterns where strong performance is desirable—statistics servers, game backends, on-the-fly data format conversion.

Comparing Node to related technologies

We’ve looked at some of the strengths and weaknesses of Node, and some examples of who’s using it and what it’s good at. You might be wondering, “What can Node do for me? Why should I learn it?” If you’re a JavaScript developer looking to expand into server-side development, then Node provides a unique opportunity to repurpose your existing skills. There’s no denying that this is attractive, but even if you’re an experienced server-side developer, Node has plenty to offer. Table 2 shows some of Node’s core modules alongside developer roles.

Table 2 How can I use Node in my job?

Role | Related modules | Description
Web developer | HTTP, Zlib | Node has built-in HTTP client and server modules that can be used to create simple web applications out of the box.
Sysadmin, DevOps | File System, net, DNS, process, os | Node is ideal for creating programs that integrate with Unix systems, and for writing background daemons and servers for integration projects, network proxies, and so on.
Game developer | net, UDP | As a high-level language, JavaScript helps rapidly develop technologies like game servers that require both speed and flexibility.

The combination of JavaScript’s callbacks and simple object model makes it a natural language for working with evented, non-blocking APIs. It’s good at effectively multiplexing several servers in the same process—when writing unit tests for a server, it’s easy to simply combine the server and client code in the same file and process. This makes what could otherwise be complex code easy to follow.
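As a sketch of that testing style, the example below puts a TCP echo server and the client that exercises it in the same file and process. The function name runEchoTest is an assumption, and the check uses Node’s built-in assert module rather than any particular test framework.

```javascript
const { strictEqual } = require('assert');
const net = require('net');

// Start an echo server, connect a client to it, and check the reply,
// all within one process.
function runEchoTest(done) {
  const server = net.createServer(function (socket) {
    socket.pipe(socket); // echo everything back to the sender
  });

  server.listen(0, '127.0.0.1', function () {
    const client = net.connect(server.address().port, '127.0.0.1');
    let received = '';

    client.setEncoding('utf8');
    client.on('data', function (chunk) { received += chunk; });
    client.on('end', function () {
      server.close();
      strictEqual(received, 'ping');
      done();
    });

    client.end('ping'); // send the message, then signal we are finished writing
  });
}

runEchoTest(function () {
  console.log('echo test passed');
});
```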

Node is particularly beneficial to developers who work with network-oriented software. HTTP-based services are only one aspect of Node development. If you work with custom networking protocols, perhaps related to secure messaging, VoIP, or game servers, then Node provides an attractive alternative to low-level languages like C and C++. Table 3 shows some of these benefits alongside the related Node modules.

Table 3 How can Node benefit me?

Feature | Related modules | Description
Evented I/O | EventEmitter, Buffer, Stream, File System, net, DNS, HTTP | Most of Node’s modules are built around asynchronous, non-blocking I/O.
Networking libraries | net, DNS, HTTP | Node is a convenient platform for writing network-oriented software.
File system | File System, Zlib | Both synchronous and asynchronous file system APIs make reading and writing files simple and fast.
OS integration and scripting | os, Child Process, TTY, process | Creating child processes, managing results, and calling OS-specific features are all catered for.
Scaling | Cluster, Domain | Scaling out to multiple processes and managing errors is supported by Node’s core modules.

Summary

We’ve seen how JavaScript and Node have evolved to tackle problems faced by many developers in this I/O-bound world. We’ve explored the landscape of Node, and the fundamental techniques employed by Node developers.


Here are some other Manning titles you might be interested in:

Node.js in Action

Mike Cantelon, TJ Holowaychuk and Nathan Rajlich

Sass and Compass in Action

Wynn Netherland, Nathan Weizenbaum, Chris Eppstein, and Brandon Mathis

Secrets of the JavaScript Ninja

John Resig and Bear Bibeault

