<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Pavel Romanov]]></title><description><![CDATA[Software Engineer.

Focused on Node.js and JavaScript.

Here to share my learnings and learn something new.]]></description><link>https://pavel-romanov.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 21 Apr 2026 13:47:14 GMT</lastBuildDate><atom:link href="https://pavel-romanov.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Writable Streams in Node.js: A Practical Guide]]></title><description><![CDATA[In Node.js, there are 4 primary types of streams: readable, writable, transform, and duplex. In the previous article, we looked at the readable streams in detail.
Perhaps you've heard something about writable streams or even used them. But there is a...]]></description><link>https://pavel-romanov.com/writable-streams-in-nodejs-a-practical-guide</link><guid isPermaLink="true">https://pavel-romanov.com/writable-streams-in-nodejs-a-practical-guide</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[Reactive Programming]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 16 Feb 2025 15:49:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739720923881/bf0857f5-e84b-46a3-9a53-5556c6e0e254.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Node.js, there are 4 primary types of streams: readable, writable, transform, and duplex. In the previous article, we looked at the <a target="_blank" href="https://pavel-romanov.com/exploring-the-core-concepts-of-nodejs-readable-streams">readable streams in detail</a>.</p>
<p>Perhaps you've heard something about writable streams or even used them. But there is always a fine line between "I kind of understand how they work" and "I understand how they work".</p>
<p>After reading this article, you should be able to use writable streams with confidence.</p>
<h2 id="heading-what-are-writable-streams">What are Writable Streams?</h2>
<p>As the name suggests, writable streams are meant to write some data to the destination. There is a lot to unpack in this statement.</p>
<p>First, what data do we pass to the writable stream, and where does it come from? It can be anything that JavaScript has access to. For example, if you create a string in a JavaScript application and save it as a variable, you can access it directly.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> name = <span class="hljs-string">'John'</span>;

<span class="hljs-comment">// We can get the name value right away</span>
<span class="hljs-built_in">console</span>.log(name);
</code></pre>
<p>However, the data that you want to work with is not always accessible right away. You must make the data accessible for JavaScript first, and only then can you use it.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(<span class="hljs-string">'file-name.txt'</span>);

stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {

    <span class="hljs-comment">// Now we have access to the data from the file</span>
});
</code></pre>
<p>In this example, the data we want to work with is inside the <code>file-name.txt</code> file. This file resides on the file system, and JavaScript doesn't have direct access to it. That's why we first need to make the file data available in JavaScript using the <code>createReadStream</code> function.</p>
<p>If you're not comfortable with the readable streams yet, check out the previous article to <a target="_blank" href="https://pavel-romanov.com/exploring-the-core-concepts-of-nodejs-readable-streams">better understand the readable streams</a>.</p>
<p>Okay, the part about the data we write into the stream should be clear now. The next question is: what is the destination, and what does it mean that the stream writes data into it?</p>
<p>The destination can be anything:</p>
<ul>
<li><p>Database</p>
</li>
<li><p>Network socket</p>
</li>
<li><p>File</p>
</li>
<li><p>S3 cloud storage</p>
</li>
<li><p>Standard output</p>
</li>
<li><p>Other streams</p>
</li>
<li><p>JavaScript structures like arrays and objects</p>
</li>
</ul>
<p>The last point may sound odd, but you get the idea. The destination can be virtually anything.</p>
<p>This model looks similar to readable streams in the sense that we have 3 main parts involved:</p>
<ul>
<li><p>The data</p>
</li>
<li><p>The stream</p>
</li>
<li><p>The destination</p>
</li>
</ul>
<p>The difference is that readable streams bring data from anywhere into your JavaScript application, while writable streams send data that is available in your application to virtually anywhere.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1739719736670/36e2ba25-f92d-4ca1-aa48-45862fc713e4.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-writing-data-to-the-destination-of-writable-stream">Writing data to the destination of Writable Stream</h2>
<p>We use the <code>write</code> method to write data to the destination.</p>
<p>Here is how it works:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> writableStream = createWriteStream(<span class="hljs-string">'input.txt'</span>);

writableStream.write(<span class="hljs-string">'Hello World!'</span>);
</code></pre>
<p>In this example, we create a writable stream with the destination <code>input.txt</code> file. Whenever we try to write something inside that stream, it gets transferred into the <code>input.txt</code> file.</p>
<p>After calling the <code>write</code> method, we can go to the <code>input.txt</code> file and see that it now contains <code>Hello World!</code> text.</p>
<blockquote>
<p><strong>Note</strong>: When writing data to a stream, you have to be aware of the backpressure mechanism. It prevents memory overflow by controlling the rate at which data is written into the stream. You'll learn more about it in the upcoming article where we dive deep into how backpressure works in writable streams.</p>
</blockquote>
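<p>As a runnable sketch of the same idea, here is a writable stream with an in-memory destination instead of a file (the destination array is illustrative). Note that <code>write</code> also returns a boolean: <code>false</code> means the internal buffer is full, which is the backpressure signal mentioned in the note above.</p>

```javascript
import { Writable } from 'node:stream';
import { once } from 'node:events';

// An in-memory destination, so the example runs without touching the file system.
const received = [];

const memoryStream = new Writable({
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    callback(); // Signal that this chunk has been handled
  },
});

// write() returns false when the internal buffer is full:
// that is the backpressure signal.
const hasRoom = memoryStream.write('Hello World!');
memoryStream.end();
await once(memoryStream, 'finish');

console.log(hasRoom);  // true
console.log(received); // [ 'Hello World!' ]
```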
<p>While it might look OK to work with the <code>write</code> method manually, imagine a slightly more complicated setup.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { S3WritableStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'utils/s3-writable-stream'</span>;

<span class="hljs-keyword">const</span> s3WritableStream = <span class="hljs-keyword">new</span> S3WritableStream(<span class="hljs-string">'s3-destination-url'</span>);
<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);

readableStream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
    s3WritableStream.write(chunk);
});
</code></pre>
<p>Doesn't look too bad so far, right? Now, let's factor in that we have to handle errors and cleanups for each stream properly.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { S3WritableStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'utils/s3-writable-stream'</span>;

<span class="hljs-keyword">const</span> s3WritableStream = <span class="hljs-keyword">new</span> S3WritableStream(<span class="hljs-string">'s3-destination-url'</span>);
<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);

readableStream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
    s3WritableStream.write(chunk);
});

s3WritableStream.on(<span class="hljs-string">'error'</span>, <span class="hljs-function">() =&gt;</span> {
    <span class="hljs-comment">// Clean up the resources related to the stream.</span>
    <span class="hljs-comment">// Perhaps you want to close the other streams at this point.</span>
});

readableStream.on(<span class="hljs-string">'error'</span>, <span class="hljs-function">() =&gt;</span> {
    <span class="hljs-comment">// Clean up the resources related to the stream.</span>
    <span class="hljs-comment">// Perhaps you want to close the other streams at this point.</span>
});
</code></pre>
<p>As you can see, if we start building more-or-less production-grade applications, we have to think about proper error handling and resource cleanups.</p>
<p>When it comes to this, working with multiple streams by calling the <code>write</code> method becomes way too verbose and repetitive. There is a better way to do it, and it is by building data flows using functions like <code>pipeline</code> and <code>pipe</code>.</p>
<p>But before diving into the data flows, I want to address the strange-looking writable stream used in the code examples, named <code>S3WritableStream</code>.</p>
<h2 id="heading-creating-a-custom-writable-stream">Creating a custom writable stream</h2>
<p>This strange-looking stream is a custom writable stream. It is created to handle a specific pattern of using a writable stream.</p>
<p>In our case, the <code>S3WritableStream</code> is designed to handle the write operation into the S3 bucket.</p>
<p>We don't need to write all of the connection and authentication logic. The custom stream handles all of it. The only thing we need to specify is a URL where we want to store the data.</p>
<p>The benefit of such custom streams is the same as creating a function, class, or any other reusable unit - we encapsulate the complex logic of handling S3 workflow inside of the stream and can reuse it across the project later on.</p>
<p>Here is what a pseudo-implementation of <code>S3WritableStream</code> might look like:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Writable } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">S3WritableStream</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Writable</span> </span>{
    #url

    <span class="hljs-keyword">constructor</span>(url) {
        <span class="hljs-built_in">super</span>();

        <span class="hljs-built_in">this</span>.#url = url;
    }

    _write(chunk, encoding, callback) {
        <span class="hljs-comment">// Logic to write data into the S3 bucket</span>
    }

    _final(callback) {
        <span class="hljs-comment">// Logic to finalize the stream workflow</span>
    }

    _destroy(error, callback) {
        <span class="hljs-comment">// Handle the destroy event by cleaning up resources</span>
        <span class="hljs-comment">// and the error.</span>
    }
}
</code></pre>
<p>We can encapsulate a lot of low-level details in a custom stream abstraction.</p>
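<p>Since <code>S3WritableStream</code> above is pseudocode, here is a minimal runnable variant of the same pattern, with an in-memory array standing in for the S3 upload (the class and URL are illustrative):</p>

```javascript
import { Writable } from 'node:stream';
import { once } from 'node:events';

// A stand-in for S3WritableStream: same structure, but it
// "uploads" into an in-memory array instead of a real bucket.
class MemoryUploadStream extends Writable {
  #url;
  chunks = [];

  constructor(url) {
    super();
    this.#url = url;
  }

  _write(chunk, encoding, callback) {
    this.chunks.push(chunk.toString()); // Pretend this is a network write
    callback();                         // Report success for this chunk
  }

  _final(callback) {
    // A real implementation would complete the multipart upload here
    callback();
  }
}

const upload = new MemoryUploadStream('s3://example-bucket/key');
upload.write('part-1');
upload.write('part-2');
upload.end();
await once(upload, 'finish');

console.log(upload.chunks); // [ 'part-1', 'part-2' ]
```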
<p>While custom streams can significantly simplify a particular type of workflow, we still have to think about combining multiple streams properly to build a data flow.</p>
<h2 id="heading-building-data-flows-with-writable-streams">Building data flows with Writable Streams</h2>
<p>Writable streams are powerful, but they get to the next level when we start building data flows (aka pipelines) using them.</p>
<p>For example, we can combine readable and writable streams into a single pipeline. By doing that, data from the readable stream gets automatically transferred into a writable stream.</p>
<p>Here is a simple example of piping two streams.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);
<span class="hljs-keyword">const</span> writableStream = createWriteStream(<span class="hljs-string">'output.txt'</span>);

readableStream.pipe(writableStream);
</code></pre>
<p>Here, we call the <code>pipe</code> method of a readable stream and pass the writable stream as the destination where the data should be forwarded.</p>
<p>No need to listen for <code>data</code> events on the readable stream and call <code>write</code> in a callback.</p>
<p>We can achieve almost the same result using the <code>pipeline</code> function.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { pipeline } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>;


<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);
<span class="hljs-keyword">const</span> writableStream = createWriteStream(<span class="hljs-string">'output.txt'</span>);

pipeline(readableStream, writableStream);
</code></pre>
<p>What is the difference between them? Why do we need multiple functions that do the same thing? While they might look similar, there is a huge difference. Let's get into the details.</p>
<h3 id="heading-pipe-doesnt-have-a-promise-base-api"><code>pipe</code> doesn't have a promise-based API</h3>
<p>At this point, promises are the standard for working with asynchronous operations in Node.js.</p>
<p>It is way more convenient to use <code>async/await</code> syntax whenever possible. Thanks to it, we can read async code as a series of synchronous operations.</p>
<p>Because of that, it is way easier to track the logic flow compared to events.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);
<span class="hljs-keyword">const</span> writableStream = createWriteStream(<span class="hljs-string">'output.txt'</span>);

readableStream.pipe(writableStream);

<span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Operation finished'</span>);
</code></pre>
<p>In the case of using <code>pipe</code>, we'll see the console log output first, and only then will the operation finish.</p>
<p>Compare it to using the <code>pipeline</code> function with promises API.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { pipeline } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream/promises'</span>;

<span class="hljs-keyword">const</span> readableStream = createReadStream(<span class="hljs-string">'input.txt'</span>);
<span class="hljs-keyword">const</span> writableStream = createWriteStream(<span class="hljs-string">'output.txt'</span>);

<span class="hljs-keyword">await</span> pipeline(readableStream, writableStream);

<span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Operation finished'</span>);
</code></pre>
<h3 id="heading-poor-error-handling-by-pipe-method">Poor error handling by <code>pipe</code> method</h3>
<p>When working with streams, you shouldn't forget about proper error handling.</p>
<p>Both <code>pipe</code> and <code>pipeline</code> can handle errors, but the way they do it differs significantly.</p>
<pre><code class="lang-javascript">readableStream
    .pipe(transformStream)
    .pipe(writableStream)
    .on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
        <span class="hljs-comment">// Handle the error</span>
    });
</code></pre>
<p>The biggest catch here is that <code>on('error')</code> only catches errors from the <code>writableStream</code> (the last one in the chain).</p>
<p>If you want to handle errors properly for each of the streams, you have to add error listeners for <strong>each</strong> of the streams involved in the pipeline.</p>
<pre><code class="lang-javascript">readableStream
    .on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
        <span class="hljs-comment">// Handle error</span>
    })
    .pipe(transformStream)
    .on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
        <span class="hljs-comment">// Handle error</span>
    })
    .pipe(writableStream)
    .on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
        <span class="hljs-comment">// Handle the error</span>
    });
</code></pre>
<p>Doesn't look too nice. Now let's compare it with <code>pipeline</code> API.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">await</span> pipeline(
        readableStream,
        transformStream,
        writableStream
    )
} <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-comment">// Handle error</span>
}
</code></pre>
<p>Unlike <code>pipe</code>, the <code>pipeline</code> function handles errors for each of the streams involved in the pipeline. How cool is that?</p>
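<p>Here is a runnable sketch of that behavior, using in-memory streams so it works without files (the error message and chunk values are illustrative). The error is thrown by the transform in the middle of the chain, yet a single <code>try/catch</code> around <code>pipeline</code> catches it:</p>

```javascript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

const readableStream = Readable.from(['a', 'b', 'c']);

// A transform that fails on the second chunk
const transformStream = new Transform({
  transform(chunk, encoding, callback) {
    if (chunk.toString() === 'b') {
      callback(new Error('boom'));
      return;
    }
    callback(null, chunk);
  },
});

// A writable that discards everything it receives
const writableStream = new Writable({
  write(chunk, encoding, callback) {
    callback();
  },
});

let caught = null;
try {
  await pipeline(readableStream, transformStream, writableStream);
} catch (error) {
  caught = error; // Errors from ANY stream in the chain land here
}

console.log(caught.message); // 'boom'
```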
<h3 id="heading-the-pipe-method-doesnt-clean-up-resources-properly">The <code>pipe</code> method doesn't clean up resources properly</h3>
<p>You probably noticed that we're here to roast the <code>pipe</code> method.</p>
<p>The next problem it has is the absence of a proper mechanism to clean up resources.</p>
<p>It means that if one stream in the pipeline errors out, the <code>pipe</code> method doesn't close the other streams automatically, leaving them and their resources hanging in memory.</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Using pipe() - resources aren't properly cleaned up</span>
<span class="hljs-keyword">const</span> transform = <span class="hljs-keyword">new</span> Transform({
  transform(chunk, encoding, callback) {
    <span class="hljs-keyword">if</span> (someCondition) {
      callback(<span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Something went wrong'</span>));
      <span class="hljs-comment">// Other streams in the pipeline stay open!</span>
      <span class="hljs-keyword">return</span>;
    }
    callback(<span class="hljs-literal">null</span>, chunk);
  }
});

transform.on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">error</span>) =&gt;</span> {

    <span class="hljs-comment">// Have to manually clean up streams.</span>
    writableStream.destroy();
    readableStream.destroy();
})

readableStream
  .pipe(transform)
  .pipe(writableStream);
</code></pre>
<p>You need to track such cases when working with <code>pipe</code> and always clean up the resources involved in the pipeline manually for <strong>every</strong> possible error inside the pipeline.</p>
<p>The <code>pipeline</code> function, on the other hand, handles it automatically.</p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Using pipeline() - automatic cleanup of all resources</span>
<span class="hljs-keyword">try</span> {
  <span class="hljs-keyword">await</span> pipeline(
    readableStream,
    transform,
    writableStream
  );
} <span class="hljs-keyword">catch</span> (error) {
  <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error:'</span>, error);
  <span class="hljs-comment">// All streams are automatically destroyed</span>
  <span class="hljs-comment">// No memory leaks or hanging resources</span>
}
</code></pre>
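<p>You can verify the cleanup yourself: after a failed <code>pipeline</code> call, every stream's <code>destroyed</code> flag is set. A small sketch with in-memory streams (the error message is illustrative):</p>

```javascript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

const source = Readable.from(['x', 'y']);

// A transform that fails on the very first chunk
const failing = new Transform({
  transform(chunk, encoding, callback) {
    callback(new Error('mid-pipeline failure'));
  },
});

const sink = new Writable({
  write(chunk, encoding, callback) {
    callback();
  },
});

try {
  await pipeline(source, failing, sink);
} catch {
  // pipeline() has already torn everything down at this point
}

// All three streams were destroyed automatically
console.log(source.destroyed, failing.destroyed, sink.destroyed); // true true true
```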
<h2 id="heading-conclusion">Conclusion</h2>
<p>Writable streams are responsible for getting data from your Node.js application and transferring it to a destination that often lies outside of your application.</p>
<p>We have several ways of writing data into a writable stream. The most fundamental one is the <code>write</code> method that we call on a writable stream, passing the data to be written as an argument.</p>
<p>We can customize how exactly <code>write</code> and other methods of a writable stream work by creating a custom stream that extends the <code>Writable</code> class.</p>
<p>But even a custom writable stream won't solve the problem of building complex data flows where we want to transfer data from one source to another without much hassle. That's where the <code>pipe</code> and <code>pipeline</code> functions come in handy.</p>
<p>While both <code>pipe</code> and <code>pipeline</code> functions have the same end goal to build pipelines of data, the implementation details are quite different when using them.</p>
<p>In general, <code>pipeline</code> has a much simpler API and handles errors and resources cleanup way better compared to <code>pipe</code>. Most of the time, you should stick with the <code>pipeline</code>.</p>
]]></content:encoded></item><item><title><![CDATA[Exploring the Core Concepts of Node.js Readable Streams]]></title><description><![CDATA[In Node.js we have different types of streams, and one of them is the Readable stream. You may have heard of it, or perhaps even used it a few times.
But do you know how to use it effectively? This question of efficiency comes when we're dealing with...]]></description><link>https://pavel-romanov.com/exploring-the-core-concepts-of-nodejs-readable-streams</link><guid isPermaLink="true">https://pavel-romanov.com/exploring-the-core-concepts-of-nodejs-readable-streams</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Thu, 05 Dec 2024 02:30:19 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733365717036/ae8c36c1-742c-4cb1-aa10-f07649028502.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Node.js we have different types of streams, and one of them is the Readable stream. You may have heard of it, or perhaps even used it a few times.</p>
<p>But do you know how to use it effectively? This question of efficiency comes when we're dealing with cases that go beyond basics. In such cases, a deeper understanding of the underlying mechanisms is important for making informed decisions.</p>
<p>This article explores the core concepts of Node.js Readable streams. After reading it, you'll deepen your understanding of how they work and when they can be used. As a bonus, we'll see why you should be careful when playing with the <code>highWaterMark</code> property of readable streams.</p>
<h2 id="heading-use-cases-of-readable-streams">Use cases of readable streams</h2>
<p>Here are a few examples of how readable streams can be used.</p>
<h3 id="heading-streaming-data-from-database">Streaming data from database</h3>
<p>If we have a large dataset in a database, or each single document is large by itself, we might want to stream documents from the database instead of trying to load all of them into memory at once.</p>
<p>Here is an example of how we can do so using a <code>Readable</code> stream and MongoDB.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Readable } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>; 

<span class="hljs-comment">// Leaving behind the scene all MongoDB configuration and connection setup</span>
<span class="hljs-comment">// before getting the actual reference to `db` object.</span>

<span class="hljs-comment">// Collection cursor</span>
<span class="hljs-keyword">const</span> cursor = db.collection(<span class="hljs-string">'documents'</span>).find();
<span class="hljs-keyword">const</span> collectionStream = <span class="hljs-keyword">new</span> Readable({
  <span class="hljs-attr">objectMode</span>: <span class="hljs-literal">true</span>,

  <span class="hljs-comment">// We're streaming objects, not buffers</span>
  <span class="hljs-keyword">async</span> read(size) {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> cursor.next();
      <span class="hljs-keyword">if</span> (result) {
        <span class="hljs-built_in">this</span>.push(result);
      } <span class="hljs-keyword">else</span> {
        <span class="hljs-built_in">this</span>.push(<span class="hljs-literal">null</span>); <span class="hljs-comment">// Signal the end of the stream</span>
        <span class="hljs-keyword">await</span> client.close();
      }
    } <span class="hljs-keyword">catch</span> (err) {
      <span class="hljs-built_in">this</span>.destroy(err); <span class="hljs-comment">// Handle errors by destroying the stream</span>
    }
  },
});
</code></pre>
<p>And after that we can use it in the following way:</p>
<pre><code class="lang-javascript">collectionStream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">doc</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the document</span>
});
</code></pre>
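<p>Since the snippet above needs a live MongoDB instance, here is the same pattern in a runnable form, with a hypothetical in-memory cursor (its <code>next()</code> resolves to the next document or <code>null</code>) standing in for the database:</p>

```javascript
import { Readable } from 'node:stream';

// A hypothetical cursor: next() resolves to the next document, or null when drained.
const docs = [{ id: 1 }, { id: 2 }, { id: 3 }];
const cursor = {
  async next() {
    return docs.shift() ?? null;
  },
};

const collectionStream = new Readable({
  objectMode: true, // We're streaming objects, not buffers
  async read() {
    try {
      const result = await cursor.next();
      this.push(result); // push(null) ends the stream once the cursor is drained
    } catch (err) {
      this.destroy(err); // Handle errors by destroying the stream
    }
  },
});

// Readable streams are async iterables, so we can consume with for await
const seen = [];
for await (const doc of collectionStream) {
  seen.push(doc.id);
}

console.log(seen); // [ 1, 2, 3 ]
```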
<p>Here is a diagram of the workflow:</p>
<pre><code class="lang-mermaid">sequenceDiagram
    Client-&gt;&gt;Readable Stream: Subscribes to the stream
    Readable Stream-&gt;&gt;Database: Reads data from the cursor
    Database-&gt;&gt;Readable Stream: Cursor returns document
    Readable Stream-&gt;&gt;Client: Returns document received from the cursor
</code></pre>
<h3 id="heading-streaming-file-from-s3-bucket-directly-into-the-application">Streaming file from S3 bucket directly into the application</h3>
<p>Most applications have some kind of workflow that involves files. Often, this is done by leveraging cloud storage services like AWS S3.</p>
<p>If at some point you need to download a file from S3 and process it in your application, Readable streams are one of the best options to do so.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { S3Client, GetObjectCommand } <span class="hljs-keyword">from</span> <span class="hljs-string">'@aws-sdk/client-s3'</span>;

<span class="hljs-comment">// Configure AWS credentials and region</span>
<span class="hljs-keyword">const</span> s3Client = <span class="hljs-keyword">new</span> S3Client({
  <span class="hljs-attr">region</span>: <span class="hljs-string">'YOUR_AWS_REGION'</span>,
  <span class="hljs-attr">credentials</span>: {
    <span class="hljs-attr">accessKeyId</span>: <span class="hljs-string">'YOUR_ACCESS_KEY_ID'</span>,
    <span class="hljs-attr">secretAccessKey</span>: <span class="hljs-string">'YOUR_SECRET_ACCESS_KEY'</span>,
  },
});

<span class="hljs-keyword">const</span> command = <span class="hljs-keyword">new</span> GetObjectCommand({
  <span class="hljs-attr">Bucket</span>: <span class="hljs-string">'my-bucket'</span>,
  <span class="hljs-attr">Key</span>: <span class="hljs-string">'path/to/file.txt'</span>,
});
<span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> s3Client.send(command);
<span class="hljs-keyword">const</span> s3Stream = response.Body;

s3Stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the data chunk</span>
});
</code></pre>
<p>This approach makes the process of downloading files from S3 more efficient since we're not waiting for the whole file to be transferred through the network.</p>
<pre><code class="lang-mermaid">sequenceDiagram
    Client-&gt;&gt;Readable Stream: Subscribes to the stream
    Readable Stream-&gt;&gt;S3 bucket: Reads a file from the bucket
    S3 bucket-&gt;&gt;Readable Stream: Start transferring the file
    Readable Stream-&gt;&gt;Client: Returns file chunks as soon as they are available
</code></pre>
<h3 id="heading-zlib-compression-and-decompression">Zlib compression and decompression</h3>
<p>As <a target="_blank" href="https://nodejs.org/api/zlib.html#zlib">Node.js documentation states</a>:</p>
<blockquote>
<p>Compression and decompression are built around the Node.js Streams API.</p>
</blockquote>
<p>This means that when using the zlib API, you're working with streams. You can find the following example in the official Node.js documentation.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createGzip } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:zlib'</span>;
<span class="hljs-keyword">import</span> { pipeline } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>;
<span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> gzip = createGzip();
<span class="hljs-keyword">const</span> source = createReadStream(<span class="hljs-string">'input.txt'</span>);
<span class="hljs-keyword">const</span> destination = createWriteStream(<span class="hljs-string">'input.txt.gz'</span>);

pipeline(source, gzip, destination, <span class="hljs-function">(<span class="hljs-params">err</span>) =&gt;</span> {
  <span class="hljs-comment">// Handle the error</span>
});
</code></pre>
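<p>To see the compression round trip without creating any files, here is an in-memory sketch: a string goes through <code>createGzip</code>, then back through <code>createGunzip</code>, using <code>pipeline</code> from the promises API (the helper and sample string are illustrative):</p>

```javascript
import { createGzip, createGunzip } from 'node:zlib';
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

const input = 'hello hello hello hello';

// A writable that collects output chunks in memory instead of a file.
function collector(chunks) {
  return new Writable({
    write(chunk, encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
}

// Compress...
const compressed = [];
await pipeline(Readable.from([input]), createGzip(), collector(compressed));

// ...and decompress back.
const decompressed = [];
await pipeline(Readable.from(compressed), createGunzip(), collector(decompressed));

console.log(Buffer.concat(decompressed).toString()); // 'hello hello hello hello'
```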
<p>There are other Node.js APIs and modules that leverage the readable streams:</p>
<ul>
<li><p>TCP Socket</p>
</li>
<li><p>HTTP request and response</p>
</li>
<li><p>Process stdin</p>
</li>
</ul>
<p>Now that we've explored some common use cases, let's dive deeper into how readable streams work under the hood and learn about the different reading modes and flowing states.</p>
<h2 id="heading-reading-modes-and-flowing-states">Reading modes and flowing states</h2>
<p>Every readable stream in Node.js operates in one of two modes: flowing or paused. These modes dictate how you receive data from a readable stream, much like how you might control water flow in a plumbing system.</p>
<p>P.S. If you're not familiar with the analogy of pipes and plumbing system, check out the previous article where we <a target="_blank" href="https://pavel-romanov.com/building-a-mental-model-of-nodejs-streams">build a mental model of how streams work in Node.js</a> using the pipe analogy.</p>
<h3 id="heading-flowing-mode-the-automatic-approach">Flowing mode: The automatic approach</h3>
<p>In flowing mode, data is read from the underlying system automatically and provided to your application as quickly as possible. This is similar to water flowing freely through an open pipe. One way to turn the flowing mode on is to attach the <code>data</code> event listener to the stream:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> filePath = <span class="hljs-string">'path/to/a/file.txt'</span>;
<span class="hljs-keyword">const</span> stream = createReadStream(filePath);

<span class="hljs-comment">// Once we attach this listener, data starts flowing automatically</span>
stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the data chunk</span>
});
</code></pre>
<p>This approach is perfect for scenarios where you want to process data as quickly as possible, such as streaming log files or processing real-time data.</p>
<h3 id="heading-paused-mode-the-manual-approach">Paused mode: The manual approach</h3>
<p>In paused mode, you control the flow of data. One way to explicitly request each chunk of data is by using the <code>stream.read()</code> method. Think of paused mode like a water dispenser with a button; you press the button only when you want water.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> filePath = <span class="hljs-string">'path/to/a/file.txt'</span>;
<span class="hljs-keyword">const</span> stream = createReadStream(filePath);

<span class="hljs-comment">// Later in your code, when you need to read data</span>
<span class="hljs-keyword">const</span> chunk = stream.read();

<span class="hljs-keyword">if</span> (chunk !== <span class="hljs-literal">null</span>) {
  <span class="hljs-comment">// Process the data chunk</span>
}
</code></pre>
<p><strong>Warning:</strong> don't mix these two modes on the same stream. It will lead to unexpected behavior that is hard to debug.</p>
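<p>Here is a small, self-contained sketch of switching between the two modes. It writes a throwaway <code>sample.txt</code> (a made-up file name) so it can run anywhere, and uses the <code>isPaused()</code> method to inspect the current mode:</p>

```javascript
import { createReadStream, writeFileSync } from 'node:fs';

// Create a small sample file so the example is self-contained
writeFileSync('sample.txt', 'hello streams');

const stream = createReadStream('sample.txt');

// Attaching a 'data' listener puts the stream into flowing mode
stream.on('data', (chunk) => {
  // Process the data chunk
});

console.log(stream.isPaused()); // false: the stream is flowing

stream.pause();                 // switch to paused mode: 'data' events stop
console.log(stream.isPaused()); // true

stream.resume();                // back to flowing mode
console.log(stream.isPaused()); // false
```

Notice that the switch happens through the stream's API calls, not by assigning anything to the stream yourself.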
<h3 id="heading-readable-flowing-states">Readable flowing states</h3>
<p>These two reading modes are a simplified view of the underlying abstraction that Node.js operates with, called the readable flowing state. This state is represented by the <code>readableFlowing</code> property of the readable stream.</p>
<p>The <code>readableFlowing</code> state can contain one of three values: <code>null</code>, <code>false</code>, and <code>true</code>.</p>
<p><strong>Null</strong>: The initial state of a newly created stream. No consumers are attached to the stream, so it's not actively reading data.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Readable } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>;
<span class="hljs-keyword">const</span> stream = <span class="hljs-keyword">new</span> Readable({ read(size) {} });

<span class="hljs-built_in">console</span>.log(stream.readableFlowing); <span class="hljs-comment">// null</span>
</code></pre>
<p><strong>False (Paused)</strong>: The stream has consumers but is temporarily paused. Data might be available but won't be delivered until the stream is resumed. This is common when using <code>pause()</code> or switching to manual mode.</p>
<pre><code class="lang-javascript">stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Handle data</span>
});

stream.pause(); <span class="hljs-comment">// Enters paused state</span>

<span class="hljs-built_in">console</span>.log(stream.readableFlowing); <span class="hljs-comment">// false</span>
</code></pre>
<p><strong>True (Flowing)</strong>: The stream is actively delivering data to consumers. Data events are being emitted automatically. This is common when using event-based consumption.</p>
<pre><code class="lang-javascript">stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function"><span class="hljs-params">chunk</span> =&gt;</span> {
  <span class="hljs-comment">// Handle data</span>
});
<span class="hljs-comment">// Enters flowing state</span>

<span class="hljs-built_in">console</span>.log(stream.readableFlowing); <span class="hljs-comment">// true</span>
</code></pre>
<p><strong>Note:</strong> the <code>readableFlowing</code> property only reflects the stream's internal state; don't try to set it yourself. Change states through the stream's API instead: attaching or removing listeners, <code>pause()</code>, and <code>resume()</code>.</p>
<h2 id="heading-consuming-readable-streams-data">Consuming readable streams data</h2>
<p>The sole purpose of a readable stream is to deliver data to consumers. In Node.js, there are several ways to consume data from a readable stream, each with its own characteristics and use cases.</p>
<p>In this article, we'll review only the methods we can use when dealing with a single readable stream: the <code>data</code> event, the <code>readable</code> event, and async iterators. In later articles, we'll get familiar with the <code>pipe</code> and <code>pipeline</code> functions.</p>
<h3 id="heading-using-the-data-event">Using the <code>data</code> event</h3>
<p>This is probably the most common way to consume data from a readable stream. All we have to do is attach an event listener to the <code>data</code> event.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(
  <span class="hljs-string">'path/to/a/file/text.txt'</span>,
  { <span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span> },
);

stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the data chunk</span>
});
</code></pre>
<p>Whenever the internal buffer fills with data, the readable stream offloads it as a chunk, and you receive that chunk in the callback.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733281457747/eeef8bc5-6685-41ae-ac4e-700df7399807.jpeg" alt class="image--center mx-auto" /></p>
<p>And this happens no matter what. For example, if you need to process each chunk asynchronously, this behavior might not be ideal for you.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(
  <span class="hljs-string">'path/to/a/file/text.txt'</span>,
  { <span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span> },
);

stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-keyword">async</span> (chunk) =&gt; {

  <span class="hljs-comment">// Imitation of data processing.</span>
  <span class="hljs-keyword">await</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function">(<span class="hljs-params">resolve</span>) =&gt;</span> <span class="hljs-built_in">setTimeout</span>(resolve, <span class="hljs-number">3000</span>));
  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Data chunk: '</span>, chunk);
});
</code></pre>
<p>In this example, you'll see most of the console logs appear at roughly the same time. The reason is that as soon as the buffer empties, the stream starts reading the next chunk of data, regardless of whether processing of the previous one has finished.</p>
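<p>If you do need sequential asynchronous processing with the <code>data</code> event, one common workaround is to pause the stream while a chunk is being processed. A self-contained sketch (the file name, sizes, and delay are made up for illustration):</p>

```javascript
import { createReadStream, writeFileSync } from 'node:fs';

// Sample file large enough to produce several chunks
writeFileSync('sample.txt', 'x'.repeat(64));

const stream = createReadStream('sample.txt', {
  encoding: 'utf8',
  highWaterMark: 16, // small buffer so we get multiple chunks
});

stream.on('data', async (chunk) => {
  stream.pause(); // stop further 'data' events while we work

  // Imitation of asynchronous data processing
  await new Promise((resolve) => setTimeout(resolve, 100));
  console.log('Processed chunk of length:', chunk.length);

  stream.resume(); // ask for the next chunk
});
```

Here the logs appear one by one, because the stream doesn't read ahead while it is paused.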
<h3 id="heading-using-the-readable-event">Using the readable event</h3>
<p>The <code>readable</code> event is somewhat similar to the <code>data</code> event in that it is emitted when the internal buffer of the readable stream is filled with data.</p>
<p>However, the handler of the <code>readable</code> event doesn't offload the buffer automatically. It only signals to the listener that the internal buffer is loaded and ready to be read. To explicitly read from the internal buffer, you can use the <code>read</code> method.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(
  <span class="hljs-string">'path/to/a/file/text.txt'</span>,
  { <span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span> },
);

<span class="hljs-comment">// When stream emits the event it means that the internal buffer is filled with the data</span>
stream.on(<span class="hljs-string">'readable'</span>, <span class="hljs-function">() =&gt;</span> {

  <span class="hljs-comment">// We have to read the data manually using the `read` method</span>
  <span class="hljs-keyword">const</span> chunk = stream.read();

  <span class="hljs-comment">// Process data chunk;</span>
});
</code></pre>
<p>Here is a diagram to better show how data flows into and out of the buffer when dealing with the <code>readable</code> event.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733282240932/3c08138b-d23b-45b7-ad0c-2c5fdf5141e9.jpeg" alt class="image--center mx-auto" /></p>
<p>As you can see, the buffer is still filled with data after the event fires.</p>
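<p>Because the buffer can hold more than one chunk, a common pattern is to drain it in a loop until <code>read()</code> returns <code>null</code>. A self-contained sketch (the sample file is generated by the snippet itself):</p>

```javascript
import { createReadStream, writeFileSync } from 'node:fs';

// Create a sample file so the example is self-contained
writeFileSync('sample.txt', 'hello world');

const stream = createReadStream('sample.txt', { encoding: 'utf8' });

stream.on('readable', () => {
  let chunk;
  // Keep calling read() until the internal buffer is empty
  while ((chunk = stream.read()) !== null) {
    console.log('Read chunk: ', chunk);
  }
});
```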
<p>Such manual control over when we read the data can be quite handy when you want to manage the data flow explicitly. For example, you can read from the stream only after some asynchronous processing has finished.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(
  <span class="hljs-string">'path/to/a/file/text.txt'</span>,
  { <span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span> },
);

stream.on(<span class="hljs-string">'readable'</span>, <span class="hljs-keyword">async</span> () =&gt; {
  <span class="hljs-keyword">await</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function">(<span class="hljs-params">resolve</span>) =&gt;</span> <span class="hljs-built_in">setTimeout</span>(resolve, <span class="hljs-number">3000</span>));
  <span class="hljs-keyword">const</span> chunk = stream.read();
  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Readable chunk: '</span>, chunk);
});
</code></pre>
<p>In this example, you'll see the console logs print one by one, with an interval of approximately 3 seconds.</p>
<h3 id="heading-using-async-iterator">Using async iterators</h3>
<p>Readable streams implement the async iterator interface. It is the most recent API for consuming streams, and members of the core Node.js team recommend using it over the <code>data</code> and <code>readable</code> events most of the time.</p>
<p>The benefit of working with async iterators is that you don't have to deal with any events yourself. Everything is handled internally.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> stream = createReadStream(
  <span class="hljs-string">`<span class="hljs-subst">${<span class="hljs-keyword">import</span>.meta.dirname}</span>/new-text.txt`</span>,
  { <span class="hljs-attr">encoding</span>: <span class="hljs-string">'utf8'</span> },
);

<span class="hljs-keyword">for</span> <span class="hljs-keyword">await</span> (<span class="hljs-keyword">const</span> chunk <span class="hljs-keyword">of</span> stream) {
  <span class="hljs-comment">// Process the data chunk</span>
}
</code></pre>
<p>We're using a <code>for await...of</code> loop to iterate over the data that the stream emits.</p>
<p>This approach combines the best of both the <code>data</code> and <code>readable</code> events in a single API. We receive a chunk of data whenever it is ready, and we don't have to call the <code>read</code> method manually as we do with the <code>readable</code> event.</p>
<p>At the same time, we can perform asynchronous handling of each chunk, and the async iterator won't rush to read all of the data as the <code>data</code> event does. It waits until the processing of a chunk is finished and only then moves forward.</p>
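<p>To see this sequential behavior in action, here's a self-contained sketch with a simulated asynchronous step inside the loop (it assumes an ES module context for top-level <code>await</code>; the file name, sizes, and delay are made up):</p>

```javascript
import { createReadStream, writeFileSync } from 'node:fs';

// Sample file large enough to produce several chunks
writeFileSync('sample.txt', 'x'.repeat(64));

const stream = createReadStream('sample.txt', {
  encoding: 'utf8',
  highWaterMark: 16, // small buffer so we get multiple chunks
});

for await (const chunk of stream) {
  // The next chunk is not requested until this body finishes
  await new Promise((resolve) => setTimeout(resolve, 100));
  console.log('Processed chunk of length:', chunk.length);
}
```

Unlike the <code>data</code> event example earlier, the logs here appear one after another, spaced by the delay.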
<p>You now know at least 3 different ways to consume data from a readable stream. Another thing that affects how fast you get data from a stream is the <code>highWaterMark</code> property.</p>
<h2 id="heading-impact-of-highwatermark-on-readable-streams-performance">Impact of <code>highWaterMark</code> on readable streams performance</h2>
<p>While the Node.js documentation states that streams in flowing mode emit data as quickly as possible, there's a nuance to this behavior. Data is emitted only after the internal buffer is filled. This buffer size is controlled by the <code>highWaterMark</code> property of the stream.</p>
<p>If you set a smaller <code>highWaterMark</code> value, the first chunk will be emitted faster because the buffer fills up more quickly. However, this doesn't necessarily mean the overall execution time will be faster. In fact, for larger files, a smaller <code>highWaterMark</code> can lead to slower processing time.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> filePath = <span class="hljs-string">'data.txt'</span>;

<span class="hljs-comment">// Stream with a small highWaterMark</span>
<span class="hljs-keyword">const</span> stream1 = createReadStream(filePath, { <span class="hljs-attr">highWaterMark</span>: <span class="hljs-number">16</span> });

<span class="hljs-comment">// The first chunk is emitted faster, but the overall processing is slower</span>
stream1.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the data</span>
});

<span class="hljs-comment">// Stream with the default highWaterMark (64KB)</span>
<span class="hljs-keyword">const</span> stream2 = createReadStream(filePath);

<span class="hljs-comment">// The first chunk is emitted slightly slower, but the overall processing is faster</span>
stream2.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the data</span>
});
</code></pre>
<p>Here are the reasons why:</p>
<ul>
<li><p>Increased overhead: with a smaller buffer, the stream has to process and emit chunks far more frequently, and each emission carries a fixed per-chunk cost.</p>
</li>
<li><p>Reduced throughput: The constant filling and emptying of a small buffer can limit the overall data throughput compared to a larger buffer.</p>
</li>
</ul>
<p>Therefore, while a smaller <code>highWaterMark</code> might provide a quicker initial response, the default <code>highWaterMark</code> is generally optimized for efficient processing of larger data streams.</p>
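<p>One way to observe this difference is to count how many chunks each stream emits. A self-contained sketch (it generates a 1KB <code>data.txt</code> for illustration):</p>

```javascript
import { createReadStream, writeFileSync } from 'node:fs';

// 1KB sample file so the example is self-contained
writeFileSync('data.txt', 'x'.repeat(1024));

// Small buffer: many small chunks, many callback invocations
let smallChunks = 0;
const smallStream = createReadStream('data.txt', { highWaterMark: 16 });
smallStream.on('data', () => { smallChunks += 1; });
smallStream.on('end', () => console.log('16-byte buffer chunks:', smallChunks));

// Default buffer (64KB): the whole 1KB file arrives as a single chunk
let defaultChunks = 0;
const defaultStream = createReadStream('data.txt');
defaultStream.on('data', () => { defaultChunks += 1; });
defaultStream.on('end', () => console.log('default buffer chunks:', defaultChunks));
```

With a 1KB file, the 16-byte buffer emits 64 chunks, while the default buffer delivers everything in one chunk; that difference in per-chunk overhead is exactly what slows the small-buffer stream down on large files.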
<h2 id="heading-conclusion">Conclusion</h2>
<p>Node.js readable streams are not as simple as you might’ve thought initially, especially when it comes to understanding how they behave in certain use cases, under high load, or during complex manipulation. Of course, we haven’t touched on every point, but this should be enough for you to improve your overall understanding of readable streams and start your own research.</p>
]]></content:encoded></item><item><title><![CDATA[Building a Mental Model of Node.js Streams]]></title><description><![CDATA[Have you ever worked with Node.js streams? What was your experience like?
When I first tried to work with streams, I was confused, to say the least. The concept was completely new to me. I thought I could just ignore them, but it turns out they're ev...]]></description><link>https://pavel-romanov.com/building-a-mental-model-of-nodejs-streams</link><guid isPermaLink="true">https://pavel-romanov.com/building-a-mental-model-of-nodejs-streams</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Tue, 22 Oct 2024 03:52:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729569006139/926ea8a3-7353-4849-9e09-b1ded1bbddff.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever worked with Node.js streams? What was your experience like?</p>
<p>When I first tried to work with streams, I was confused, to say the least. The concept was completely new to me. I thought I could just ignore them, but it turns out they're everywhere in Node.js. Even core modules like <code>fs</code> and <code>http</code> use streams under the hood. So, I had to learn them and understand how they work.</p>
<p>What helped me was building a strong mental model that consists of multiple concepts. In this article, we'll explore these concepts and build a mental model of Node.js streams together.</p>
<h2 id="heading-what-are-nodejs-streams">What are Node.js Streams?</h2>
<p>The main idea behind streams is that they take pieces of data from one place and transfer them to another. There are 4 important parts that I want to highlight based on this definition:</p>
<ul>
<li><p>Streams transfer data in pieces, not as a whole</p>
</li>
<li><p>Streams transfer pieces of data in a specific size</p>
</li>
<li><p>Streams aren't interested in the transferred data</p>
</li>
<li><p>Streams simply provide a mechanism for data transfer</p>
</li>
</ul>
<p>A common analogy used to describe streams is a pipe. However, this analogy often misses 2 crucial parts: the producer and the consumer. Let's use the same analogy but make it more complete.</p>
<p>Imagine a huge reservoir of water, and you have a house nearby. To supply water to your house, you need to build a pipe from the reservoir to your home.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729565833427/e1wh8-KeD.jpg?auto=format" alt="Reservoir of water connected to a house through a pipe" class="image--center mx-auto" /></p>
<p>P.S. I'm not a plumber, so don't take this drawing too seriously.</p>
<p>This analogy illustrates the three key parts of a stream:</p>
<ul>
<li><p>The water reservoir is a producer of water</p>
</li>
<li><p>The pipe is a stream that transfers water from the reservoir to your home</p>
</li>
<li><p>Your home is a consumer of water</p>
</li>
</ul>
<p>Coming back to Node.js streams. Let's compare the pipe analogy to how they behave:</p>
<ul>
<li><p>The pipe doesn't transfer the entire reservoir of water all at once</p>
</li>
<li><p>The pipe transfers water in pieces, each of a specific size that it can handle</p>
</li>
<li><p>The pipe is not interested in the water itself; it's just a way to transfer it</p>
</li>
<li><p>The pipe is just a mechanism to transfer water from one place to another</p>
</li>
</ul>
<p>Looks pretty similar to Node.js streams, right?</p>
<h2 id="heading-when-are-nodejs-streams-used">When are Node.js streams used?</h2>
<p>Before going into the specific details of what streams are and how they work, let's first understand when they’re used.</p>
<h3 id="heading-real-time-data-processing">Real-time data processing</h3>
<p>Streams are highly effective for processing data that is generated incrementally or received in parts over time.</p>
<p>An ideal example of this is a WebSocket protocol. In short, it's a protocol that allows you to establish a two-way communication channel between the client and the server.</p>
<p>We'll get into more details on this protocol in the upcoming articles. We'll take the <a target="_blank" href="https://github.com/websockets/ws">WS</a> library as an example. It uses streams heavily. Here is an example where the abstraction called <code>Sender</code> <a target="_blank" href="https://github.com/websockets/ws/blob/019f28ff1ffddfcdc428d1de5ecd98648057a2ab/lib/sender.js#L558">implements a backpressure mechanism</a>.</p>
<p>We'll talk about backpressure in an upcoming section. And this is just one example. You can explore the library further and see other use cases.</p>
<h3 id="heading-network-interactions">Network interactions</h3>
<p>Every time you create a server using the Node.js API, you're creating a duplex stream. The HTTP module in Node.js uses an abstraction called <code>Socket</code> to create a connection with a network socket. This <code>Socket</code> abstraction <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/net.js#L508">extends from</a> the <code>Duplex</code> stream.</p>
<pre><code class="lang-javascript">ObjectSetPrototypeOf(Socket.prototype, stream.Duplex.prototype);
ObjectSetPrototypeOf(Socket, stream.Duplex);
</code></pre>
<p>Whenever you see a construction like the following:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createServer } <span class="hljs-keyword">from</span> <span class="hljs-string">'http'</span>;

<span class="hljs-keyword">const</span> server = createServer();
</code></pre>
<p>Know that under the hood, you're creating a duplex stream.</p>
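<p>Because of that, the request and response objects you get in a handler are streams too. A self-contained sketch that sends itself a request (the payload and the use of a random port are made up for illustration):</p>

```javascript
import { createServer, request } from 'node:http';

const server = createServer((req, res) => {
  let body = '';

  // req is a readable stream: the body arrives in chunks
  req.on('data', (chunk) => { body += chunk; });

  req.on('end', () => {
    // res is a writable stream
    res.end(`Received ${body.length} bytes`);
  });
});

// Listen on a random free port and send ourselves a request
server.listen(0, () => {
  const { port } = server.address();
  const req = request({ port, method: 'POST' }, (res) => {
    res.on('data', (chunk) => console.log(chunk.toString()));
    res.on('end', () => server.close());
  });
  req.end('hello');
});
```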
<h3 id="heading-working-with-large-datasets">Working with large datasets</h3>
<p>Imagine that you have a file that is 100GB in size. You need to parse it and process some data. How would you do it?</p>
<p>If you try to read the file using an API like <code>readFileSync</code> or <code>readFile</code>, you'll crash your program.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { readFileSync, readFile } <span class="hljs-keyword">from</span> <span class="hljs-string">'fs'</span>;

<span class="hljs-keyword">const</span> largeFilePath = <span class="hljs-string">'path/to/large/file.txt'</span>;

<span class="hljs-comment">// Both of these will crash your program</span>
<span class="hljs-keyword">const</span> data = readFileSync(largeFilePath);
<span class="hljs-keyword">const</span> asyncData = <span class="hljs-keyword">await</span> readFile(largeFilePath);
</code></pre>
<p>The problem is that you're trying to load the whole file content into memory using these read APIs. Doesn't sound efficient at all. What we can do instead is to process the file's content in chunks.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'fs'</span>;

<span class="hljs-keyword">const</span> largeFilePath = <span class="hljs-string">'path/to/large/file.txt'</span>;
<span class="hljs-keyword">const</span> stream = createReadStream(largeFilePath);

stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">(<span class="hljs-params">chunk</span>) =&gt;</span> {
  <span class="hljs-comment">// Process the chunk here</span>
});
</code></pre>
<p>With this approach, we're not waiting for the whole file to be loaded into memory. Whenever a chunk of data is ready, we're processing it.</p>
<h3 id="heading-data-transformation">Data transformation</h3>
<p>All previous examples were about the cases where we either read data from somewhere or write data to somewhere. But we can also use streams to transform data that we already have in memory.</p>
<p>A good example of this is data compression/decompression. Here is an example taken from the <a target="_blank" href="https://nodejs.org/api/zlib.html#zlib">zlib</a> module in Node.js documentation.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { createGzip } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:zlib'</span>;
<span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { pipeline } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream/promises'</span>;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">do_gzip</span>(<span class="hljs-params">input, output</span>) </span>{
  <span class="hljs-keyword">const</span> gzip = createGzip();

  <span class="hljs-comment">// Create a read stream to read data from the input</span>
  <span class="hljs-keyword">const</span> source = createReadStream(input);

  <span class="hljs-comment">// Create a write stream to write data to the output</span>
  <span class="hljs-keyword">const</span> destination = createWriteStream(output);

  <span class="hljs-comment">// Pipe the source stream to the gzip stream,</span>
  <span class="hljs-comment">// then to the destination stream</span>
  <span class="hljs-keyword">await</span> pipeline(source, gzip, destination);
}
</code></pre>
<p>In this code snippet, we're creating a read stream, and whenever data comes from this read stream, we pass it down to the gzip. When the gzip stream compresses the data, we pass it down to the write stream.</p>
<p>You don't have to understand how this code works just yet. Just understand that streams can be used to transform different data.</p>
<h2 id="heading-dont-use-streams-in-this-case">Don't use streams in this case</h2>
<p>You don't want to use streams when the data you're working with is already in memory. There is just little to no benefit you can gain from using streams.</p>
<p>So please, try to avoid using streams when all pieces of data that you need are already in memory. Don't make your life harder.</p>
<h2 id="heading-core-concepts-on-nodejs-streams">Core concepts on Node.js streams</h2>
<p>You understand what streams are, when to use them, and when not to. Now, you're ready to dive deeper into some of the core concepts of streams in Node.js.</p>
<h3 id="heading-event-driven-architecture">Event-driven architecture</h3>
<p>You know that streams are like pipes. But what exactly makes them work this way? It is all thanks to the event-driven concepts that streams are built upon. In particular, all streams in Node.js are <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/streams/legacy.js#L14">extended from the <code>EventEmitter</code> class</a>.</p>
<p>The way <code>EventEmitter</code> works is very simple. It has some internal state where it stores all events and listeners of these events.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">class</span> EventEmitter {
  <span class="hljs-comment">// Map of events and their listeners</span>
  <span class="hljs-comment">// Each event can have multiple listeners</span>
  #events = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Map</span>&lt;<span class="hljs-built_in">string</span>, (<span class="hljs-function">() =&gt;</span> <span class="hljs-built_in">void</span>)[]&gt;();

  <span class="hljs-comment">// Register a new listener for the event</span>
  on(eventName: <span class="hljs-built_in">string</span>, callback: <span class="hljs-function">() =&gt;</span> <span class="hljs-built_in">void</span>) {
    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">this</span>.#events.has(eventName)) {
      <span class="hljs-built_in">this</span>.#events.set(eventName, []);
    }

    <span class="hljs-built_in">this</span>.#events.get(eventName).push(callback);
  }

  <span class="hljs-comment">// Triggers all listeners related to the event.</span>
  emit(eventName: <span class="hljs-built_in">string</span>) {
    <span class="hljs-keyword">const</span> listeners = <span class="hljs-built_in">this</span>.#events.get(eventName);

    <span class="hljs-keyword">if</span> (!listeners) {
      <span class="hljs-keyword">return</span>;
    }

    listeners.forEach(<span class="hljs-function">(<span class="hljs-params">listener</span>) =&gt;</span> listener());
  }
}
</code></pre>
<p>It is a very simplified version, but it gives you an idea of how <code>EventEmitter</code> works. You can read the full implementation in the <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/events.js">Node.js source code</a>.</p>
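<p>The built-in <code>EventEmitter</code> from <code>node:events</code> behaves the same way as the simplified version above. A quick sketch:</p>

```javascript
import { EventEmitter } from 'node:events';

const emitter = new EventEmitter();

// Register two listeners for the same event
emitter.on('data', () => console.log('first listener'));
emitter.on('data', () => console.log('second listener'));

// Both listeners fire, in registration order
emitter.emit('data');
```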
<p>When you work with streams, you can add a listener to some predefined set of events.</p>
<pre><code class="lang-typescript">stream.on(<span class="hljs-string">'data'</span>, <span class="hljs-function">() =&gt;</span> {});
</code></pre>
<p>In this example, we add a listener to the <code>data</code> event. Whenever a chunk of data is ready, the stream calls the <code>emit</code> with the <code>data</code> event name, and all listeners are called.</p>
<p>It's the exact mechanism that makes streams work like pipes, where we get data from one end and pass it through to the other end.</p>
<h3 id="heading-backpressure">Backpressure</h3>
<p>Streams can be used to process large datasets efficiently. But there is a catch: what if the rate of data production is so high that at some point in time, we have more data in our program than allocated memory can handle? Right, the program will crash.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729565914680/ZiEfHK9Dl.jpg?auto=format" alt="Example of how memory can overflow while transferring all data at once" class="image--center mx-auto" /></p>
<p>This means that just the abstraction of a stream is not enough to prevent such cases from happening. Streams have a backpressure mechanism in place for such cases.</p>
<p>Backpressure might sound like a fancy term, but in reality, it is quite simple. The main idea of backpressure is that we have some limit on how much data we can process at a time.</p>
<p>Let's get back to the example with reading a large file. There are 2 parts of this process that we're interested in: the producer of data and the consumer of data. The producer of data is the underlying OS mechanism that reads the file and produces the data.</p>
<p>If the producer tries to push too much data, a stream can signal to the producer that it needs to slow down because it can't take any more data at the moment. But how does the stream know when it's full?</p>
<p>Each stream has an internal buffer, and whenever new data comes in and the old one comes out, the "buffering" mechanism comes into play.</p>
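<p>Before we get there, here's a self-contained sketch of that backpressure signaling in action (the file names and sizes are made up; the tiny <code>highWaterMark</code> on the destination just makes backpressure easy to trigger):</p>

```javascript
import { createReadStream, createWriteStream, writeFileSync } from 'node:fs';

// Create a sample source file so the example is self-contained
writeFileSync('source.txt', 'x'.repeat(200000));

const source = createReadStream('source.txt');
// Tiny highWaterMark so the destination's buffer fills quickly
const destination = createWriteStream('copy.txt', { highWaterMark: 1024 });

source.on('data', (chunk) => {
  // write() returns false once the destination's buffer is full
  const canTakeMore = destination.write(chunk);

  if (!canTakeMore) {
    source.pause(); // signal the producer to slow down
    // resume once the destination's buffer has drained
    destination.once('drain', () => source.resume());
  }
});

source.on('end', () => destination.end());
destination.on('finish', () => console.log('copy finished'));
```

This pause-on-full, resume-on-drain loop is exactly what higher-level helpers like <code>pipe</code> and <code>pipeline</code> do for you automatically.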
<h3 id="heading-buffering">Buffering</h3>
<p>Each stream has an internal buffer. If we work with an API that enables the backpressure mechanism, this buffer is used to store data that comes into the stream.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729565955491/v6U27uumd.jpg?auto=format" alt="Illustration of how a stream internal buffer changes when data comes in and out" class="image--center mx-auto" /></p>
<p>If data comes into the stream but doesn't come out of the stream, the buffer steadily gets filled until it reaches the cap. The cap, in this case, is <code>highWaterMark</code> property set for each individual stream.</p>
<p>Here is an example of how we can set <code>highWaterMark</code> property when reading a file.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { createReadStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-keyword">const</span> filePath = <span class="hljs-string">'path/to/file.txt'</span>;

<span class="hljs-keyword">const</span> readStream = createReadStream(filePath, { highWaterMark: <span class="hljs-number">1024</span> });
</code></pre>
<p>The <code>highWaterMark</code> is set to 64KB for the <code>createReadStream</code> function by default. When the internal buffer frees up some space, the stream can start reading more data from the source.</p>
<h3 id="heading-piping-and-chaining">Piping and chaining</h3>
<p>In any reasonably complex Node.js application, you'll need to transform data that comes from a stream or send this data to some other destination. In cases like this, a concept called "piping" comes in handy.</p>
<p>You can create a chain of streams where one stream is connected to another, and whenever data comes into the first stream in the chain, it goes through the whole chain of streams. If you're familiar with reactive programming and things like RxJS, this concept should feel familiar.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { createReadStream, createWriteStream } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;
<span class="hljs-keyword">import</span> { createGzip } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:zlib'</span>;
<span class="hljs-keyword">import</span> { pipeline } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:stream'</span>;

<span class="hljs-keyword">const</span> source = createReadStream(<span class="hljs-string">'path/to/file.txt'</span>);
<span class="hljs-keyword">const</span> destination = createWriteStream(<span class="hljs-string">'path/to/file.txt.gz'</span>);
<span class="hljs-keyword">const</span> gzip = createGzip();

<span class="hljs-keyword">await</span> pipeline(source, gzip, destination);
</code></pre>
<p>In this example the <code>source</code> stream triggers the whole pipeline. It goes like this:</p>
<ol>
<li><p><code>source</code> stream reads data from the file</p>
</li>
<li><p><code>source</code> stream passes this data to the <code>gzip</code> stream</p>
</li>
<li><p><code>gzip</code> stream compresses the data</p>
</li>
<li><p><code>gzip</code> stream passes the compressed data to the <code>destination</code> stream</p>
</li>
<li><p><code>destination</code> stream writes the compressed data to the file</p>
</li>
<li><p>The whole pipeline is finished</p>
</li>
</ol>
<p>Every stage of the pipeline has its own internal buffer and backpressure mechanism. It means that if the <code>gzip</code> stream can't handle the data that comes from the <code>source</code> stream, it can signal to the <code>source</code> stream to slow down. The same thing goes for the <code>destination</code> stream.</p>
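<p>To see how a custom stage participates in this chain, here is a small self-contained sketch (the <code>upper</code> transform and the sample data are made up for illustration). Each stage receives the next chunk only after it calls its callback, which is how a slow stage propagates backpressure upstream:</p>

```javascript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// A transform stage: calling callback() both emits the result downstream
// and signals that this stage is ready for the next chunk.
const upper = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

const chunks = [];
const sink = new Writable({
  write(chunk, encoding, callback) {
    chunks.push(chunk.toString());
    callback(); // signal readiness for more data
  },
});

// Readable.from turns any iterable into a readable stream.
await pipeline(Readable.from(['hello ', 'world']), upper, sink);
console.log(chunks.join('')); // 'HELLO WORLD'
```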
<h2 id="heading-conclusion">Conclusion</h2>
<p>Streams are at the heart of any Node.js application, whether you use them explicitly or not. They are also one of the most powerful features in Node.js and are used in many different places, from network interactions to file processing.</p>
<p>They are especially useful when you need to process large datasets or work with real-time data. The core mental model of streams is built around the following concepts:</p>
<ol>
<li><p>Data over time</p>
</li>
<li><p>Event-driven architecture</p>
</li>
<li><p>Backpressure</p>
</li>
<li><p>Buffering</p>
</li>
<li><p>Piping and chaining</p>
</li>
</ol>
<p>By understanding these concepts and having a clear picture of how streams operate at a conceptual level, you can build more efficient Node.js apps.</p>
]]></content:encoded></item><item><title><![CDATA[Profiling Node.js application with VS Code]]></title><description><![CDATA[Profiling your Node.js applications could be exhausting, especially when you have to switch between different tools to get a full picture of your app's performance.
The constant switching of contexts can kill your productivity.
What if I tell you tha...]]></description><link>https://pavel-romanov.com/profiling-nodejs-application-with-vs-code</link><guid isPermaLink="true">https://pavel-romanov.com/profiling-nodejs-application-with-vs-code</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Thu, 03 Oct 2024 03:45:53 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1727927089387/b48d3d29-77bc-4bca-9b6f-8db855c463c0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Profiling your Node.js applications could be exhausting, especially when you have to switch between different tools to get a full picture of your app's performance.</p>
<p>The constant switching of contexts can kill your productivity.</p>
<p>What if I told you that it doesn't have to be that way? What if you could perform all the necessary profiling routines within the same workspace you're already using for coding?</p>
<p>In this article, we'll explore how to use the VS Code built-in debugger to profile and troubleshoot common performance issues in your Node.js application.</p>
<p>You'll be surprised how much you can do in terms of profiling by just using VS Code.</p>
<h2 id="heading-setup">Setup</h2>
<p>To illustrate the profiling process, we'll need some code. I've created a <a target="_blank" href="https://github.com/pavel-romanov8/nodejs-profiling-examples">GitHub repository</a> that contains common performance issues you might encounter in your Node.js application.</p>
<p>The repository contains a simple Node.js application with three routes, each designed to demonstrate a specific performance issue.</p>
<ul>
<li><p>A CPU-intensive task that blocks the main thread.</p>
</li>
<li><p>An asynchronous operation with the waterfall problem, where execution goes sequentially instead of in parallel.</p>
</li>
<li><p>A memory leak.</p>
</li>
</ul>
<p>Each route has two implementations: one with a problem that can be spotted using the VS Code profiler, and the other an optimized version offering the same functionality.</p>
<p>I encourage you to clone the repository and explore the code to better understand the topics we're about to discuss.</p>
<h2 id="heading-profiling">Profiling</h2>
<p>Now that we've set up the project, let's explore how profiling in VS Code works.</p>
<p>Before diving into the specific problems, I want to mention that VS Code generates a profiling report after each profiling session. This profiling report can be viewed in 2 different ways:</p>
<ul>
<li><p>Table</p>
</li>
<li><p>Flamegraph</p>
</li>
</ul>
<p>While the table view is built-in, the flamegraph view requires a <a target="_blank" href="https://marketplace.visualstudio.com/items?itemName=ms-vscode.vscode-js-profile-flame">separate flamegraph extension</a> to enable it.</p>
<p>Having multiple ways to visualize your data leads to better understanding. You can catch insights using one type of view that are hard to notice using the other type.</p>
<h3 id="heading-cpu-intensive-endpoint">CPU-intensive endpoint</h3>
<p>We start with the CPU-intensive endpoint. The main problem behind every CPU-intensive operation in JavaScript is that it blocks the execution thread. Other tasks can hardly make any progress while a CPU-intensive operation is running.</p>
<p>Sure, it can be solved by moving this task into a dedicated thread, but more often than not, you can avoid it by using a more efficient algorithm or data structure.</p>
<p>In our case, there are two implementations of this endpoint: one with a high CPU load and the other without it.</p>
<p>Let's look at implementation with high CPU consumption first.</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
  <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciRecursive</span>(<span class="hljs-params">n</span>) </span>{
    <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
      <span class="hljs-keyword">return</span> n;
    }
    <span class="hljs-keyword">return</span> fibonacciRecursive(n - <span class="hljs-number">1</span>) + fibonacciRecursive(n - <span class="hljs-number">2</span>);
   }
  fibonacciRecursive(<span class="hljs-number">45</span>);
  cb();
}
</code></pre>
<p>Here is the code that does the same thing in terms of functionality but consumes way less CPU resources.</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
  <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciIterative</span>(<span class="hljs-params">n</span>) </span>{
    <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
      <span class="hljs-keyword">return</span> n;
    }
    <span class="hljs-keyword">let</span> prev = <span class="hljs-number">0</span>, curr = <span class="hljs-number">1</span>;
    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">2</span>; i &lt;= n; i++) {
      <span class="hljs-keyword">const</span> next = prev + curr;
      prev = curr;
      curr = next;
    }
    <span class="hljs-keyword">return</span> curr;
  }
  fibonacciIterative(<span class="hljs-number">45</span>)
  cb();
}
</code></pre>
<p>Both versions calculate the 45th Fibonacci number. The first implementation uses a recursive, CPU-intensive approach, while the second one employs an iterative, more efficient approach.</p>
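<p>Before reaching for the profiler, we can confirm the gap with a crude timing check. Here is a sketch (using n = 30 rather than 45, since the recursive version at 45 takes several seconds):</p>

```javascript
// Recursive version: O(2^n) calls, same shape as the problematic endpoint.
function fibonacciRecursive(n) {
  if (n <= 1) return n;
  return fibonacciRecursive(n - 1) + fibonacciRecursive(n - 2);
}

// Iterative version: O(n) loop steps, same shape as the optimized endpoint.
function fibonacciIterative(n) {
  if (n <= 1) return n;
  let prev = 0, curr = 1;
  for (let i = 2; i <= n; i++) {
    const next = prev + curr;
    prev = curr;
    curr = next;
  }
  return curr;
}

console.time('recursive');
const a = fibonacciRecursive(30);
console.timeEnd('recursive');

console.time('iterative');
const b = fibonacciIterative(30);
console.timeEnd('iterative');

console.log(a === b); // true: both implementations agree on the result
```

<p>The timings alone make the complexity difference obvious; the profiler then tells you <em>where</em> in a real application that time is spent.</p>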
<p>To start the profiling session in VS Code, you should follow these steps:</p>
<ol>
<li>Open the Debugger tab, typically located on the left panel.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727879834487/20f2019a-7a32-4259-a787-643b083cad90.png" alt="Debugger view" class="image--center mx-auto" /></p>
<ol start="2">
<li><p>Choose the script you want to execute in the "Run and Debug" section.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727879869222/5fae0e0e-8606-427f-85b7-06ac85688eae.png" alt="Example of how to select debugger runner" class="image--center mx-auto" /></p>
</li>
<li><p>Navigate to the "Call stack" section and click the "Take performance profile" button.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727879896672/5fae0e0e-8606-427f-85b7-06ac85688eae.png" alt="Take performance profile button in the call stack section" class="image--center mx-auto" /></p>
</li>
<li><p>Choose the appropriate profiling option. For the CPU-intensive endpoint, we select "CPU profile".</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727879990734/b2b26a51-2441-44e3-afe9-0d2f8164965f.png" alt="Profiling option in VS Code debugger session" class="image--center mx-auto" /></p>
</li>
<li><p>Choose the run option. For simplicity, we'll use the "Manual" option.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727880082944/417eb16c-3ff6-4502-b359-592a582ac344.png" alt="VS Code profiler run options" class="image--center mx-auto" /></p>
</li>
</ol>
<p>After going through all these steps, we're ready to start profiling.</p>
<p>Since the Node.js server is already running, we only have to send a request to the CPU-intensive endpoint. We'll start with the implementation that consumes a lot of CPU resources and see if we can identify the problem just by looking at the profiling report.</p>
<p>Here are the profiler entries after sending the request:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727880159629/45097704-053f-411f-9508-46259db88e33.png" alt class="image--center mx-auto" /></p>
<p>As you can see, it is pretty easy to identify the bottleneck: it is the <code>fibonacciRecursive</code> function.</p>
<p>The results are presented in the table view. If you prefer a visual representation, you can switch to the flamegraph by clicking the flame button.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727880600909/0417e880-6060-4c8a-9ae4-dca9bd52a03b.png" alt="VS Code profiler flamegraph button" class="image--center mx-auto" /></p>
<p>Remember, this button is only available after installing the <a target="_blank" href="https://marketplace.visualstudio.com/items?itemName=ms-vscode.vscode-js-profile-flame">flamegraph extension</a>.</p>
<p>After clicking this button, you will see a flamegraph with a timeline of the profiled entries.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727880674750/4878885c-7a2e-4bae-8e17-cf6a9567e9e4.png" alt="VS Code profiler flamegraph view" class="image--center mx-auto" /></p>
<p>Now that we've identified the problem, let's replace the recursive implementation with an iterative one and profile the improved version of the CPU-intensive endpoint.</p>
<p>After sending the request to the endpoint with the improved fibonacci function, we see the following results:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727925146161/52d4f1a6-86cb-4e52-a18c-2d53d579875e.png" alt="Profiling results of improved CPU-intensive endpoint in table view" class="image--center mx-auto" /></p>
<p>The fibonacci function is not even close to the top 10 profiling entries. If you open the same profiling report in the flamegraph view, you'll see that it now takes less than 10ms to execute.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727925182974/53e7e0eb-7037-4e05-9c53-89ce2ed76832.png" alt="Profiling results of improved CPU-intensive endpoint in flamegraph view" class="image--center mx-auto" /></p>
<p>Compare these 10ms of execution time to the previous 6.5 seconds. We can clearly see the performance gains.</p>
<h3 id="heading-async-endpoint">Async endpoint</h3>
<p>Next, let's explore how to use VS Code's profiler to identify and resolve issues with asynchronous code execution.</p>
<p>Here's an asynchronous function that simulates a time-consuming operation:</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateAsyncOperation</span>(<span class="hljs-params"></span>) </span>{
 <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> {
   <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {
     <span class="hljs-comment">// Simulate a time-consuming asynchronous operation</span>
     <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">50000000</span>; i++) { }
     resolve();
   }, <span class="hljs-number">1000</span>);
  });
}
</code></pre>
<p>For the sake of the example, we're running the <code>for</code> loop inside of the <code>setTimeout</code> callback just to make things easier to see in the profiler report.</p>
<p>The first implementation of the asynchronous endpoint suffers from the waterfall problem, where independent asynchronous functions are executed sequentially, causing delays as each function waits for the previous one to complete.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 cb();
}
</code></pre>
<p>Since these asynchronous functions are independent, we don't need to run them sequentially. Instead, we can run them concurrently.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> <span class="hljs-built_in">Promise</span>.all(<span class="hljs-keyword">new</span> <span class="hljs-built_in">Array</span>(<span class="hljs-number">3</span>).fill().map(<span class="hljs-function">() =&gt;</span> generateAsyncOperation()));
 cb();
}
</code></pre>
<p>By using <code>Promise.all</code> we can run all 3 functions simultaneously.</p>
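<p>The speedup is easy to measure with plain timers. Here is a sketch with a hypothetical <code>delay</code> helper standing in for <code>generateAsyncOperation</code> (delays shortened from 1 second to 50ms to keep the demo quick):</p>

```javascript
// Hypothetical stand-in for generateAsyncOperation with a shorter delay.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Sequential: total time is roughly the sum of all delays.
let start = Date.now();
await delay(50);
await delay(50);
await delay(50);
const sequentialMs = Date.now() - start; // ~150ms

// Concurrent: total time is roughly the single longest delay.
start = Date.now();
await Promise.all([delay(50), delay(50), delay(50)]);
const parallelMs = Date.now() - start; // ~50ms

console.log({ sequentialMs, parallelMs });
```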
<p>Now that we've explored the code, let's see how profiling can help us identify and address the waterfall problem. You start the debugging session the same way we did for the CPU-intensive endpoint:</p>
<ol>
<li><p>Open the Debugger tab</p>
</li>
<li><p>In the "Run and Debug" section, choose the script you want to execute.</p>
</li>
<li><p>Navigate to the "Call stack" section and click the "Take performance profile" button.</p>
</li>
<li><p><strong>Select the "CPU profile" as a profiling option.</strong></p>
</li>
<li><p>Choose the "Manual" run option.</p>
</li>
</ol>
<p>Let's start by profiling the asynchronous endpoint with the sequential implementation. After sending a request and generating the profiling report, we see the following picture:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727925491436/bba6bcff-489f-4e67-b765-70fe13dca905.png" alt="VS Code profiling of sequential asynchronous endpoint with sequential execution in a flamegraph view" class="image--center mx-auto" /></p>
<p>Notice those 3 pink entries on the flamegraph. Each one of those entries represents the execution of the <code>generateAsyncOperation</code> function.</p>
<p>The time span from the first entry to the last one is almost 2 seconds. Only after the last entry completes its execution can we get the response from the server.</p>
<p>After identifying the problem we can replace the sequential implementation with the optimized parallel version.</p>
<p>When you finish profiling the new endpoint implementation, the generated profiling report will surprise you.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727925694129/57b89233-f478-40cc-8ef1-bdaea26a675a.png" alt="VS Code profiling of asynchronous endpoint with parallel execution in a flamegraph view" class="image--center mx-auto" /></p>
<p>Instead of 3 distinct entries, there is only 1, representing the concurrent execution of all three asynchronous operations. It takes less than 100ms to complete the request and return the result.</p>
<h3 id="heading-memory-leak-endpoint">Memory leak endpoint</h3>
<p>The last type of problem we'll look into is memory leaks.</p>
<p>Here is what code containing a memory leak looks like:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> memoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Map</span>();

<span class="hljs-comment">// Function with a memory leak</span>
<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };
   memoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }
 cb();
}
</code></pre>
<p>In this case, we assume that data from one request is not required for subsequent requests. Therefore, if any data persists between requests, we consider it a memory leak.</p>
<p>Here's a modified version of the code without the memory leak:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> smartMemoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">WeakMap</span>();

<span class="hljs-comment">// Function without a memory leak</span>
<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };
   smartMemoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }
 cb();
}
</code></pre>
<p>The main difference between these 2 implementations is the data structure used to store the objects. The <code>Map</code> in the first example keeps strong references to the objects, preventing garbage collection even when the objects are no longer needed.</p>
<p>In contrast, the <code>WeakMap</code> in the second example uses weak references, making objects available for garbage collection once they are no longer referenced anywhere except the <code>WeakMap</code> itself.</p>
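<p>The difference in reference semantics can be sketched in a few lines (illustrative only; actual collection timing is up to the garbage collector):</p>

```javascript
const strongStore = new Map();
const weakStore = new WeakMap();

let person = { name: 'Person number 0', age: 0 };
strongStore.set(person, 'payload');
weakStore.set(person, 'payload');

// Map tracks and retains its entries, so it even exposes a size.
console.log(strongStore.size); // 1
// WeakMap has no size and is not iterable, precisely because its
// entries may disappear whenever the GC reclaims their keys.
console.log('size' in WeakMap.prototype); // false

person = null; // drop the only strong reference outside the stores
// strongStore still pins the object in memory; weakStore no longer does,
// so the GC is free to reclaim both the key and its value.
console.log(strongStore.size); // still 1
```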
<p>We're ready to start profiling. The steps are the same, except instead of "CPU profile" we use the "Heap Profile" option.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727926304176/ce46beb3-b7ee-45a6-93ab-b6d8589e96ae.png" alt="VS Code Heap Profile option from the profiling window" class="image--center mx-auto" /></p>
<p>To clearly demonstrate the memory leak, let's send 4 requests to both endpoint implementations and compare the results.</p>
<p>Here is the profiling report after sending 4 requests to the endpoint that uses <code>Map</code> to store objects.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727926601885/8b484ec1-ef87-4658-953d-fbcc7d947c5c.png" alt class="image--center mx-auto" /></p>
<p>After the 4 requests, the program using <code>Map</code> occupies around 6 MB of memory. While it might not seem like much, let's compare it to the implementation that uses <code>WeakMap</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727926625806/5410f935-41e8-4949-84ac-697257a1c4d6.png" alt="Result of profiling endpoint without memory leak using VS Code heap profiler" class="image--center mx-auto" /></p>
<p>The program that uses <code>WeakMap</code> occupies only 350 KB of memory after 4 requests, less than half a megabyte. That is roughly 16 times less memory.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>While VS Code might not have all the advanced features of dedicated profiling tools like Chrome DevTools, and is not as smart as Clinic.js, it is still a solid option for profiling your Node.js applications.</p>
<p>Especially because you don't need to download any external libraries or connect to external tools and systems. Everything just works within your coding environment, so you stay focused on the goal.</p>
<p>If you want to learn more about other profiling options, I highly recommend reading the previous articles about <a target="_blank" href="https://pavel-romanov.com/how-to-profile-nodejs-apps-using-chrome-devtools">how to profile Node.js apps using Chrome DevTools</a> and <a target="_blank" href="https://pavel-romanov.com/optimizing-nodejs-identifying-and-fixing-performance-problems-with-clinic">how smart Clinic.js can help you understand profiling reports better</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Building Semaphore and Mutex in Node.js]]></title><description><![CDATA[In the previous article, we talked about Atomics in Node.js and the problems they solve in multithreaded programs. While Atomics API is powerful, it is not always convenient to work with it simply because it is just too low level.
Other programming l...]]></description><link>https://pavel-romanov.com/building-semaphore-and-mutex-in-nodejs</link><guid isPermaLink="true">https://pavel-romanov.com/building-semaphore-and-mutex-in-nodejs</guid><category><![CDATA[JavaScript]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Fri, 06 Sep 2024 16:35:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725638459862/ecffbb0f-db7d-4af9-be79-d1b02b6ee31d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the previous article, we talked about <a target="_blank" href="https://pavel-romanov.com/multithreading-in-nodejs-using-atomics-for-safe-shared-memory-operations">Atomics in Node.js</a> and the problems they solve in multithreaded programs. While <code>Atomics</code> API is powerful, it is not always convenient to work with it simply because it is just too low level.</p>
<p>Other programming languages like Golang have higher-level synchronization primitives built into their standard libraries to control access to shared resources between multiple threads.</p>
<p>Although JavaScript doesn't have any such primitives except <code>Atomics</code>, we can build them from scratch based on what we see in other languages.</p>
<p>In this article, we'll do exactly that. We'll implement two of the most popular synchronization primitives: Semaphore and Mutex. We'll explore the differences between these primitives and discuss when to use each one.</p>
<h2 id="heading-understanding-what-critical-section-is">Understanding what critical section is</h2>
<p>One of the main problems in multithreaded programs is shared resources that can be directly accessed from multiple threads. This is where the concept of a <em>critical section</em> becomes increasingly important.</p>
<p>Any code that accesses the shared resources is considered a critical section. Note that not only write operations but even read operations can be part of a critical section.</p>
<p>It is important to clearly identify a critical section to write code that runs safely in a multithreaded environment.</p>
<p>Let's look at a code example from the previous article about <code>Atomics</code> in Node.js.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { isMainThread, Worker, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">10</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {

  <span class="hljs-comment">// Critical section</span>
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
  <span class="hljs-keyword">const</span> value = Atomics.store(typedArray, <span class="hljs-number">0</span>, threadId);
  <span class="hljs-built_in">console</span>.dir({ threadId, value });
  <span class="hljs-comment">// End of critical section</span>
}
</code></pre>
<p>In this example, we can consider the whole <code>else</code> block as a critical section, and here is why:</p>
<ol>
<li><p>We create a view into the shared resources, which is basically an indicator of intention to interact with the shared resources. If you're not familiar with views, I recommend reading one of the previous articles about <a target="_blank" href="https://pavel-romanov.com/uint8array-vs-dataview-choosing-the-right-buffer-view-in-javascript">what buffer views are and how to use them in JavaScript</a>.</p>
</li>
<li><p>We directly <strong>mutate</strong> the shared buffer using <code>Atomics.store</code> function.</p>
</li>
<li><p>We <strong>read</strong> from the buffer and print this value into the console.</p>
</li>
</ol>
<p>If we're not modifying the shared buffer inside a worker thread, then it is not considered a critical section, simply because we cannot run into race conditions with such logic.</p>
<p>I think you'll agree with me that it is kind of hard to track such critical sections simply by the type of operations we perform on a specific set of resources.</p>
<p>In other languages, it is fairly easy to mark some sections as critical using thread locks. Let's see how it works and compare it to Node.js.</p>
<h2 id="heading-compare-thread-locks-from-golang-with-nodejs">Compare thread locks from Golang with Node.js</h2>
<p>The problem of access to shared resources across multiple execution contexts is not new, and other languages and platforms have their own way of dealing with this problem. Let's look at an example of how we can do it in <em>Golang</em>.</p>
<h3 id="heading-thread-locks-in-golang">Thread locks in Golang</h3>
<p>Here's an example of how Golang uses a mutex to protect shared resources:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> SafeCounter <span class="hljs-keyword">struct</span> {
    mu sync.Mutex
    v  <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">int</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(c *SafeCounter)</span> <span class="hljs-title">Inc</span><span class="hljs-params">(key <span class="hljs-keyword">string</span>)</span></span> {
    c.mu.Lock()
    <span class="hljs-comment">// Lock so only one goroutine at a time can access the map c.v.</span>
    c.v[key]++
    c.mu.Unlock()
}
</code></pre>
<p>Even if you're not familiar with Golang, the logic should be pretty clear:</p>
<ol>
<li><p><code>c.mu.Lock()</code> marks the beginning of a critical section</p>
</li>
<li><p><code>c.v[key]++</code> is the actual operation over the shared resources</p>
</li>
<li><p><code>c.mu.Unlock()</code> marks the end of a critical section</p>
</li>
</ol>
<p>When you look at this code, it is fairly easy to identify where operations over shared resources are taking place. You don't have to think about what kind of operations you're running and over what resources.</p>
<h3 id="heading-atomics-in-nodejs">Atomics in Node.js</h3>
<p>It is not possible to create such locks in Node.js out of the box, simply because of how JavaScript and Node.js are designed. The closest thing we can get is to mark such sections with <code>Atomics</code>.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { isMainThread, Worker, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">12</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(workerData);
  <span class="hljs-keyword">const</span> itemIndex = <span class="hljs-number">0</span>;

  <span class="hljs-keyword">if</span> (threadId !== <span class="hljs-number">1</span>) {
    Atomics.wait(typedArray, itemIndex, <span class="hljs-number">0</span>);
  }

  <span class="hljs-keyword">const</span> value = Atomics.store(typedArray, itemIndex, threadId);

  Atomics.notify(typedArray, itemIndex);
  <span class="hljs-built_in">console</span>.dir({ threadId, value });
}
</code></pre>
<p>As you can see, it is not even close to Golang. There are simply too many things you have to be aware of to make things work properly in a very basic use case.</p>
<ol>
<li><p>Use <code>Atomics.wait</code> every time you need to stop a thread's execution until some condition has been met.</p>
</li>
<li><p>Create <code>Int32Array</code> to manage concurrent access to shared resources.</p>
</li>
<li><p>Call <code>Atomics.notify</code> to let other sleeping threads know they can proceed with execution.</p>
</li>
</ol>
<p>On the one hand, it might look like a reasonable limitation for a high-level language that runs in an end-user environment such as the browser.</p>
<p>However, when comparing such an API with what Golang or other languages have to offer, it is not nearly as convenient as it could be.</p>
<p>We can fix that by building abstractions similar to the ones found in other languages ourselves. While they may not be 100% perfect because of the language and platform limitations, we can get pretty close to what we see elsewhere.</p>
<p>That said, let's move forward and implement 2 of the most popular synchronization primitives: Semaphore and Mutex.</p>
<h2 id="heading-building-semaphore-and-mutex">Building Semaphore and Mutex</h2>
<p>Both semaphores and mutexes are meant to solve the same problem: limiting access to shared resources. At the same time, there are nuances in how they do so, which we'll dive into next.</p>
<h3 id="heading-semaphore">Semaphore</h3>
<p>In simple words, a semaphore is an abstraction that allows <strong>multiple</strong> threads to work with the same shared resources. Semaphores are based on a counting mechanism: you specify how many threads are allowed to access the shared resources.</p>
<p>It could be as little as 1 or as many as you want. A semaphore where only 1 thread is allowed to access the resources is called a binary semaphore.</p>
<p>Let's walk through the creation of a semaphore step by step to better understand how it works. First, we create the class.</p>
<pre><code class="lang-javascript"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Semaphore</span> </span>{
  <span class="hljs-keyword">constructor</span>(buffer, maxCount) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);
  }
}
</code></pre>
<p>We want to pass only 2 arguments: an instance of <code>SharedArrayBuffer</code> and a number that limits how many threads can access the buffer at the same time.</p>
<p>Here is how we'll use the <code>Semaphore</code> class:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Worker, isMainThread, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">12</span>);
  <span class="hljs-keyword">const</span> semaphore = <span class="hljs-keyword">new</span> Semaphore(buffer, <span class="hljs-number">5</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> semaphore = <span class="hljs-keyword">new</span> Semaphore(workerData);

  <span class="hljs-comment">// Rest of the code</span>
}
</code></pre>
<p>We create a <code>Semaphore</code> instance in every thread simply because there is no way to share the same object reference across multiple threads by design.</p>
<p>The difference is in how we create the <code>Semaphore</code> instance. We only pass the <code>maxCount</code> argument in the main thread because only the main thread should dictate how many threads can access the underlying buffer.</p>
<p>The next step is to set up the shared counter:</p>
<pre><code class="lang-javascript"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Semaphore</span> </span>{
  <span class="hljs-keyword">constructor</span>(buffer, maxCount) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);

    <span class="hljs-keyword">if</span> (maxCount !== <span class="hljs-literal">undefined</span>) {
      Atomics.store(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, maxCount);
    }
  }
}
</code></pre>
<p>The idea here is to have the first element of the array as some sort of counter. Whenever a new thread comes in, we want to decrement this counter. When this thread releases the semaphore, we want to increment this counter.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725639172643/48387acf-ef49-4baa-8e95-23510ff4ae43.jpeg" alt class="image--center mx-auto" /></p>
<p>It is important to keep this counter inside the buffer (here, as its first element) because the buffer is the only thing shared between all threads: changes made in one thread are visible in the others.</p>
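<p>The counter mechanics can be tried out in isolation, without any workers. Here is a minimal sketch of the idea; the decrement and increment are exactly what <code>acquire</code> and <code>release</code> will do later:</p>

```javascript
// Element 0 of the shared view acts as the number of free "slots".
const counter = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(counter, 0, 3); // allow up to 3 threads

const before = Atomics.sub(counter, 0, 1); // a thread comes in; sub() returns the old value
const during = Atomics.load(counter, 0);   // 2 slots left while the thread holds one
Atomics.add(counter, 0, 1);                // the thread releases its slot

console.log({ before, during, after: Atomics.load(counter, 0) }); // { before: 3, during: 2, after: 3 }
```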
<p>It is time to implement the <code>acquire</code> method. When using this method, the shared counter should be decremented. If the number of threads that can access the shared resources has reached the limit, we want to completely stop the thread execution until the semaphore signals that we can move forward.</p>
<pre><code class="lang-javascript"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Semaphore</span> </span>{
  <span class="hljs-keyword">constructor</span>(buffer, maxCount) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);

    <span class="hljs-keyword">if</span> (maxCount !== <span class="hljs-literal">undefined</span>) {
      Atomics.store(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, maxCount);
    }
  }

  acquire() {
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {
      <span class="hljs-keyword">const</span> value = Atomics.load(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>);
      <span class="hljs-keyword">if</span> (value === <span class="hljs-number">0</span>) {
        Atomics.wait(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>);
        <span class="hljs-keyword">continue</span>;
      }
      <span class="hljs-keyword">if</span> (Atomics.compareExchange(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, value, value - <span class="hljs-number">1</span>) === value) {
        <span class="hljs-keyword">return</span>;
      }
    }
  }
}
</code></pre>
<p>Let's walk through the implementation step by step and understand how it works.</p>
<p>We'll leave the <code>while</code> loop for the end.</p>
<p>After we enter the loop, the first thing we do is load the current counter using <code>Atomics.load</code> function.</p>
<p>If the counter is <code>0</code>, it means there is no room for one more thread to access the resources. Therefore, we want to wait until the counter holds a value higher than <code>0</code>. We do so by using the <code>Atomics.wait</code> function.</p>
<p>If the value is not <code>0</code>, we want to ensure that the value we loaded with <code>Atomics.load</code> is still the value the counter holds.</p>
<p>Using <code>Atomics.compareExchange</code>, we only store the new value <code>value - 1</code> at the given index if the expected value <code>value</code> equals the value in the counter at the moment of the call.</p>
<p>The reason we do so is that multiple threads can go through the same steps simultaneously: they can enter the function at the same time, load the value at the same time, and call <code>compareExchange</code> at the same time.</p>
<pre><code class="lang-javascript">  <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">acquire</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {

      <span class="hljs-comment">// Multiple threads can load the</span>
      <span class="hljs-comment">// current counter at the same time</span>

      <span class="hljs-keyword">const</span> value = Atomics.load(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>);

      <span class="hljs-comment">// Those multiple threads can then run</span>
      <span class="hljs-comment">// this check at the same time and pass it</span>

      <span class="hljs-keyword">if</span> (value === <span class="hljs-number">0</span>) {
        Atomics.wait(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>);
        <span class="hljs-keyword">continue</span>;
      }

      <span class="hljs-comment">// However, only one of those threads can actually set</span>
      <span class="hljs-comment">// the value using `compareExchange` at a time</span>

      <span class="hljs-keyword">if</span> (Atomics.compareExchange(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, value, value - <span class="hljs-number">1</span>) === value) {
        <span class="hljs-keyword">return</span>;
      }
    }
  }
</code></pre>
<p>Even when multiple threads are running <code>compareExchange</code>, only one of them can actually write the value. After the first thread is done writing, all the others will fail the condition in the <code>if</code> block.</p>
<p>That is where the <code>while</code> loop comes into play: each thread repeats the whole sequence until its own <code>compareExchange</code> call succeeds and it exits the <code>acquire</code> function.</p>
<p>It might sound a bit strange that we need to run an infinite loop to keep things working, but threads spend most of their time in sleep mode, so there is no heavy load on the CPU.</p>
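<p>If you want to convince yourself of the <code>compareExchange</code> semantics we rely on, you can check them in a single thread. The call returns the value that was actually stored at the index, and it only performs the swap when that value matches the expected one:</p>

```javascript
const typedArray = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(typedArray, 0, 5);

// The expected value (5) matches the real one: the swap happens,
// and the old value is returned.
const first = Atomics.compareExchange(typedArray, 0, 5, 4);

// The expected value (5) is now stale (the counter holds 4):
// nothing changes, and the real value is returned instead.
const second = Atomics.compareExchange(typedArray, 0, 5, 3);

console.log({ first, second, value: Atomics.load(typedArray, 0) }); // { first: 5, second: 4, value: 4 }
```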
<p>Coming back to the implementation, the last step is to implement the <code>release</code> method.</p>
<pre><code class="lang-javascript"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Semaphore</span> </span>{
  <span class="hljs-keyword">constructor</span>(buffer, maxCount) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);

    <span class="hljs-keyword">if</span> (maxCount !== <span class="hljs-literal">undefined</span>) {
      Atomics.store(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, maxCount);
    }
  }
  }

  acquire() {
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {
      <span class="hljs-keyword">const</span> value = Atomics.load(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>);
      <span class="hljs-keyword">if</span> (value === <span class="hljs-number">0</span>) {
        Atomics.wait(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>);
        <span class="hljs-keyword">continue</span>;
      }
      <span class="hljs-keyword">if</span> (Atomics.compareExchange(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, value, value - <span class="hljs-number">1</span>) === value) {
        <span class="hljs-keyword">return</span>;
      }
    }
  }

  release() {
    Atomics.add(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>);
    Atomics.notify(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>);
  }
}
</code></pre>
<p>Whenever the <code>release</code> method is called, we increment the shared counter by <code>1</code> and notify only a <strong>single</strong> sleeping thread that the value has changed and it can try to access the shared resources.</p>
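<p>Both calls in <code>release</code> return values that make them easy to check in isolation: <code>Atomics.add</code> returns the counter's previous value, and <code>Atomics.notify</code> returns how many sleeping threads were actually woken up (zero here, since nothing is waiting on the index):</p>

```javascript
const counter = new Int32Array(new SharedArrayBuffer(4));

const previous = Atomics.add(counter, 0, 1); // increments and returns the old value
const woken = Atomics.notify(counter, 0, 1); // no threads are waiting on this index

console.log({ previous, current: Atomics.load(counter, 0), woken }); // { previous: 0, current: 1, woken: 0 }
```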
<p>The semaphore is completed, and we can now use it.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { isMainThread, Worker, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">12</span>);
  <span class="hljs-keyword">const</span> semaphore = <span class="hljs-keyword">new</span> Semaphore(buffer, <span class="hljs-number">1</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
  <span class="hljs-keyword">const</span> semaphore = <span class="hljs-keyword">new</span> Semaphore(workerData);

  <span class="hljs-comment">// Critical section</span>
  semaphore.acquire();
  typedArray[<span class="hljs-number">4</span>] = threadId;
  <span class="hljs-built_in">console</span>.dir({ threadId, <span class="hljs-attr">value</span>: typedArray[<span class="hljs-number">4</span>] });
  semaphore.release();
  <span class="hljs-comment">// End of critical section</span>
}
</code></pre>
<p>Notice how we're using it. It is almost the same API as we've seen in Golang: just by calling the <code>acquire</code> and <code>release</code> functions, we can clearly mark the boundaries of a critical section.</p>
<p>And the best part is that we don't even need to use <code>Atomics</code> explicitly. <code>Semaphore</code> guarantees a thread-safe environment between the <code>acquire</code> and <code>release</code> calls.</p>
<p>Of course, if we increase the maximum number of threads that can operate over the resources from 1 to 2 or more, then we're still prone to race conditions inside the critical section.</p>
<h3 id="heading-mutex">Mutex</h3>
<p>Mutex stands for "mutual exclusion". It is a synchronization mechanism that allows <strong>only one</strong> thread to access shared resources at a time, unlike a semaphore, where we can configure the number of threads that have access to the shared resources.</p>
<p>Another feature of a mutex is ownership. If a thread locks a mutex, this thread becomes the owner of the shared resources, and only this thread can unblock other threads and make the shared resources available again.</p>
<p>When working with semaphores, it is not important who unblocks the locked resources.</p>
<p>The last thing to mention is that a mutex is generally easier to manage because it always deals with a binary state. On the other hand, the more threads we allow a semaphore to admit, the more complexity we introduce into the system.</p>
<p>Since mutex and semaphore share a similar goal - to limit access to the shared resources - their implementations are quite similar, except for the details we just mentioned.</p>
<p>We'll start with the constructor.</p>
<pre><code class="lang-javascript"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Mutex</span> </span>{

  <span class="hljs-keyword">constructor</span>(buffer) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);
    <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">false</span>;
  }
}
</code></pre>
<p>We're not passing the <code>maxCount</code> argument because a mutex is always binary. We're also adding a new property, <code>isOwner</code>, to keep track of the mutex ownership.</p>
<p>The next step is to implement the <code>lock</code> method.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> unlocked = <span class="hljs-number">0</span>;
<span class="hljs-keyword">const</span> locked = <span class="hljs-number">1</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Mutex</span> </span>{

  <span class="hljs-keyword">constructor</span>(buffer) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);
    <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">false</span>;
  }

  lock() {
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {
      <span class="hljs-keyword">const</span> value = Atomics.load(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>);
      <span class="hljs-keyword">if</span> (value === locked) {
        Atomics.wait(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, locked);
        <span class="hljs-keyword">continue</span>;
      }
      <span class="hljs-keyword">if</span> (Atomics.compareExchange(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, unlocked, locked) === unlocked) {
        <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">true</span>;
        <span class="hljs-keyword">return</span>;
      }
    }
  }
}
</code></pre>
<p>It is pretty similar to the semaphore's implementation of the <code>acquire</code> method, except that we're setting the <code>isOwner</code> property to <code>true</code> and allowing only 2 possible states: <code>locked</code> and <code>unlocked</code>.</p>
<p>The final step is to implement the <code>unlock</code> method.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> unlocked = <span class="hljs-number">0</span>;
<span class="hljs-keyword">const</span> locked = <span class="hljs-number">1</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Mutex</span> </span>{

  <span class="hljs-keyword">constructor</span>(buffer) {
    <span class="hljs-built_in">this</span>.typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int32Array</span>(buffer);
    <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">false</span>;
  }

  lock() {
    <span class="hljs-keyword">while</span> (<span class="hljs-literal">true</span>) {
      <span class="hljs-keyword">const</span> value = Atomics.load(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>);
      <span class="hljs-keyword">if</span> (value === locked) {
        Atomics.wait(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, locked);
        <span class="hljs-keyword">continue</span>;
      }
      <span class="hljs-keyword">if</span> (Atomics.compareExchange(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, unlocked, locked) === unlocked) {
        <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">true</span>;
        <span class="hljs-keyword">return</span>;
      }
    }
  }

  unlock() {
    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">this</span>.isOwner) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Thread that tries to unlock the mutex is not the owner of the mutex'</span>);
    }
    Atomics.store(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, unlocked);
    Atomics.notify(<span class="hljs-built_in">this</span>.typedArray, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>);
    <span class="hljs-built_in">this</span>.isOwner = <span class="hljs-literal">false</span>;
  }
}
</code></pre>
<p>The crucial part here is to check whether the current thread is the owner of the mutex before unlocking it. We throw an error if that's not the case, because a thread is not allowed to unlock a mutex it doesn't own.</p>
<p>Here is how you use it:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { isMainThread, Worker, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">12</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
  <span class="hljs-keyword">const</span> mutex = <span class="hljs-keyword">new</span> Mutex(workerData);

  <span class="hljs-comment">// Critical section start</span>
  mutex.lock();
  typedArray[<span class="hljs-number">4</span>] = threadId;
  <span class="hljs-built_in">console</span>.dir({ threadId, <span class="hljs-attr">value</span>: typedArray[<span class="hljs-number">4</span>] });
  mutex.unlock();
  <span class="hljs-comment">// End of critical section</span>
}
</code></pre>
<p>Notice that we're not creating a mutex inside of the main thread as we did with the semaphore. Since a mutex has no <code>maxCount</code> to set up, there is nothing for the main thread to initialize.</p>
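<p>You can also see the ownership guard fire without spinning up any workers. Here is a single-threaded sketch (repeating the <code>Mutex</code> class from above so the snippet is self-contained): the first <code>unlock</code> succeeds, while the second one throws because the mutex is no longer owned.</p>

```javascript
const unlocked = 0;
const locked = 1;

// The Mutex class as implemented above.
class Mutex {
  constructor(buffer) {
    this.typedArray = new Int32Array(buffer);
    this.isOwner = false;
  }

  lock() {
    while (true) {
      const value = Atomics.load(this.typedArray, 0);
      if (value === locked) {
        Atomics.wait(this.typedArray, 0, locked);
        continue;
      }
      if (Atomics.compareExchange(this.typedArray, 0, unlocked, locked) === unlocked) {
        this.isOwner = true;
        return;
      }
    }
  }

  unlock() {
    if (!this.isOwner) {
      throw new Error('Thread that tries to unlock the mutex is not the owner of the mutex');
    }
    Atomics.store(this.typedArray, 0, unlocked);
    Atomics.notify(this.typedArray, 0, 1);
    this.isOwner = false;
  }
}

const mutex = new Mutex(new SharedArrayBuffer(4));

mutex.lock();   // the buffer is free, so we acquire it immediately
mutex.unlock(); // we own the mutex, so unlocking is allowed

let error;
try {
  mutex.unlock(); // we no longer own the mutex
} catch (err) {
  error = err;
}

console.log(error instanceof Error); // true
```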
<h2 id="heading-conclusion">Conclusion</h2>
<p>We've covered a lot in this article:</p>
<ul>
<li><p>What a critical section is and what it looks like in Node.js</p>
</li>
<li><p>How other programming languages work with critical sections and shared resources across multiple threads</p>
</li>
<li><p>Implemented custom <code>Semaphore</code> and <code>Mutex</code> classes in Node.js to abstract away the low-level <code>Atomics</code> API</p>
</li>
</ul>
<p>While things like <code>Semaphore</code> and <code>Mutex</code> definitely make our lives easier, they also create new problems like deadlocks and livelocks.</p>
<p>In the upcoming article, we'll see how we can run into common multithreading problems in Node.js and how to solve them.</p>
]]></content:encoded></item><item><title><![CDATA[Multithreading in Node.js: Using Atomics for Safe Shared Memory Operations]]></title><description><![CDATA[Node.js developers got too comfortable with a single thread where JavaScript is executed. Even with the introduction of multiple threads via worker_threads, you can feel pretty safe.
However, things change when you add shared resources to multiple th...]]></description><link>https://pavel-romanov.com/multithreading-in-nodejs-using-atomics-for-safe-shared-memory-operations</link><guid isPermaLink="true">https://pavel-romanov.com/multithreading-in-nodejs-using-atomics-for-safe-shared-memory-operations</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Wed, 28 Aug 2024 15:22:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1724857389715/897b6b43-cc5c-4d02-8844-73fea9b5211b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Node.js developers got too comfortable with a single thread where JavaScript is executed. Even with the introduction of multiple threads via <code>worker_threads</code>, you can feel pretty safe.</p>
<p>However, things change when you add shared resources to multiple threads. In fact, it is one of the most challenging topics in all of software engineering. I'm talking about multithreaded programming.</p>
<p>Thankfully, JavaScript provides a built-in abstraction to mitigate the problem of shared resources across multiple threads. This mechanism is called <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics">Atomics</a>.</p>
<p>In this article, you'll learn what shared resources look like in Node.js and how the <code>Atomics</code> API helps us prevent wild race conditions.</p>
<h2 id="heading-shared-memory-between-multiple-threads">Shared memory between multiple threads</h2>
<p>Let's start with understanding what transferable objects are.</p>
<p>Transferable objects are objects that can be transferred from one execution context to another without holding on to resources from the original context.</p>
<p>An execution context is a place where JavaScript code can be executed. To make it easier to understand, let's assume that an execution context is equal to a worker thread because each thread is indeed a separate execution context.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724857531837/ec0d2a29-e914-4fe3-9281-e718c24b548e.jpeg" alt class="image--center mx-auto" /></p>
<p>For example, <code>ArrayBuffer</code> is a transferable object. It consists of 2 parts: raw allocated memory and JavaScript handle to this memory. You can read the article about <a target="_blank" href="https://pavel-romanov.com/javascript-buffers-explained-why-they-matter-and-how-to-use-them">Buffers in JavaScript</a> to learn more about this topic.</p>
<p>Whenever we transfer an <code>ArrayBuffer</code> from the main thread to a worker thread, both components, the raw memory and the JavaScript object, are recreated in the worker thread. There is no way to access the same object reference or the underlying memory of the <code>ArrayBuffer</code> inside of the worker thread.</p>
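<p>You can observe this behavior directly with <code>structuredClone</code>, which uses the same cloning algorithm as <code>workerData</code> and <code>postMessage</code>. A plain clone leaves the original usable, while passing the buffer in the <code>transfer</code> list detaches it from the sending context (its <code>byteLength</code> drops to zero):</p>

```javascript
const buffer = new ArrayBuffer(8);

// Clone: the memory is duplicated, the original stays intact.
const copied = structuredClone(buffer);

// Transfer: the memory moves to the new object, the original is detached.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(copied.byteLength, moved.byteLength, buffer.byteLength); // 8 8 0
```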
<p>The only way to share resources between different threads is to use <code>SharedArrayBuffer</code>.</p>
<p>As the name suggests, it is designed to be shared. We consider this buffer to be a non-transferable object. If you try to pass <code>SharedArrayBuffer</code> from the main thread to a worker thread, only the JavaScript object gets recreated, but the memory region that it refers to stays the same.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724857675140/5551a60e-5325-4021-acbd-be8ac9e10a9e.jpeg" alt class="image--center mx-auto" /></p>
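<p>This sharing can be verified without workers, at least in Node.js: cloning a <code>SharedArrayBuffer</code> with <code>structuredClone</code> produces a new JavaScript object, yet a write through one handle is visible through the other, because both point at the same memory block:</p>

```javascript
const shared = new SharedArrayBuffer(4);
const clone = structuredClone(shared); // new JS object, same underlying memory

// Write through the clone's view...
new Int32Array(clone)[0] = 42;

// ...and the write is visible through the original handle.
console.log(clone === shared, new Int32Array(shared)[0]); // false 42
```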
<p>While <code>SharedArrayBuffer</code> is a unique and powerful API, it comes with a cost.</p>
<p>As Uncle Ben told us:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724857704372/30f7a382-c985-47a7-a1a3-1acc1c45b669.jpeg" alt class="image--center mx-auto" /></p>
<p>When we share resources between multiple threads, we expose ourselves to a whole new world of nasty race conditions.</p>
<h2 id="heading-race-conditions-for-shared-resources">Race conditions for shared resources</h2>
<p>It would be easier to understand what I'm talking about with a particular example.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Worker, isMainThread } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename);
} <span class="hljs-keyword">else</span> {
  <span class="hljs-comment">// worker code</span>
}
</code></pre>
<p>We're using the same file to run the main thread and the worker threads. The block under the <code>isMainThread</code> condition is executed only in the main thread. You might also notice <code>import.meta.filename</code>: it is the ESM alternative to the <code>__filename</code> variable, available since Node.js 20.11.0. Next, we introduce a shared resource and an operation over it.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Worker, isMainThread, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">1</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
  typedArray[<span class="hljs-number">0</span>] = threadId;
  <span class="hljs-built_in">console</span>.dir({ threadId, <span class="hljs-attr">value</span>: typedArray[<span class="hljs-number">0</span>] });
}
</code></pre>
<p>We pass <code>SharedArrayBuffer</code> to each of the workers as <code>workerData</code>. Both workers set the first element of the buffer to their ID. Then we log the first buffer element.</p>
<p>One of the workers will have an ID equal to <code>1</code> and the other to <code>2</code>. Without reading any further, what do you expect to see in the output when this code runs?</p>
<p>Here is the result.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># 1st type of result</span>
{ threadId: 1, value: 2 }
{ threadId: 2, value: 2 }

<span class="hljs-comment"># 2nd type of result</span>
{ threadId: 1, value: 1 }
{ threadId: 2, value: 1 }

<span class="hljs-comment"># 3rd type of result</span>
{ threadId: 1, value: 1 }
{ threadId: 2, value: 2 }
</code></pre>
<p>Did you notice it? Why on earth do we have cases where the value is the same for both threads? If you think about it from the standpoint of a single-threaded program, we should see <strong>different</strong> values printed every time.</p>
<p>Even if we run this code asynchronously in a single thread, the only thing that could be possibly different is the order in which a result is printed, but not such a drastic difference in the final value.</p>
<p>What happens here is one of the threads assigns value right between these two lines:</p>
<pre><code class="lang-javascript">  typedArray[<span class="hljs-number">0</span>] = threadId;

  <span class="hljs-comment">// one of the threads sneaks right in here and assign value</span>

  <span class="hljs-built_in">console</span>.dir({ threadId, <span class="hljs-attr">value</span>: typedArray[<span class="hljs-number">0</span>] });
</code></pre>
<p>It goes like this:</p>
<ol>
<li><p>The first thread assigns a value to the shared buffer</p>
</li>
<li><p>The second thread assigns a value to the shared buffer</p>
</li>
<li><p>The first thread prints the result to the console</p>
</li>
<li><p>The second thread prints the result to the console</p>
</li>
</ol>
<p>As you can see, it is easy to run into a race condition with as little as 10 lines of code when we have shared resources and multiple threads. That's why we need a mechanism that can make sure that one worker is not interrupting the workflow of another worker. The <code>Atomics</code> API was created exactly for this purpose.</p>
<h2 id="heading-atomics">Atomics</h2>
<p>I want to emphasize that using <code>Atomics</code> is the <strong>only possible way</strong> to be 100% sure that you're not running into race conditions when dealing with multiple threads and shared resources between them.</p>
<p>The main purpose of <code>Atomics</code> is to make sure that a single operation is performed as a single, uninterruptible unit. In other words, it ensures that no other worker can get in the middle of the currently executing operation and do its stuff, like we've seen before.</p>
<p>Let's rewrite the example with race conditions using <code>Atomics</code>.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { Worker, isMainThread, workerData, threadId } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:worker_threads'</span>;

<span class="hljs-keyword">if</span> (isMainThread) {
  <span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> SharedArrayBuffer(<span class="hljs-number">1</span>);
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
  <span class="hljs-keyword">new</span> Worker(<span class="hljs-keyword">import</span>.meta.filename, { <span class="hljs-attr">workerData</span>: buffer });
} <span class="hljs-keyword">else</span> {
  <span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
  <span class="hljs-keyword">const</span> value = Atomics.store(typedArray, <span class="hljs-number">0</span>, threadId);
  <span class="hljs-built_in">console</span>.dir({ threadId, value });
}
</code></pre>
<p>We changed two things: how we save the value and how we read the saved value. Using <code>Atomics</code>, we can do both at once: the <code>store</code> function writes the value and returns it as a single atomic operation.</p>
<p>When you run this code, you won't see a case where both threads have the same value. They are always different.</p>
<pre><code class="lang-bash">{ threadId: 1, value: 1 }
{ threadId: 2, value: 2 }

{ threadId: 2, value: 2 }
{ threadId: 1, value: 1 }
</code></pre>
<p>We could use 2 operations instead of 1: <code>store</code> and <code>load</code>.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>(workerData);
Atomics.store(typedArray, <span class="hljs-number">0</span>, threadId);
<span class="hljs-keyword">const</span> value = Atomics.load(typedArray, <span class="hljs-number">0</span>);
<span class="hljs-built_in">console</span>.dir({ threadId, value });
</code></pre>
<p>However, this approach is still prone to race conditions. The whole point of using <code>Atomics</code> is to make our operations <em>atomic</em>.</p>
<p>In this case, we want 2 operations to be executed as a single atomic operation: to save a value and to read this value. When we use the <code>store</code> and <code>load</code> functions, we're actually doing 2 separate atomic operations, not 1.</p>
<p>That's why it is still possible to run into a race condition where code from one worker gets in between the <code>store</code> and <code>load</code> calls of another.</p>
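<p>For cases like this, <code>Atomics</code> ships combined read-modify-write operations such as <code>add</code>, <code>sub</code>, and <code>compareExchange</code> that perform the read and the write as one uninterruptible unit. Here is a minimal single-threaded sketch (no workers, just to show the return-value semantics) of <code>Atomics.add</code>:</p>

```javascript
// Atomics.add reads the old value and writes the new one as a single
// atomic step, so no other thread can sneak in between the two.
const sharedBuffer = new SharedArrayBuffer(4);
const counter = new Int32Array(sharedBuffer);

Atomics.store(counter, 0, 0);

// add() returns the value that was in the cell *before* the addition.
const previous = Atomics.add(counter, 0, 1);

console.log(previous);                 // 0
console.log(Atomics.load(counter, 0)); // 1
```

<p>In a multi-threaded counter, this guarantees every increment is observed exactly once, which a separate <code>load</code> followed by <code>store</code> cannot promise.</p>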
<p>There are more than just these 2 functions in <code>Atomics</code>. In the following article, we'll cover <a target="_blank" href="https://pavel-romanov.com/building-semaphore-and-mutex-in-nodejs">how to use more of its functions to build our own semaphore and mutex</a> to make working with shared resources even more convenient.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Node.js is all fun and games while there is only a single thread. If you introduce multiple threads and shared resources on top of it, you get an environment where race conditions are inevitable.</p>
<p>There is only one mechanism in JavaScript that allows you to mitigate these problems and avoid race conditions: it is called <code>Atomics</code>.</p>
<p>The idea of <code>Atomics</code> is to have operations that execute as a single unit that cannot be interrupted from the outside.</p>
<p>Thanks to such a design, we can be sure that whenever we use <code>Atomics</code> functions, there is no way for other threads to get in the middle of such operations.</p>
]]></content:encoded></item><item><title><![CDATA[Docker Desktop Free Alternatives for Mac and Windows]]></title><description><![CDATA[There are different reasons why people can't use Docker Desktop. It might be restricted by company policies or because it requires you to pay at some point in time (most likely the latter one).
Don't get me wrong, I'm not against paying for products ...]]></description><link>https://pavel-romanov.com/docker-desktop-free-alternatives-for-mac-and-windows</link><guid isPermaLink="true">https://pavel-romanov.com/docker-desktop-free-alternatives-for-mac-and-windows</guid><category><![CDATA[Node.js]]></category><category><![CDATA[Docker]]></category><category><![CDATA[podman]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Wed, 21 Aug 2024 11:33:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1724236258922/66c517ea-29ab-45a1-a2b7-3b33086190e5.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There are different reasons why people can't use Docker Desktop. It might be restricted by company policies or because it requires you to pay at some point in time (most likely the latter one).</p>
<p>Don't get me wrong, I'm not against paying for products and services, but sometimes you don't make the decision on whether to pay or not.</p>
<p>People using Linux are fine. However, it is not that easy for people using Mac and Windows to drop Docker Desktop. That's exactly the situation where I found myself recently. This article will show you what alternatives you might use to keep working with Docker on Mac and Windows without using Docker Desktop.</p>
<h2 id="heading-docker-desktop-is-more-than-just-ui">Docker Desktop is more than just UI</h2>
<p>First, I want to address a common misconception.</p>
<p>Even if you're not using UI, you might very well be using Docker Desktop. You see, the desktop bundle is more than just UI, and you start using it from the moment you accept the license agreement.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724236494412/fe62674a-eeaf-4a04-a078-73a247866073.png" alt="An example of what the Docker Desktop license agreement looks like" class="image--center mx-auto" /></p>
<p>The catch is that without accepting the agreement, you won't be able to work with Docker at all. You still have the CLI, which comes with the desktop bundle, but to make things work, you need the Docker daemon running.</p>
<p>If you try to run any command that somehow interacts with the daemon, you'll get the following error.</p>
<pre><code class="lang-bash">Cannot connect to the Docker daemon at unix:///path/to/docker/socket.
Is the Docker daemon running?
</code></pre>
<p>If you are running Docker containers without installing any extra tools and don't see such an error, it means you've accepted the service agreement. You're using Docker Desktop, even if you're not touching the UI.</p>
<h2 id="heading-how-does-docker-desktop-work">How does Docker Desktop work</h2>
<p>Before diving into Docker Desktop alternatives, let's first understand what Docker Desktop is and how it works.</p>
<p>On the official <a target="_blank" href="https://docs.docker.com/desktop/">Docker website</a>, you can find every piece of software that comes with the desktop bundle.</p>
<p>In this article, we'll mostly focus on <a target="_blank" href="https://docs.docker.com/engine/">Docker Engine</a>.</p>
<p>Docker Engine is the heart of Docker. It keeps the system up and running. The engine consists of 3 pieces:</p>
<ul>
<li><p>A long-running daemon process called <code>dockerd</code></p>
</li>
<li><p>API that other programs can use to interact with programs running in <code>dockerd</code></p>
</li>
<li><p>A command line interface (CLI) called <code>docker</code></p>
</li>
</ul>
<p>Notice that <a target="_blank" href="https://docs.docker.com/engine/install/#supported-platforms">Docker Engine only supports particular Linux distributions</a>. But how does it work on Mac and Windows then?</p>
<p>To make things work, Docker Desktop creates a virtual machine (VM) on your Mac or Windows machine. This VM is running a Linux distribution that Docker Engine supports. Inside this VM, Docker Engine runs the <code>dockerd</code> daemon.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724236848707/b121ef0a-27e2-487c-81df-2c91fabcc06d.jpeg" alt="Workflow of how Docker Desktop makes containers run on Mac and Windows machines" class="image--center mx-auto" /></p>
<p>It is important to understand how things get running with Docker Desktop, because all the alternatives we're about to look at do exactly the same thing.</p>
<p>They all create a VM and run things inside this VM. It is the universal approach to make Docker work on unsupported Linux distributions or non-Linux systems.</p>
<p>Now that you have some idea of how Docker Desktop works behind the scenes, we can move on to its alternatives.</p>
<h2 id="heading-colima-for-mac">Colima for Mac</h2>
<p>The first alternative on the list is <a target="_blank" href="https://github.com/abiosoft/colima">Colima</a>. It is a minimalistic container runtime for Mac and Linux. To install Colima on your Mac, run the following brew command.</p>
<pre><code class="lang-bash">brew install colima
</code></pre>
<p>To start running Colima, use the start command.</p>
<pre><code class="lang-bash">colima start
</code></pre>
<p>This command creates a Linux VM and makes it possible to run the <code>dockerd</code> daemon. The default configuration of the VM is as follows:</p>
<ul>
<li><p>Disk space: 60GB</p>
</li>
<li><p>CPU: 2 cores</p>
</li>
<li><p>Memory: 2GB</p>
</li>
</ul>
<p>If you want to change any of these, pass a dedicated option with the number that suits your needs.</p>
<pre><code class="lang-bash">colima start --disk 100 --cpu 4 --memory 6
</code></pre>
<p>This command creates a machine with 100GB of disk space, 4 CPU cores, and 6GB of RAM.</p>
<p>That was the hardest part. Now, you only need to install the Docker CLI and the Docker credential helper packages using brew.</p>
<pre><code class="lang-bash">brew install docker docker-credential-helper
</code></pre>
<p>Now you're ready to go.</p>
<h2 id="heading-podman-for-mac-and-windows">Podman for Mac and Windows</h2>
<p>Podman is <a target="_blank" href="https://www.redhat.com/en/topics/containers/what-is-podman">developed by Red Hat</a> and the open source community.</p>
<p>Podman is different from Colima:</p>
<ul>
<li><p>It works across all major operating systems: Windows, Mac, Linux</p>
</li>
<li><p>It is not only about containers. It goes beyond and provides a way to work with Kubernetes</p>
</li>
<li><p>Podman itself implements the Open Container Initiative (<a target="_blank" href="https://opencontainers.org/">OCI</a>) specifications</p>
</li>
</ul>
<p>At the same time, Podman is similar to Colima in terms of how it makes things work. When running Podman on Windows or Mac, it <a target="_blank" href="https://podman.io/docs/installation#installing-on-mac--windows">creates a VM</a> inside which the containers actually run.</p>
<p>If you like to work with GUI (I personally do), then you can install a dedicated <a target="_blank" href="https://podman-desktop.io/">desktop client</a>. It is super convenient to <a target="_blank" href="https://podman-desktop.io/docs/podman/creating-a-podman-machine">create VM</a> using it.</p>
<p>Since it implements the OCI specifications itself, is it possible to use it with Docker? The answer is a resounding yes. Podman <a target="_blank" href="https://podman.io/docs/installation#installing-on-mac--windows">listens</a> for Docker API clients, supporting direct usage of Docker-based tools.</p>
<p>Overall, Podman looks like a solid alternative to Docker. It is backed by a big company and an open-source community. It has a dedicated desktop client that is easy to work with. It implements the OCI specifications, which makes it possible to ignore Docker completely and move to Podman.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Some people think that if they are not using the UI provided by the desktop app, they are not using Docker Desktop, but this is far from true. The moment you accept the license agreement, you're using the desktop application.</p>
<p>There might be different reasons why you can't use Docker Desktop, but it doesn't mean you have no options. There are great tools that can replace Docker Desktop without any problems and make your experience with Docker seamless.</p>
<p>For Mac users, Colima is one option. It is lightweight, easy to install, and impressively easy to manage.</p>
<p>For Mac and Windows users, Podman is another great option. It is not only compatible with Docker and its API, but unlike Colima, it also provides a dedicated desktop client with GUI where you can manage images, containers, and everything that comes with it.</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Node.js Buffer]]></title><description><![CDATA[So far, we've become familiar with buffers, typed arrays, data views, and how they all work together. If you missed the previous articles, I highly recommend reading the one dedicated to buffers and the other one on views.
Node.js provides a dedicate...]]></description><link>https://pavel-romanov.com/understanding-nodejs-buffer</link><guid isPermaLink="true">https://pavel-romanov.com/understanding-nodejs-buffer</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Thu, 15 Aug 2024 15:59:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1723737520459/26b5a229-7cb1-42f7-a503-61de0e696ae9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>So far, we've become familiar with buffers, typed arrays, data views, and how they all work together. If you missed the previous articles, I highly recommend reading the one dedicated to <a target="_blank" href="https://pavel-romanov.com/javascript-buffers-explained-why-they-matter-and-how-to-use-them">buffers</a> and the other one on <a target="_blank" href="https://pavel-romanov.com/uint8array-vs-dataview-choosing-the-right-buffer-view-in-javascript">views</a>.</p>
<p>Node.js provides a dedicated buffer abstraction called <code>Buffer</code>. Why do we need more buffers when we already have <code>ArrayBuffer</code> and the different views that come with it?</p>
<p>In this article, we'll answer this question and understand the difference between Node.js buffer and all the others. In the end, you'll learn what problems the Node.js buffer has and why some people avoid using it at all costs.</p>
<h2 id="heading-difference-between-nodejs-buffer-and-typed-arrays">Difference between Node.js buffer and typed arrays</h2>
<p>In short, the Node.js buffer is basically a <code>Uint8Array</code> spiced up with some extra logic. It automatically means two things:</p>
<ol>
<li><p>Node.js buffer is not "actually" a buffer but a view into an underlying buffer.</p>
</li>
<li><p>Wherever you use <code>Uint8Array</code>, you can also use Node.js buffer, with minor exceptions that we'll discuss later in this article.</p>
</li>
</ol>
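<p>Both points are easy to verify. A minimal sketch, assuming Node.js:</p>

```javascript
import { Buffer } from 'node:buffer';

const buf = Buffer.from('abc');

console.log(buf instanceof Uint8Array);         // true: Buffer subclasses Uint8Array
console.log(ArrayBuffer.isView(buf));           // true: it is a view, not storage itself
console.log(buf.buffer instanceof ArrayBuffer); // true: the actual storage lives here
```

<p>Since every check passes, any API that accepts a <code>Uint8Array</code> will also accept a <code>Buffer</code>.</p>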
<p>One of the core abstractions responsible for buffers in Node.js is called <code>FastBuffer</code>. It is a class that extends <code>Uint8Array</code>. You can <a target="_blank" href="https://github.com/nodejs/node/blob/d9554978740a75d7150d9b58d232a1de6b88f93c/lib/internal/buffer.js#L956">see it</a> for yourself.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">class</span> FastBuffer <span class="hljs-keyword">extends</span> <span class="hljs-built_in">Uint8Array</span> {
  <span class="hljs-comment">// Using an explicit constructor here is necessary to avoid relying on</span>
  <span class="hljs-comment">// `Array.prototype[Symbol.iterator]`, which can be mutated by users.</span>
  <span class="hljs-comment">// eslint-disable-next-line no-useless-constructor</span>
  <span class="hljs-keyword">constructor</span>(<span class="hljs-params">bufferOrLength, byteOffset, length</span>) {
    <span class="hljs-built_in">super</span>(bufferOrLength, byteOffset, length);
  }
}
</code></pre>
<p>Functions that create a Node.js buffer, such as <code>Buffer.from</code>, always <a target="_blank" href="https://github.com/nodejs/node/blob/d9554978740a75d7150d9b58d232a1de6b88f93c/lib/buffer.js#L476">return</a> a <code>FastBuffer</code>. But what is the point of having such a dumb class without any logic? Moreover, whenever we call <code>Buffer.from</code>, it actually returns <code>Buffer</code>, not <code>FastBuffer</code>.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> buffer = Buffer.from(<span class="hljs-string">'hello'</span>);
<span class="hljs-built_in">console</span>.log(buffer); <span class="hljs-comment">// Prints &lt;Buffer 68 65 6c 6c 6f&gt;</span>
</code></pre>
<p>Am I misleading you? Not really. The reason you see <code>Buffer</code> instead of <code>FastBuffer</code> in the console, and why <code>FastBuffer</code> doesn't contain any additional logic itself, is the <a target="_blank" href="https://github.com/nodejs/node/blob/01cf9bccdfa7fb31a8a1d91ae45e594a730e0427/lib/buffer.js#L130">prototype manipulations</a>.</p>
<pre><code class="lang-typescript">FastBuffer.prototype.constructor = Buffer;
Buffer.prototype = FastBuffer.prototype;
addBufferPrototypeMethods(Buffer.prototype);
</code></pre>
<p>What is the point of such a manipulation? One of the reasons is backward compatibility. Many Node.js APIs have been using <code>Buffer</code> for a long time, and changing code to <code>FastBuffer</code> might result in big problems. That's why it is easier just to swap prototypes and keep the same interfaces for the existing code.</p>
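<p>You can observe the result of this swap at runtime. A quick sketch, assuming a recent Node.js version:</p>

```javascript
import { Buffer } from 'node:buffer';

const buf = Buffer.from('hi');

// The instance's prototype is FastBuffer.prototype, which was assigned
// to Buffer.prototype, and its constructor was pointed back at Buffer.
console.log(Object.getPrototypeOf(buf) === Buffer.prototype); // true
console.log(buf.constructor === Buffer);                      // true
```

<p>That's why <code>console.log</code> labels the instance as <code>Buffer</code> even though the class that actually constructed it is <code>FastBuffer</code>.</p>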
<p>The <code>Buffer</code> class itself is where static methods like <code>from</code> and <code>alloc</code> reside, while instance methods are attached to the shared prototype.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723736378794/8525e8e1-cef9-48a0-807c-cf9b985857b6.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-nodejs-buffer-memory-allocation">Node.js buffer memory allocation</h2>
<p>I want to bring your attention to how Node.js allocates the memory for its buffer.</p>
<p>That's one of the most important differences between <code>Buffer</code> and <code>Uint8Array</code>, which you should be aware of.</p>
<p>The first thing you have to understand is that typed arrays have separate memory pools.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> array1 = <span class="hljs-keyword">new</span> TextEncoder().encode(<span class="hljs-string">'hello'</span>);
<span class="hljs-keyword">const</span> array2 = <span class="hljs-keyword">new</span> TextEncoder().encode(<span class="hljs-string">'world'</span>);

<span class="hljs-built_in">console</span>.log(array1.byteOffset); <span class="hljs-comment">// Prints 0</span>
<span class="hljs-built_in">console</span>.log(array2.byteOffset); <span class="hljs-comment">// Prints 0</span>
</code></pre>
<p>By memory pool, I mean the buffer where data is stored. Every time you create a new typed array, you create a dedicated instance of <code>ArrayBuffer</code> with it. At the same time, Node.js buffers <strong>share</strong> the same memory pool.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> buffer1 = Buffer.from(<span class="hljs-string">'hello'</span>);
<span class="hljs-keyword">const</span> buffer2 = Buffer.from(<span class="hljs-string">'world'</span>);

<span class="hljs-built_in">console</span>.log(buffer1.byteOffset); <span class="hljs-comment">// Prints 16</span>
<span class="hljs-built_in">console</span>.log(buffer2.byteOffset); <span class="hljs-comment">// Prints 24</span>
</code></pre>
<p>If you save anything smaller than half of the pool size (the pool is 8 kilobytes by default, so anything under 4 kilobytes), you end up with multiple data chunks in the same shared memory pool.</p>
<p>You see <code>16</code> in the console because there was already some pre-allocated data inside of the buffer pool. The byte offset of the second buffer directly depends on how much space the first buffer takes.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> buffer1 = Buffer.from(<span class="hljs-string">'controversial'</span>);
<span class="hljs-keyword">const</span> buffer2 = Buffer.from(<span class="hljs-string">'opinion'</span>);

<span class="hljs-built_in">console</span>.log(buffer1.byteOffset); <span class="hljs-comment">// Prints 16</span>
<span class="hljs-built_in">console</span>.log(buffer2.byteOffset); <span class="hljs-comment">// Prints 32</span>
</code></pre>
<p>The word <em>hello</em> is 5 characters long, and the word <em>controversial</em> is 13 characters long. The difference between them is 8 characters, and it is the exact difference in the offset of a second buffer between two examples.</p>
<p>I must say that there is a safer way to create a buffer: the <code>alloc</code> function, which returns a zero-filled buffer backed by its own memory. However, it doesn't change the fact that <code>Buffer</code> in general relies on a shared memory pool: <code>allocUnsafe</code> and <code>Buffer.from</code> with small inputs still use it.</p>
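<p>The difference between pooled and non-pooled buffers is visible through the size of their underlying <code>ArrayBuffer</code>. A sketch, assuming Node's default pool size of 8192 bytes:</p>

```javascript
import { Buffer } from 'node:buffer';

console.log(Buffer.poolSize); // 8192 by default

const pooled = Buffer.allocUnsafe(8); // small, so carved out of the shared pool
const isolated = Buffer.alloc(8);     // zero-filled, with its own ArrayBuffer

console.log(pooled.buffer.byteLength);   // the whole shared pool, 8192 by default
console.log(isolated.buffer.byteLength); // 8: nothing but this buffer
```

<p>If a buffer may hold sensitive data, <code>alloc</code> is the safer choice precisely because its backing store is not shared with anything else.</p>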
<h2 id="heading-nodejs-buffer-problems">Node.js Buffer problems</h2>
<p>Now that you understand what the Node.js <code>Buffer</code> looks like, it is time to discuss some of its problems. It's been an ongoing discussion since typed arrays became available. The main question is, "Why do we need to have the <code>Buffer</code> at all?"</p>
<h3 id="heading-it-is-not-compatible-with-other-platforms">It is not compatible with other platforms</h3>
<p>Imagine you're developing a library that you want to be available anywhere: browsers, Node.js, or Deno. If you try to write code using <code>Buffer</code>, it will break on any non-Node.js platform. The reason is simple: it is a Node.js-specific API.</p>
<p>Instead, you can use typed arrays like <code>Uint8Array</code> or <code>Int8Array</code>. They are <a target="_blank" href="https://tc39.es/ecma262/multipage/indexed-collections.html#table-49">part</a> of the ECMAScript standard, and all big platforms that run JavaScript support them.</p>
<h3 id="heading-it-violates-liskov-substitution-principle">It violates Liskov substitution principle</h3>
<p>The Liskov substitution principle (LSP) is one of the SOLID principles; it states that a subclass should be able to substitute for its parent class without causing unexpected behavior.</p>
<p>While Node.js <code>Buffer</code> extends <code>Uint8Array</code>, it overrides the <code>slice</code> method in a way that behaves differently, which violates the LSP.</p>
<p>When you use <code>slice</code> with typed arrays like <code>Uint8Array</code> instances, a new typed array with a new underlying buffer is created.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> array1 = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]);
<span class="hljs-keyword">const</span> array2 = array1.slice(<span class="hljs-number">0</span>, <span class="hljs-number">2</span>);

array2[<span class="hljs-number">0</span>] = <span class="hljs-number">4</span>;

<span class="hljs-comment">// Prints Uint8Array {0: 4, 1: 2}</span>
<span class="hljs-built_in">console</span>.log(array2);

<span class="hljs-comment">// Prints Uint8Array {0: 1, 1: 2, 2: 3}</span>
<span class="hljs-built_in">console</span>.log(array1);
</code></pre>
<p>After we mutate the first element of <code>array2</code>, it only affects the underlying buffer of <code>array2</code>. We don't see any changes to the first element in <code>array1</code>. The <code>slice</code> method of the Node.js <code>Buffer</code> is different. Instead of creating a copy, it creates a view into the same buffer.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> buffer1 = Buffer.from([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]);
<span class="hljs-keyword">const</span> buffer2 = buffer1.slice(<span class="hljs-number">0</span>, <span class="hljs-number">2</span>);

buffer2[<span class="hljs-number">0</span>] = <span class="hljs-number">4</span>;

<span class="hljs-comment">// Prints &lt;Buffer 04 02&gt;</span>
<span class="hljs-built_in">console</span>.log(buffer2);

<span class="hljs-comment">// Prints &lt;Buffer 04 02 03&gt;</span>
<span class="hljs-built_in">console</span>.log(buffer1);
</code></pre>
<p>You can see that <code>buffer2</code> has 4 as its first element, which is expected because we directly mutated this buffer. What is not expected is that the first element of <code>buffer1</code> has also changed to 4. The <code>slice</code> method is <a target="_blank" href="https://nodejs.org/api/buffer.html#bufslicestart-end">officially marked</a> as deprecated. However, as long as it is available as a <code>Buffer</code> method, it violates the LSP.</p>
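<p>For comparison, the standard <code>Uint8Array</code> already has a method with exactly this view-sharing behavior: <code>subarray</code>. A minimal sketch:</p>

```javascript
const source = new Uint8Array([1, 2, 3]);

// subarray() returns a view over the same underlying buffer,
// which is what Buffer's slice() does as well.
const view = source.subarray(0, 2);

view[0] = 4;

console.log(view[0]);   // 4
console.log(source[0]); // 4: the original is affected too
```

<p>This is also why the Node.js documentation points to <code>buf.subarray</code> as the replacement for the deprecated <code>slice</code>: the name matches the behavior, so no LSP surprise remains.</p>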
<h3 id="heading-buffer-has-more-security-implications">Buffer has more security implications</h3>
<p>The last but not least part is <code>Buffer</code> security. As we've mentioned before, Node.js buffer has a shared memory pool.</p>
<p>If we have some data stored in this shared memory pool, any code that runs in your application can access this memory. Imagine a scenario where you store sensitive data in the buffer, like a person's address, and I, as a bad actor, can access this data.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> addressBuffer = Buffer.from(<span class="hljs-string">'Person personal addres'</span>);
</code></pre>
<p>Here we have a buffer with personal information that we don't want to leak anywhere. We can gain access to it through the shared memory pool.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> hackyBuffer = Buffer.from(<span class="hljs-string">'1'</span>);

<span class="hljs-comment">// Prints the full underlying buffer</span>
<span class="hljs-built_in">console</span>.log(hackyBuffer.buffer);
</code></pre>
<p>Assuming we don't have access to the initial buffer object itself, we create <code>hackyBuffer</code>. The underlying buffer of <code>hackyBuffer</code> contains the person's private data. The only thing left is to interpret the hexadecimal data stored in the buffer into a human-readable format.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> hackyBuffer = Buffer.from(<span class="hljs-string">'1'</span>);

<span class="hljs-keyword">const</span> typedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>(hackyBuffer.buffer)
  .filter(<span class="hljs-function"><span class="hljs-params">value</span> =&gt;</span> value !== <span class="hljs-number">0</span>);
<span class="hljs-keyword">const</span> charCodes = <span class="hljs-built_in">Array</span>.from(typedArray);

<span class="hljs-comment">// Prints "//Person personal addres"</span>
<span class="hljs-built_in">console</span>.log(<span class="hljs-built_in">String</span>.fromCodePoint(...charCodes));
</code></pre>
<p>We created a view into the underlying buffer to be able to work with its content. Then, we filtered all empty values and created an array of character codes. At this point, it was automatically converted from a hexadecimal to a decimal numeric system.</p>
<p>The only thing left is to convert those numbers into a string. Since all of these numbers are basically a number representation of Unicode characters, we can use <code>fromCodePoint</code> to convert them back to characters.</p>
<p>If you're not comfortable with all these manipulations with character and numeric systems, I highly recommend reading the <a target="_blank" href="https://pavel-romanov.com/from-ascii-to-unicode-a-javascript-developers-guide-to-text-encoding">article</a> about different encoding schemes in JavaScript.</p>
<p>And that's how you "hack" Node.js buffer.</p>
<h2 id="heading-why-do-we-need-to-use-nodejs-buffer-at-all">Why do we need to use Node.js buffer at all</h2>
<p>After seeing all of these problems, it is reasonable to ask what the purpose of the Node.js buffer is at all.</p>
<p>It is a valid question, and it has been raised in the Node.js community <a target="_blank" href="https://github.com/nodejs/node/issues/41588">before</a>. The idea of preferring <code>Uint8Array</code> over <code>Buffer</code> is getting more and more attention, which is a good sign. Despite all the problems, <code>Buffer</code> has useful functions that haven't been shipped with typed arrays yet. For example, we can convert a buffer to a <code>hex</code> or <code>base64</code> string.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Buffer } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:buffer'</span>;

<span class="hljs-keyword">const</span> buffer = Buffer.from(<span class="hljs-string">'hello'</span>);

<span class="hljs-comment">// Prints 68656c6c6f</span>
<span class="hljs-built_in">console</span>.log(buffer.toString(<span class="hljs-string">'hex'</span>));
</code></pre>
<p>There are libraries that bring the same functionality to <code>Uint8Array</code>, but that is one more dependency in your project, which is not always worth it.</p>
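<p>That said, a hex conversion in particular fits in a few lines of plain standard JavaScript with no dependency at all. A sketch, assuming a runtime with the global <code>TextEncoder</code> (Node.js 11+ and all modern browsers):</p>

```javascript
const bytes = new TextEncoder().encode('hello');

// Convert each byte to a two-character hexadecimal string and join them.
const hex = Array.from(bytes, (byte) => byte.toString(16).padStart(2, '0')).join('');

console.log(hex); // 68656c6c6f, the same output as buffer.toString('hex')
```

<p>For one-off conversions like this, the standard library is often enough; a dedicated dependency pays off only when you need many such encodings.</p>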
<h2 id="heading-conclusion">Conclusion</h2>
<p><code>Buffer</code> is a Node.js abstraction that enables you to create and manage buffers through a common interface. <code>Buffer</code> extends <code>Uint8Array</code> and is essentially not a buffer but a view into an underlying <code>ArrayBuffer</code>.</p>
<p>While <code>Buffer</code> extends <code>Uint8Array</code>, it provides extra methods on top of the typed array, which makes it unique.</p>
<p>Unlike regular typed arrays, Node.js buffer has a shared memory pool. It means that if you create two small buffers, they share the same memory space. This design with a shared memory pool leads to potential vulnerabilities because it is possible to get data from one buffer through a completely unrelated buffer.</p>
<p>Other problems of <code>Buffer</code> include limited portability, because it is a Node.js-specific API, and a violation of the Liskov substitution principle from SOLID, where the superclass method and the subclass method behave differently.</p>
<p>Despite all of these problems, the Node.js <code>Buffer</code> still might be useful because it provides the functionality that typed arrays simply don't have.</p>
]]></content:encoded></item><item><title><![CDATA[Uint8Array vs DataView: Choosing the Right Buffer View in JavaScript]]></title><description><![CDATA[In the previous article, we got familiar with JavaScript buffers. The lowest possible implementation of a buffer in JavaScript is the ArrayBuffer class. This class is read-only, meaning we don't have any API to write data inside the buffer.
To change...]]></description><link>https://pavel-romanov.com/uint8array-vs-dataview-choosing-the-right-buffer-view-in-javascript</link><guid isPermaLink="true">https://pavel-romanov.com/uint8array-vs-dataview-choosing-the-right-buffer-view-in-javascript</guid><category><![CDATA[JavaScript]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Wed, 07 Aug 2024 14:26:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1723039048463/ba5940a9-edd2-4aef-ac92-827ad918119f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the previous article, we got familiar with <a target="_blank" href="https://pavel-romanov.com/javascript-buffers-explained-why-they-matter-and-how-to-use-them">JavaScript buffers</a>. The lowest possible implementation of a buffer in JavaScript is the <code>ArrayBuffer</code> class. This class is read-only, meaning we don't have any API to write data inside the buffer.</p>
<p>To change buffer content, we have two options: data views and typed arrays. In this article, we'll talk about data views, typed arrays, and their differences.</p>
<h2 id="heading-data-views">Data views</h2>
<p>Data view is the abstraction that allows you to change the content of a buffer. It acts like a key to a locked door. You can't go inside without having a key, but once you have it, feel free to come in.</p>
<p>The same is true for the relation between buffer and data view. Data view is like a key that allows you to write data inside a buffer. Data view is represented by the <code>DataView</code> class in JavaScript.</p>
<p>It holds the key to the underlying <code>ArrayBuffer</code> instance where data is stored.</p>
<p>You can see an example in the code snippet below.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">8</span>);
<span class="hljs-keyword">const</span> view = <span class="hljs-keyword">new</span> <span class="hljs-built_in">DataView</span>(buffer);

<span class="hljs-built_in">console</span>.log(view.buffer.byteLength); <span class="hljs-comment">// Prints 8</span>

view.buffer.resize(<span class="hljs-number">5</span>);

<span class="hljs-built_in">console</span>.log(buffer.byteLength); <span class="hljs-comment">// Prints 5</span>
</code></pre>
<p>Notice that the <code>buffer</code> size changed after the buffer that view references was resized. This means that the view references the same buffer that we passed into the <code>DataView</code> class constructor.</p>
<p>When we want to write into a buffer using data view, we call one of the methods that set different types of values.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">16</span>);
<span class="hljs-keyword">const</span> view = <span class="hljs-keyword">new</span> <span class="hljs-built_in">DataView</span>(buffer);

view.setUint8(<span class="hljs-number">0</span>, <span class="hljs-number">0x0F</span>);

<span class="hljs-built_in">console</span>.log(view.getUint8(<span class="hljs-number">0</span>)); <span class="hljs-comment">// Prints 15</span>
</code></pre>
<p>When we call the <code>setUint8(0, 0x0F)</code> method, it writes the value <code>0x0F</code> into the buffer using 1 byte for it.</p>
<h2 id="heading-typed-arrays">Typed arrays</h2>
<p>The second option to change buffer data is typed arrays. Don't be confused by the <strong>array</strong> part. Typed arrays aren't arrays but <strong>array-like</strong> structures.</p>
<p>What this means is that when you pass a typed array to the <code>Array.isArray</code> function, it returns <code>false</code>. At the same time, typed arrays provide many array-like methods: <code>at</code>, <code>fill</code>, <code>map</code>, <code>reduce</code>, etc. That's why typed arrays are called array-like structures.</p>
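<p>A quick sketch of this array-like behavior:</p>
<pre><code class="lang-typescript">const u8 = new Uint8Array([1, 2, 3]);

// Not a real array.
console.log(Array.isArray(u8)); // Prints false

// But array-like methods still work, and map
// returns a new typed array of the same type.
const doubled = u8.map((x) => x * 2);
console.log(doubled); // Prints Uint8Array {0: 2, 1: 4, 2: 6}
</code></pre>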
<p>In JavaScript, we have many different typed arrays such as:</p>
<ul>
<li><p><code>Uint8Array</code></p>
</li>
<li><p><code>Int8Array</code></p>
</li>
<li><p><code>Uint16Array</code></p>
</li>
<li><p><code>Int16Array</code></p>
</li>
<li><p><code>Float32Array</code></p>
</li>
<li><p><code>Float64Array</code></p>
</li>
</ul>
<p>And others. Every typed array provides the ability to modify the underlying buffer. But what is the difference between all those typed arrays and when to use which?</p>
<p>To understand this topic better, I highly recommend reading the article about <a target="_blank" href="https://pavel-romanov.com/from-ascii-to-unicode-a-javascript-developers-guide-to-text-encoding">different types of text encoding schemes in JavaScript</a>. Knowing text encoding schemes makes it much easier to understand the topic of typed arrays.</p>
<p>There are 3 main characteristics of the typed arrays you should know about:</p>
<ul>
<li><p>Signed and unsigned typed arrays</p>
</li>
<li><p>Number of bytes required to store a single value inside a typed array</p>
</li>
<li><p>Type of value that a typed array can store</p>
</li>
</ul>
<p>Let's look at each of them in more detail.</p>
<h3 id="heading-signed-and-unsigned-typed-arrays">Signed and unsigned typed arrays</h3>
<p>The meaning of the data that the buffer contains heavily depends on the context in which it is interpreted.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> signedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Int8Array</span>([<span class="hljs-number">0xFF</span>, <span class="hljs-number">0x75</span>, <span class="hljs-number">0x6E</span>]);
<span class="hljs-keyword">const</span> unsignedArray = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">0xFF</span>, <span class="hljs-number">0x75</span>, <span class="hljs-number">0x6E</span>]);

<span class="hljs-comment">// Prints Int8Array {0: -1, 1: 117, 2: 110}</span>
<span class="hljs-built_in">console</span>.log(signedArray);

<span class="hljs-comment">// Prints Uint8Array {0: 255, 1: 117, 2: 110}</span>
<span class="hljs-built_in">console</span>.log(unsignedArray);
</code></pre>
<p>Quick note, if you're not comfortable with those <code>0xFF</code> and other hexadecimal values, check out the article in which we dive into the <a target="_blank" href="https://pavel-romanov.com/numeric-systems-in-javascript-from-fundamentals-to-application">hexadecimal numeric system in JavaScript</a>.</p>
<p>You can see that both structures are almost identical. The only difference is that the first element with a value of <code>0xFF</code> is treated as <code>-1</code> in the signed array and as <code>255</code> in the unsigned array. That's the whole point of signed vs. unsigned: the range of <em>possible</em> values is different.</p>
<p>Signed arrays can contain both negative and positive values. Unsigned arrays can only contain non-negative values. The specific range of values is dictated by how many bytes are used to store a single item.</p>
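<p>A small sketch of what this range difference means in practice: values outside the signed 8-bit range from <code>-128</code> to <code>127</code> wrap around.</p>
<pre><code class="lang-typescript">const i8 = new Int8Array([127, 128, -129]);

// Prints Int8Array {0: 127, 1: -128, 2: 127}
console.log(i8);
</code></pre>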
<h3 id="heading-number-of-bytes-required-to-store-a-value">Number of bytes required to store a value</h3>
<p>Typed arrays store values differently. To be more precise, each typed array allocates a different number of bytes to store a single item.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723039758036/c896e446-4126-406c-8180-8b407f1a9122.jpeg" alt="The difference between how typed arrays store values" class="image--center mx-auto" /></p>
<p>For the same piece of data, different typed arrays allocate a different number of bytes.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://x.com/pavl_ro/status/1820489696801444057">https://x.com/pavl_ro/status/1820489696801444057</a></div>
<p> </p>
<p>To better understand when to use which typed array, you have to understand the data you're working with.</p>
<p>If the data is not expected to exceed the range from <code>0</code> to <code>255</code>, then it is better to use <code>Uint8Array</code>. It operates in this exact range, and it is one of the most memory-efficient type arrays.</p>
<p>If you expect to work with data where some elements can have a value higher than <code>255</code>, it is better to use <code>Uint16Array</code> or <code>Uint32Array</code>. If you try to write a larger value into an 8-bit unsigned array, only the lowest 8 bits are kept: the value wraps around rather than being capped. <code>0xFFF</code> happens to keep <code>0xFF</code>, which is <code>255</code>, but <code>256</code> would become <code>0</code>.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> u8Array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">0xFFF</span>]);
<span class="hljs-keyword">const</span> u16Array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint16Array</span>([<span class="hljs-number">0xFFF</span>]);

<span class="hljs-built_in">console</span>.log(u8Array) <span class="hljs-comment">// Prints Uint8Array {0: 255}</span>
<span class="hljs-built_in">console</span>.log(u16Array) <span class="hljs-comment">// Prints Uint16Array {0: 4095}</span>
</code></pre>
<h3 id="heading-different-value-types">Different value types</h3>
<p>Different typed arrays are meant to store different datatypes. So far, we've talked only about integer typed arrays. An integer, in this case, is a number without a fractional part, like <code>3</code> or <code>120</code>. But what if you want to store numbers with a fractional part, like <code>3.14</code>?</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> u8array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">3.14</span>]);

<span class="hljs-comment">// Prints Uint8Array {0: 3}</span>
<span class="hljs-built_in">console</span>.log(u8Array);
</code></pre>
<p>The part after the decimal point gets cut off, and you only see <code>3</code>. To store floating point numbers, you have to use the dedicated typed arrays <code>Float32Array</code> or <code>Float64Array</code>.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> float32array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Float32Array</span>([<span class="hljs-number">3.14</span>]);

<span class="hljs-comment">// Prints Float32Array {0: 3.140000104904175}</span>
<span class="hljs-built_in">console</span>.log(float32array);
</code></pre>
<p>Now you can see the digits after the decimal point.</p>
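<p>The trailing digits appear because <code>Float32Array</code> stores only about 7 significant decimal digits, so <code>3.14</code> gets rounded to the nearest 32-bit float. <code>Float64Array</code> uses the same 64-bit format as regular JavaScript numbers, so the value survives unchanged.</p>
<pre><code class="lang-typescript">const float64array = new Float64Array([3.14]);

// Prints Float64Array {0: 3.14}
console.log(float64array);
</code></pre>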
<h2 id="heading-what-is-the-difference-between-a-data-view-and-a-typed-array">What is the difference between a data view and a typed array</h2>
<p>It looks like we have 2 things that do basically the same job. Well, it is partially true, because both <code>DataView</code> and <code>TypedArray</code> are buffer views.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723040152869/b9d6ab33-231d-4d0f-977e-e08800d0b1f1.jpeg" alt class="image--center mx-auto" /></p>
<p>As you can see, both abstractions give you the same power to edit a buffer's content. At the same time, there are differences in <em>how</em> they edit it.</p>
<h3 id="heading-scope-of-manipulated-data">Scope of manipulated data</h3>
<p>When dealing with typed arrays, we always work with a specific type of data. For example, using <code>Uint8Array</code> means that we only work with data range between <code>0</code> and <code>255</code>.</p>
<p>Typed arrays are quite convenient when we work only with a single type of data per buffer. However, this is not the case if we want to write different types of data inside a buffer. It is still possible to use typed arrays for it, but the code becomes more tedious.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">10</span>);
<span class="hljs-keyword">const</span> u8Array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>(buffer, <span class="hljs-number">0</span>, <span class="hljs-number">3</span>);
<span class="hljs-keyword">const</span> u16Array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint16Array</span>(buffer, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>);

u8Array.fill(<span class="hljs-number">0xFF</span>);
u16Array.fill(<span class="hljs-number">0xFFF</span>);

<span class="hljs-comment">// Prints Uint8Array {0: 255, 1: 255, 2: 255}</span>
<span class="hljs-built_in">console</span>.log(u8Array);

<span class="hljs-comment">// Prints Uint16Array {0: 4095, 1: 4095, 2: 4095}</span>
<span class="hljs-built_in">console</span>.log(u16Array);
</code></pre>
<p>Using <code>DataView</code> is more convenient in such cases because you're not constrained by any particular type of data.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">10</span>);
<span class="hljs-keyword">const</span> view = <span class="hljs-keyword">new</span> <span class="hljs-built_in">DataView</span>(buffer);

view.setUint8(<span class="hljs-number">0</span>, <span class="hljs-number">0xFF</span>);
view.setUint16(<span class="hljs-number">1</span>, <span class="hljs-number">0xFFFF</span>);
</code></pre>
<p>You don't need to work with many abstractions and variables. <code>ArrayBuffer</code> and <code>DataView</code> are enough for this task.</p>
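<p>As a sketch, imagine packing a tiny binary record with a hypothetical layout (the field names here are made up for illustration): a 1-byte type tag followed by a 2-byte length.</p>
<pre><code class="lang-typescript">const buffer = new ArrayBuffer(3);
const view = new DataView(buffer);

view.setUint8(0, 42);    // type tag, 1 byte
view.setUint16(1, 1000); // length field, 2 bytes

console.log(view.getUint8(0));  // Prints 42
console.log(view.getUint16(1)); // Prints 1000
</code></pre>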
<h3 id="heading-the-difference-in-how-a-data-view-and-a-typed-array-treat-endianness">The difference in how a data view and a typed array treat Endianness</h3>
<p>If you're not familiar with Endianness, check out this section of the article on <a target="_blank" href="https://pavel-romanov.com/from-binary-to-code-why-javascript-devs-need-to-know-bits-and-bytes#heading-bytes-and-memory">bits and bytes in JavaScript</a>.</p>
<p>By default, typed arrays work with the native endianness of your platform. For example, if you have little-endian hardware, then typed arrays read and write data in little-endian byte order.</p>
<p>Most modern machines use little-endian byte order. However, it doesn't mean there is no place for big-endian.</p>
<p>In <a target="_blank" href="https://stackoverflow.com/questions/7869752/javascript-typed-arrays-and-endianness">this</a> StackOverflow question, a user faces a problem where a big-endian WebGL file is interpreted as little-endian by typed arrays. It happens because the native system byte order is little-endian.</p>
<p>To solve such issues, we can use <code>DataView</code>. Data views allow you to control how the view treats the buffer's byte order.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">10</span>);
<span class="hljs-keyword">const</span> view = <span class="hljs-keyword">new</span> <span class="hljs-built_in">DataView</span>(buffer);

view.setUint16(<span class="hljs-number">0</span>, <span class="hljs-number">0xFFF</span>, <span class="hljs-literal">true</span>);
</code></pre>
<p>When we pass <code>true</code> as the third argument to <code>DataView</code> instance methods, it means that we intend to store the data in little-endian byte order. By default, <code>DataView</code> uses big-endian byte order to store the data.</p>
<p>Notice that methods that set 8-bit values, like <code>setUint8</code> and <code>setInt8</code>, don't have this flag. The reason is simple: there is only 1 byte, so byte order is irrelevant.</p>
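<p>Here is a short sketch of how the flag changes the byte layout. Writing <code>0x0102</code> in little-endian order puts the low byte first, so reading the same 2 bytes back in big-endian order produces a different number.</p>
<pre><code class="lang-typescript">const buffer = new ArrayBuffer(2);
const view = new DataView(buffer);

// Little-endian write: byte 0 is 0x02, byte 1 is 0x01.
view.setUint16(0, 0x0102, true);

console.log(view.getUint8(0));        // Prints 2
console.log(view.getUint16(0));       // Prints 513 (0x0201, big-endian read)
console.log(view.getUint16(0, true)); // Prints 258 (0x0102, little-endian read)
</code></pre>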
<h2 id="heading-conclusion">Conclusion</h2>
<p>Views allow you to change the content of a buffer. There are two types of views: typed arrays and data views.</p>
<p>Data view is represented by the <code>DataView</code> class in JavaScript. There is only one such class, and through it we can read and write many different value types in a buffer.</p>
<p>On the other hand, typed arrays are represented by multiple different classes that differ by:</p>
<ul>
<li><p>Signed and unsigned type</p>
</li>
<li><p>Number of bytes required to store a single buffer item</p>
</li>
<li><p>Type of data stored like integers, floats, big integers</p>
</li>
</ul>
<p>The difference between data views and typed arrays lies in how flexible you want to be when working with buffers. If you primarily work with a single type of data and are fine without much flexibility, then typed arrays are your choice.</p>
<p>On the other hand, if you need a more flexible solution or you want to work with different types of data inside a single buffer, then data views are the way to go.</p>
]]></content:encoded></item><item><title><![CDATA[JavaScript Buffers Explained: Why They Matter and How to Use Them]]></title><description><![CDATA[What do video processing, 3D and 2D graphics, cryptography, and network protocols have in common? They all use buffers.
It is a low-level abstraction that makes it possible to tap into efficient algorithms and direct memory management. There are no t...]]></description><link>https://pavel-romanov.com/javascript-buffers-explained-why-they-matter-and-how-to-use-them</link><guid isPermaLink="true">https://pavel-romanov.com/javascript-buffers-explained-why-they-matter-and-how-to-use-them</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Tue, 30 Jul 2024 15:08:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1722350449392/7e091a2a-4f21-46a5-83b3-a2f0ab606036.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>What do video processing, 3D and 2D graphics, cryptography, and network protocols have in common? They all use buffers.</p>
<p>It is a low-level abstraction that makes it possible to tap into efficient algorithms and direct memory management. There are no tools in JavaScript that offer the same level of low-level programming as buffers.</p>
<p>This article gives you the answer to the question, "Why do I need to learn buffers?" It walks you through how buffer works on a conceptual level and then goes into detail on how to use it in JavaScript.</p>
<h2 id="heading-why-you-need-to-understand-buffers">Why you need to understand buffers</h2>
<p>Buffers became an irreplaceable part of modern JavaScript. Some tasks simply can't be done without using them. Here are just a few areas where buffers play a crucial part.</p>
<h3 id="heading-working-with-files">Working with files</h3>
<p>You might not realize it yet, but when you work with files in JavaScript, you work with buffers. Any file on your computer is just a set of 0s and 1s. In other words, it is a raw binary.</p>
<p>The <code>File</code> JavaScript class is just an abstraction over this raw binary to make things easier to deal with. Don't believe me? Let's create a file from absolute zero using <code>ArrayBuffer</code> as a raw binary base for a file.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">8</span>);
<span class="hljs-keyword">const</span> file = <span class="hljs-keyword">new</span> File([buffer], <span class="hljs-string">'new-file'</span>);

<span class="hljs-comment">// File size - 8 bytes, File name - new-file</span>
<span class="hljs-built_in">console</span>.log(file.size, file.name);
</code></pre>
<p>You can even save this file on your computer.</p>
<h3 id="heading-image-video-and-audio-processing">Image, video, and audio processing</h3>
<p>Buffers are irreplaceable when it comes to media file processing because you have access to the raw binary of a file. A raw binary contains a bunch of different information like compression method, color depth, metadata, etc.</p>
<p>The <a target="_blank" href="https://www.npmjs.com/package/music-metadata">music-metadata</a> library is a JavaScript library for Node.js that allows you to extract a lot of additional information from an audio file:</p>
<ul>
<li><p>Codec</p>
</li>
<li><p>Bitrate</p>
</li>
<li><p>Frame headers</p>
</li>
</ul>
<p>And so much more. There is simply no ready-to-use API to get all this info other than parsing the raw binary itself.</p>
<p>There is also the <code>AudioBuffer</code> class <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer">provided</a> by the Web Platform. All audio data in the audio buffer is stored in the same old <code>ArrayBuffer</code>. How do we know that? Instances of the <code>AudioBuffer</code> class have a method called <code>getChannelData</code>, and it returns the underlying data as a typed array.</p>
<h3 id="heading-network-protocols">Network protocols</h3>
<p>On the Internet, we communicate using different network protocols:</p>
<ul>
<li><p>Transmission Control Protocol (TCP)</p>
</li>
<li><p>File Transfer Protocol (FTP)</p>
</li>
<li><p>Hypertext Transfer Protocol (HTTP)</p>
</li>
</ul>
<p>Notice that all mentioned protocols have a word related to transfer. But what do we transfer? The answer is raw binary. And you already know when there is a raw binary, there is a buffer.</p>
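<p>Even a plain-text HTTP request line is raw binary on the wire. A minimal sketch using the standard <code>TextEncoder</code>:</p>
<pre><code class="lang-typescript">// The request line an HTTP client sends is just bytes.
const bytes = new TextEncoder().encode('GET / HTTP/1.1\r\n');

console.log(bytes instanceof Uint8Array); // Prints true
console.log(bytes.length);                // Prints 16
</code></pre>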
<p>One of the most popular WebSocket libraries, <a target="_blank" href="https://github.com/socketio/socket.io">Socket.io,</a> uses buffers heavily <a target="_blank" href="https://github.com/search?q=repo:socketio/socket.io+buffer&amp;type=code">across</a> the library. It also makes it possible to <a target="_blank" href="https://socket.io/blog/introducing-socket-io-1-0/#binary-support">send a raw binary</a>. There is no way to support such features without using buffers.</p>
<h3 id="heading-3d-and-2d-graphics-game-development">3D and 2D graphics, game development</h3>
<p>3D and 2D graphics are never easy to get right. There are many things you have to be aware of to produce good graphics or a good game. One of them is the texture: simply the surface of a model.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1722351120443/e8a4fbec-0dba-4ef3-8f60-98d022997a76.webp" alt class="image--center mx-auto" /></p>
<p>Source: <a target="_blank" href="https://3d-ace.com/3d-modeling/">https://3d-ace.com/3d-modeling/</a></p>
<p>You can clearly see the difference between a plain model on the left and a model with texture on the right. There are dedicated 3D libraries in JavaScript that make it possible to work with all the complexity of 3D graphics, and of course, they use buffers for it.</p>
<p>Here is <a target="_blank" href="https://github.com/BabylonJS/Babylon.js/blob/f38e0ede2e25cdaf2ce8b5ba26615d1747c3f7b1/packages/dev/core/src/Misc/dds.ts#L232">an example</a> of the DDS file format decompression abstraction provided by the <a target="_blank" href="https://github.com/BabylonJS/Babylon.js">BabylonJS</a> library. It is a special file format for model textures. You can instantly notice one thing—it uses buffers <strong>a lot</strong>. The reason is simple: we're working with decompression of the <em>raw binary</em>.</p>
<h3 id="heading-cryptography">Cryptography</h3>
<p>Speed and efficiency are among the main requirements for code in the cryptography industry. Buffers in JavaScript offer exactly that:</p>
<ul>
<li><p>Buffers provide a way to represent fixed-length binary data, which is perfect for crypto keys, hashes, or encrypted data.</p>
</li>
<li><p>It is possible to allocate memory more precisely and in smaller amounts for the same data using buffers.</p>
</li>
<li><p>Working with the raw binary makes it easier to use bitwise operators, which are typically faster than higher-level operations.</p>
</li>
</ul>
<p>In the previous article about <a target="_blank" href="https://pavel-romanov.com/from-binary-to-code-why-javascript-devs-need-to-know-bits-and-bytes">bits and bytes in JavaScript</a>, we talked about how much space the V8 JavaScript engine allocates for numbers.</p>
<p>In short, it is either 31 or 64 bits. For small integers without a fractional part, the engine allocates 31 bits. But what if we only work with numbers from 0 to 255? It'll still use 31 bits per number.</p>
<p>We can do it with almost 4 times less memory using a buffer.</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// 92 bits for numbers + array.</span>
<span class="hljs-keyword">const</span> numbers = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>];

<span class="hljs-comment">// 24 bits for numbers + typed array.</span>
<span class="hljs-keyword">const</span> u8Array = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]);
</code></pre>
<p>The <code>Uint8Array</code> uses <code>ArrayBuffer</code> under the hood to store the data. We'll talk about it in more detail in the upcoming article on buffer views.</p>
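<p>You can verify that through the <code>buffer</code> property that every typed array exposes:</p>
<pre><code class="lang-typescript">const u8Array = new Uint8Array([1, 2, 3]);

console.log(u8Array.buffer instanceof ArrayBuffer); // Prints true
console.log(u8Array.buffer.byteLength);             // Prints 3
</code></pre>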
<h2 id="heading-javascript-buffer-is-an-abstraction">JavaScript buffer is an abstraction</h2>
<p>A buffer is an abstraction. This abstraction consists of two specific parts: a memory chunk and a handle to this memory.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1722351516759/d6f86777-1a7e-4b7d-a305-6b8d4410d8ce.jpeg" alt="Two main parts of buffer abstraction" class="image--center mx-auto" /></p>
<p>A buffer can allocate a specific amount of memory and store a reference to this memory in JavaScript.</p>
<p>We're talking about direct memory allocation in a high-level language that is supposed to abstract us away from interacting with memory directly. It turns out that is not always possible.</p>
<p>The memory handle is the buffer JavaScript object we can directly interact with in our applications. If you're familiar with lower-level programming languages like C or C++, it looks pretty much like a safe version of a memory pointer.</p>
<h3 id="heading-javascript-buffer-representationarraybuffer">JavaScript buffer representation—ArrayBuffer</h3>
<p>We've talked about buffer structure and how it works in theory. The next step is to look at the particular buffer implementation in JavaScript. It is called <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer">ArrayBuffer</a>.</p>
<p>The <code>ArrayBuffer</code> is a class that we use to create a buffer.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">4</span>);
</code></pre>
<p>The number <code>4</code> we passed to the constructor indicates how many <em>bytes</em> you want to allocate in memory for this buffer. You can use the <code>byteLength</code> property to check if the buffer is of the right size.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">4</span>);

<span class="hljs-built_in">console</span>.log(buffer.byteLength); <span class="hljs-comment">// Prints 4</span>
</code></pre>
<p>By default, <code>ArrayBuffer</code> instances are immutable. We can't directly write any data inside the buffer. But what is the point of having a chunk of memory and not being able to work with it? That's why we have <strong>views</strong>.</p>
<p>A view is a mechanism that makes it possible to modify a buffer content. We'll talk about views in more detail in the upcoming article.</p>
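<p>A quick sketch of this immutability: assigning to an index of an <code>ArrayBuffer</code> only creates a regular JavaScript property on the object and never touches the underlying memory, while a view actually reads the bytes.</p>
<pre><code class="lang-typescript">const buffer = new ArrayBuffer(4);

// This does NOT write into the buffer's memory.
buffer[0] = 255;

const view = new Uint8Array(buffer);
console.log(view[0]); // Prints 0, the underlying byte is untouched
</code></pre>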
<p>The only thing we can change on an <code>ArrayBuffer</code> itself is its size. To make a buffer resizable, we have to specify the <code>maxByteLength</code> option when creating it. The option tells the buffer that it can't grow beyond this number.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">4</span>, { maxByteLength: <span class="hljs-number">8</span> });

<span class="hljs-built_in">console</span>.log(buffer.byteLength); <span class="hljs-comment">// Prints 4</span>

buffer.resize(<span class="hljs-number">8</span>);

<span class="hljs-built_in">console</span>.log(buffer.byteLength); <span class="hljs-comment">// Prints 8</span>
</code></pre>
<p>If you try to resize the buffer to more than 8 bytes, the <code>resize</code> function throws a <code>RangeError</code> saying that the value is invalid.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">ArrayBuffer</span>(<span class="hljs-number">4</span>, { maxByteLength: <span class="hljs-number">8</span> });

<span class="hljs-comment">// RangeError: ArrayBuffer.prototype.resize: Invalid length parameter</span>
buffer.resize(<span class="hljs-number">10</span>);
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Buffers are essential when it comes to low-level things like:</p>
<ul>
<li><p>Video, image, and audio processing</p>
</li>
<li><p>3D and 2D graphics and game development</p>
</li>
<li><p>Cryptography</p>
</li>
<li><p>Network protocols</p>
</li>
<li><p>General work with files</p>
</li>
</ul>
<p>A buffer is an abstraction that consists of 2 main parts: memory allocated for this particular buffer and JavaScript handle to the memory. Allocated memory for the buffer is not subject to garbage collection, but the JavaScript handle is. Whenever the JavaScript handle gets destroyed, it frees the allocated memory.</p>
<p>In JavaScript, the lowest possible abstraction of a buffer is called <code>ArrayBuffer</code>. It is an immutable structure that can't be directly modified. That's why we use views to modify a buffer.</p>
]]></content:encoded></item><item><title><![CDATA[From ASCII to Unicode: A JavaScript Developer's Guide to Text Encoding]]></title><description><![CDATA[Have you ever seen things like ASCII or UTF-8? At least you should've seen the latter one. All modern IDEs and code editors display some sort of UTF when you work with files.

Most of the time, you pay little attention to it because it doesn't direct...]]></description><link>https://pavel-romanov.com/from-ascii-to-unicode-a-javascript-developers-guide-to-text-encoding</link><guid isPermaLink="true">https://pavel-romanov.com/from-ascii-to-unicode-a-javascript-developers-guide-to-text-encoding</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Thu, 18 Jul 2024 16:34:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1721320423158/fb9437be-2e69-42fc-9ae5-66e7d231eee9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever seen things like ASCII or UTF-8? At least you should've seen the latter one. All modern IDEs and code editors display some sort of UTF when you work with files.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721318784916/780f0d6c-a6bd-4784-ad13-a0689daeda4c.png" alt="How modern IDEs display file encoding" class="image--center mx-auto" /></p>
<p>Most of the time, you pay little attention to it because it doesn't directly affect your work. What if I told you it's the opposite: understanding what exactly UTF-8 and the other UTFs mean directly impacts your job?</p>
<p>Just imagine that simply changing UTF-8 to UTF-16 increases your file size 2 times. Do you want to know why?</p>
<p>In this article, we'll dive into what the ASCII and UTFs are, how we're using them on a daily basis, and what problems misusing an encoding scheme can cause.</p>
<h2 id="heading-bytes-and-characters">Bytes and characters</h2>
<p>When you look at any text on modern devices, you see words. Each of those words consists of individual characters. Have you ever thought about what a character is?</p>
<p>There are at least two major parts to it. The first one is how the character looks. It could be the same character "A" but in different fonts, or in the same font but in regular, bold, or italic variants.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721318953037/2ec07059-5ad2-4238-8ccb-8549ca96a3ea.jpeg" alt="Character &quot;A&quot; in 3 different fonts" class="image--center mx-auto" /></p>
<p>The thing we see is called a glyph. Different fonts have different glyphs for the same character. You can compare a glyph to an application frontend. We can display the same value that comes from a backend in different shapes and forms. The same is true for a glyph.</p>
<p>But what is a backend in this case? Let's call it a character code. The character code is a unit of information that allows different glyphs to represent the same character.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319159388/1261ab85-bd88-450e-8973-c8838332e88a.jpeg" alt="Concept of character code shown on character A" class="image--center mx-auto" /></p>
<p>In the <a target="_blank" href="https://pavel-romanov.com/from-binary-to-code-why-javascript-devs-need-to-know-bits-and-bytes">previous article</a>, we discussed bits and bytes, and how understanding them can help you write better JavaScript code.</p>
<p>We can apply this knowledge directly to characters to understand them better. Each character you see on the internet has an actual size in bytes. Knowing how many characters a file contains makes it easy to calculate its size. If there are 1,000,000 characters and each character takes 1 byte to store, then the file size is 1,000,000 bytes or 1 megabyte.</p>
<p>Another application of this knowledge is related to the <a target="_blank" href="https://pavel-romanov.com/numeric-systems-in-javascript-from-fundamentals-to-application">binary numeric system</a> and how bytes can be represented in binary. Here is how a single byte looks in the binary numeric system: <code>11111111</code>. Any group of eight binary digits represents a single byte.</p>
<p>A character code can be represented in a binary numeric system as well. Here is how the popular "Hello world" phrase looks in binary.</p>
<pre><code class="lang-plaintext">01101000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100
</code></pre>
<p>You can tell how many bytes there are by counting the groups of 8 digits. In this case, there are 11 groups, which means there are 11 bytes.</p>
<p>When dealing with binary, context is king. Context is what makes the difference between bare 0s and 1s and the commands of some programming language or the contents of a file. An encoding scheme is a context that allows us to turn a set of binary numbers into a human-readable phrase.</p>
<p>The binary encoding of the “Hello world” phrase above actually uses one of those schemes, called the American Standard Code for Information Interchange (<a target="_blank" href="https://www.ascii-code.com">ASCII</a>).</p>
<h2 id="heading-from-ascii-to-unicode">From ASCII to Unicode</h2>
<p>ASCII is an encoding scheme formalized in 1967. It remains one of the most significant, if not the most significant, standards in the tech industry. The reason is simple: it is the first widespread encoding standard specifically developed for the tech industry.</p>
<p>ASCII has two versions: the base version, which contains 128 characters, and the extended version, which contains 256 characters. Both nicely fit in 8 bits or one byte of information.</p>
<p>This is how you encode “A”, “9”, and “/” characters in ASCII.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319405255/4a6bc279-4905-440d-a47b-a549c1c80938.jpeg" alt="Example of ASCII encoding for A, 9, and / characters" class="image--center mx-auto" /></p>
<p>As the name suggests, the encoding was developed in the US and designed specifically for English. 256 characters are enough to encode most English words and sentences.</p>
<p>In fact, ASCII became so popular that people from other countries started using it. But you can’t easily use an English-based standard to encode other languages because of the limited character set.</p>
<p>Trying to solve the problem of the limited ASCII character set, people created supersets of ASCII, like the Japanese Industrial Standard (<a target="_blank" href="https://en.wikipedia.org/wiki/JIS_encoding">JIS</a>), to be able to use the popular format with customizations for their needs.</p>
<p>However, the problem was clear: ASCII is too limited. In 1988, the successor of ASCII was created, called <a target="_blank" href="https://home.unicode.org/about-unicode/">Unicode</a>.</p>
<p>Unlike ASCII, Unicode was initially developed as a 2-byte encoding. In the extended version, 1-byte ASCII can encode up to 256 characters, while 2-byte Unicode can encode 65,536 characters.</p>
<p>That’s a decent leap. This version of Unicode allows the encoding of most of the widely used characters in the most popular languages.</p>
<p>A unique feature of Unicode is its extendability. It is not a fixed standard, and adding new languages can be easily achieved if there is a demand.</p>
<h2 id="heading-unicode">Unicode</h2>
<p>Unicode is an engineering masterpiece. Let’s look closely at what makes it so unique and why it became so important.</p>
<h3 id="heading-backward-compatibility-with-ascii">Backward compatibility with ASCII</h3>
<p>One of the main goals of creating Unicode was to create a standard for encoding a vast amount of different information. The goal has been achieved.</p>
<p>At the time of Unicode's creation, a lot of information was produced using ASCII. It wasn’t an option to just drop support of everything people had created so far and adopt a new standard.</p>
<p>That’s why the first 128 characters of Unicode are the same as ASCII characters. This makes Unicode backward compatible with ASCII, and the transition from ASCII to Unicode is seamless.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319586335/1ae27b7b-60cf-45ee-9901-92fc915874b5.jpeg" alt="Example of how ASCII and Unicode are backward compatible" class="image--center mx-auto" /></p>
<h3 id="heading-unicodes-transformation-format">Unicode's transformation format</h3>
<p>Unicode was initially developed as a 16-bit or 2-byte encoding standard. This amount of information was enough to encode most of the popular languages.</p>
<p>However, it is not enough to encode all possible information. Old scriptures, dead languages, and emojis are just a few examples of information missing from the initial standard. That’s why it is now not a 16-bit encoding but a 21-bit encoding and has more room for growth if needed.</p>
<p>This means that a single character may need up to 21 bits to encode.</p>
<p>But what if your text is in plain English, contains no special characters, and can be encoded using the first 128 characters of Unicode? It would be nicer to encode it the ASCII way, where each character takes only 8 bits and roughly 2.5 times less space to store.</p>
<p>Unicode is a flexible encoding, and thanks to different Unicode transformation formats, or UTFs, you can encode text using different numbers of bits. There are three major formats: UTF-8, UTF-16, and UTF-32. The number indicates the minimal number of bits used to encode and store a single character.</p>
<p>You can use UTF-8 for plain English text. It takes exactly 8 bits to store each such character in this encoding. But what if your text contains characters beyond the scope of the first 128? You can still use UTF-8 because it is a flexible standard, and depending on the character, it allocates more space to store it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319688239/8312bfbf-fd11-4bdf-8f2f-f900a3bf6434.jpeg" alt="UTF-8 memory allocation for different characters" class="image--center mx-auto" /></p>
<p>In this example, we use UTF-8 to encode all characters. Each character in the word “Hello” is encoded using only 1 byte. However, the Thai Ko Kai (ก) character is encoded with 3 bytes using the same encoding scheme.</p>
<p>However, Thai characters don't always occupy 3 bytes of memory. When using UTF-16, each of them takes only 2 bytes to store.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319734453/b21fe8c9-7ee6-4d98-9da3-b8d320b85c73.jpeg" alt="Memory allocation difference between UTF-8 and UTF-16" class="image--center mx-auto" /></p>
<p>That’s why the UTF-16 and UTF-32 transformation formats are still valuable and not going anywhere despite vast UTF-8 adoption.</p>
<p>If you know that the text you’re dealing with is fully written in Thai, it doesn’t make sense to use UTF-8 as the encoding. It will work, but it takes 1.5 times the space of UTF-16 (3 bytes per character instead of 2).</p>
<p>It works in the other direction as well. Using UTF-16 for text in plain English makes it two times larger in byte size than using UTF-8 because each character is encoded with at least 2 bytes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319771000/c3429023-e539-4ae7-b642-27c4340b570e.jpeg" alt="Memory allocation difference between all UTF formats" class="image--center mx-auto" /></p>
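<p>In Node.js, you can observe this size difference directly with <code>Buffer.byteLength</code> (a quick check; <code>Buffer</code> is Node-specific, and in browsers <code>TextEncoder</code> gives you the UTF-8 size):</p>

```javascript
// The same text measured in different Unicode transformation formats
console.log(Buffer.byteLength('Hello', 'utf8'));    // 5  — 1 byte per character
console.log(Buffer.byteLength('Hello', 'utf16le')); // 10 — 2 bytes per character

console.log(Buffer.byteLength('กกก', 'utf8'));      // 9 — 3 bytes per Thai character
console.log(Buffer.byteLength('กกก', 'utf16le'));   // 6 — 2 bytes per Thai character
```
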
<h3 id="heading-code-point-and-code-unit">Code point and Code unit</h3>
<p>Each character in Unicode has a unique numerical identifier. Such a unique identifier is called a code point.</p>
<p>Every code point is unique, regardless of the UTF you’re working with. Every code point is written in the following format: <code>U+XXXX</code>, where <code>XXXX</code> is a hexadecimal number. The range of unique code points goes from <code>U+0000</code> to <code>U+10FFFF</code>. For example, the code point for the character “A” is <code>U+0041</code>.</p>
<p>While a code point applies to all Unicode characters regardless of their UTF, a code unit is specific to a particular UTF. Depending on the encoding, a code point may be represented by one or more code units.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319832263/ebd924d7-9f17-4ce6-a837-3965da0bf917.jpeg" alt="Difference in code points for the same ก character between UTF-8 and UTF-16" class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721319847388/c4ba97e5-cbe0-406c-ad0f-065860c89162.jpeg" alt="Detailed breakdown of character ก code points in UTF-8 encoding" class="image--center mx-auto" /></p>
<h3 id="heading-unicode-planes">Unicode planes</h3>
<p>When Unicode was first introduced in the 1980s, its creators believed that 65,536 code points would be enough to encode all the world's popular writing systems.</p>
<p>This initial set of code points is now known as the Basic Multilingual Plane (BMP). The BMP contains characters you use every day, including Latin letters, common symbols, and characters from widely used non-Latin scripts.</p>
<p>However, as the standard progressed, it became clear that more space would be needed. In Unicode 2.0 (1996), supplementary planes were introduced, expanding from one multilingual plane to 17.</p>
<p>Each plane contains 65,536 code points, which extends the initial capacity to 1,114,112. This expansion was crucial for several reasons:</p>
<ul>
<li><p>Accommodating complex writing systems: Scripts like Han (used in Chinese, Japanese, and Korean) required far more characters than initially anticipated.</p>
</li>
<li><p>Future-proofing: The additional planes provided room for newly discovered historical scripts and potential future writing systems.</p>
</li>
<li><p>Special-purpose characters: Planes were allocated for technical symbols, emoji, and private-use characters.</p>
</li>
</ul>
<p>The introduction of supplementary planes marked a significant milestone in Unicode's development, transforming it from a limited character encoding system to a comprehensive standard capable of representing virtually all known writing systems.</p>
<h2 id="heading-utf-encoding-and-javascript">UTF encoding and JavaScript</h2>
<p>Now, it is time to look at encodings in the context of JavaScript.</p>
<p>JavaScript internally <a target="_blank" href="https://tc39.es/ecma262/#sec-source-text">uses</a> UTF-16 encoding for strings. I mean for <strong>any</strong> string, even if the string came from a file, network, or anywhere else. If the string somehow comes into the JavaScript world, be sure that it is always encoded using UTF-16.</p>
<p>It is just a specification requirement, and we can do little about it. The positive side is that things just get simpler. We work with one encoding and one encoding only.</p>
<p>If we decide to create a variable that represents an error message and the error message text is 20 characters long, you can be sure that the size it takes to store this string is precisely 40 bytes.</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// The variable string content occupies 40 bytes of memory</span>
<span class="hljs-keyword">const</span> errorMessages = <span class="hljs-string">'Something went wrong'</span>;
</code></pre>
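<p>You can verify this in Node.js by measuring the string in its UTF-16 representation (a quick check using Node's <code>Buffer</code> API):</p>

```javascript
const errorMessage = 'Something went wrong'; // 20 characters

// 2 bytes per character in UTF-16
console.log(Buffer.byteLength(errorMessage, 'utf16le')); // 40
```
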
<p>At the same time, there is a way to encode a string in JavaScript using less memory. It is possible to do so only using buffers.</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// "Sun" string encoded in the hexadecimal numeric system</span>
<span class="hljs-keyword">const</span> buffer = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Uint8Array</span>([<span class="hljs-number">0x53</span>, <span class="hljs-number">0x75</span>, <span class="hljs-number">0x6e</span>]);

<span class="hljs-comment">// The default decoding scheme is UTF-8</span>
<span class="hljs-keyword">const</span> decoder = <span class="hljs-keyword">new</span> TextDecoder();

<span class="hljs-built_in">console</span>.log(decoder.decode(buffer)); <span class="hljs-comment">// Prints "Sun"</span>
</code></pre>
<p>You don't need to understand the whole buffer workflow just yet. We'll talk about it in a future article.</p>
<p>The bytes we pass to the <code>decoder.decode()</code> function are interpreted as UTF-8, but the resulting string is UTF-16 encoded again the moment it enters the JavaScript world. It happens because of how the API and the whole buffer machinery work.</p>
<p>The interesting thing is if we mismatch the type of encoding and decoding schemes, we'll get completely unexpected results.</p>
<div class="embed-wrapper"><a class="embed-card" href="https://x.com/pavl_ro/status/1813610479014789455">https://x.com/pavl_ro/status/1813610479014789455</a></div>
<p>The data we save in the buffer is the same, and the type of buffer is the same, but the decoding scheme is different. Because of that, we're getting a completely unexpected result.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>ASCII and the UTFs are encoding schemes that allow text to be shared across different machines and the Internet without losing any information.</p>
<p>With Unicode, we can encode up to 1,114,112 characters, which is more than enough for the foreseeable future. The Unicode standard consists of multiple parts, such as code points, code units, planes, etc.</p>
<p>Unicode's transformation formats (UTFs) provide the ability to encode the same exact text using different schemes.</p>
<p>Internally, JavaScript uses UTF-16 for all strings. However, it doesn't mean we can't use different encodings to store strings in a format that we want.</p>
<p>You have to be mindful when working with different encodings because using mismatching encoding and decoding schemes can lead to unexpected results.</p>
]]></content:encoded></item><item><title><![CDATA[From Binary to Code: Why JavaScript Devs Need to Know Bits and Bytes]]></title><description><![CDATA[Understanding what bit and byte are might sound more like a computer science topic that is mostly irrelevant to JavaScript developers. However, this is not true. It is a practical skill that not only makes you a better software developer, but enables...]]></description><link>https://pavel-romanov.com/from-binary-to-code-why-javascript-devs-need-to-know-bits-and-bytes</link><guid isPermaLink="true">https://pavel-romanov.com/from-binary-to-code-why-javascript-devs-need-to-know-bits-and-bytes</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Tue, 02 Jul 2024 16:37:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1719936945326/17b25a8b-5243-4f16-a155-f81ef30eb5c8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Understanding what bit and byte are might sound more like a computer science topic that is mostly irrelevant to JavaScript developers. However, this is not true. It is a practical skill that not only makes you a better software developer, but enables you to solve a real-world problems in JavaScript.</p>
<p>This article will explain the basics of bits and bytes and how computers use them. After that, you'll see why it is important to understand this topic and how to apply this knowledge in JavaScript.</p>
<h2 id="heading-bit">Bit</h2>
<p>In the previous article, we learned about <a target="_blank" href="https://pavel-romanov.com/numeric-systems-in-javascript-from-fundamentals-to-application">different numeric systems</a>, how they can be used, and why we need them. One of the numeric systems was the binary numeric system.</p>
<p>The binary numeric system has only two numbers: 0 and 1. At its most fundamental level, everything on your computer is just 0s and 1s. Only two numbers are enough to build any software you can imagine.</p>
<p>This is possible because a single 0 or 1 is the smallest, most fundamental unit of information, also known as a bit.</p>
<p>For example, with only one bit, you can represent the following information:</p>
<ul>
<li><p>To conclude whether some statement is true or not. It is either true or false.</p>
</li>
<li><p>To answer if somebody asks you for a donation. It is either yes or no.</p>
</li>
<li><p>To sign an agreement. You either sign it, agreeing with it, or you don't sign.</p>
</li>
</ul>
<p>All the above examples have only two possible outcomes, which are 0 or 1. True or false.</p>
<p>One of the interesting aspects of a bit is that the meaning of a particular bit is completely contextual. The same bit in different contexts means completely different things.</p>
<p>Let's continue with the signature example. Writing your signature on a blank piece of paper has little value because there is no context to it. If you leave your signature under the employment contract, it means that you accept all contract conditions. Leaving the same signature under a bank check makes this check valid for cashing.</p>
<p>You see how the information is the same in all three cases. However, the interpretation of this information is completely different and solely depends on the context.</p>
<p>Having said that, you can convey only a tiny amount of information using one bit. If you want to know whether each of your three friends donated to a charity, you can use 3 bits for it:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Name</td><td>Action</td><td>Result in binary</td></tr>
</thead>
<tbody>
<tr>
<td>Tom</td><td>Donated</td><td>1</td></tr>
<tr>
<td>Julia</td><td>Donated</td><td>1</td></tr>
<tr>
<td>Marry</td><td>Didn't donate</td><td>0</td></tr>
</tbody>
</table>
</div><p>Notice, there are no complex decision trees or anything like that. It is a simple true or false answer.</p>
<h2 id="heading-byte">Byte</h2>
<p>The byte is a group of 8 bits. It was introduced to deliver more information in a standardized way. Byte simplifies the abstraction in the same way as plain numbers do. Imagine if you have one million pens. What a nice word, a million. It wouldn't be that easy to operate with it if you were to represent the number as one thousand thousand, right?</p>
<p>A byte has a range of possible values, starting from 00000000 to 11111111 or 256 decimal values ranging from 0 to 255.</p>
<p>A single byte is enough to store most English letters. We can introduce an additional byte when it is not enough, like in the case of the Chinese language. With two bytes, you have 65,536 possible options to encode a character. We'll talk more about encoding in the next article.</p>
<p>Another common use case of a byte is color, particularly the RGB (red, green, blue) color model. Each channel in the RGB model is represented by a single byte, or 3 bytes for the final color. Just 3 bytes make it possible to have 16,777,216 different colors.</p>
<p>When talking about computers, there are displays with different color models; some of them are RGB displays. Every pixel in such displays takes the exact 3 bytes for an RGB color.</p>
<h2 id="heading-bytes-and-memory">Bytes and memory</h2>
<p>Now we know that bit is the smallest unit of information possible, and a single byte is a group of 8 bits. One of the practical applications of this knowledge is related to files and file systems on a computer.</p>
<p>A text file? Depending on the encoding, each letter could take one, two, or three bytes.</p>
<p>An image? Images are composed of pixels, and every pixel is basically a set of bytes. The exact number of bytes required to store a single pixel varies heavily depending on the color model.</p>
<p>Every byte of a text file, image, video, audio, etc. is directly stored on your computer. When you download a file, it shows you its size so you can understand how much space it takes on your computer.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719937558691/cb656011-782d-4e8a-89b6-ffbb77a2eb17.jpeg" alt="Representation of how text and image files are using bytes to store information" class="image--center mx-auto" /></p>
<p>It is the fundamental concept, and <strong>every</strong> operating system works in the same way. What is different, though, is the order in which bytes are stored. The concept describing the byte order is called <strong>endianness</strong>.</p>
<p>There are two types of endianness: big-endian and little-endian. Big-endian stores the most significant byte first. The most significant byte is the one that plays the most significant role in a piece of data.</p>
<p>For example, an ISO date (2060-10-22) follows the big-endian order because the year, the most significant part of the date, comes first, followed by the month, and only then the day of the month, the least significant number.</p>
<p>On the other hand, if you take a look at the standard European way of writing dates (24 July 2060), you'll see that it follows the little-endian order. The least significant number, the day of the month, comes first, followed by the month, and only then goes the year.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719937660850/af6ffec7-4b34-4171-a860-2ab35f6488da.jpeg" alt="Difference between big endian and little endian on different types of dates" class="image--center mx-auto" /></p>
<h2 id="heading-why-choose-hexadecimal-over-binary">Why choose hexadecimal over binary</h2>
<p>Here is how an encoded text looks in the binary numeric system.</p>
<pre><code class="lang-plaintext">01001000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100 00100001
</code></pre>
<p>The code above uses ASCII/UTF-8 encoding. When you decode the binary to a string, you'll see "Hello world!" But what if we have more than two words? The binary gets massive fast. To address this problem, you can use a hexadecimal numeric system instead.</p>
<p>Here is how the same text is encoded in hexadecimal:</p>
<pre><code class="lang-plaintext">48 65 6c 6c 6f 20 77 6f 72 6c 64 21
</code></pre>
<p>It is 4 times shorter, but the meaning is the same.</p>
<p>The other nice thing about binary-to-hexadecimal conversion is that every byte can be represented by only two hexadecimal numbers. The smallest possible value of a byte 00000000 is written in hexadecimal as 00, and the highest possible value of 11111111 is just FF.</p>
<p>This short alternative to binary is the exact reason why hexadecimal became so widely used.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1719937830159/76952f2b-d837-4a1a-b696-55697a5e2a79.jpeg" alt class="image--center mx-auto" /></p>
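<p>JavaScript can do this binary-to-hexadecimal conversion for you. A quick sketch using number-base conversion:</p>

```javascript
// One byte always maps to exactly two hex digits, and back
const asBinary = (0x48).toString(2).padStart(8, '0');
console.log(asBinary); // '01001000' — the ASCII byte for 'H'

const asHex = parseInt('01001000', 2).toString(16);
console.log(asHex); // '48'
```
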
<h2 id="heading-why-knowing-bits-and-bytes-is-useful-for-a-javascript-developer">Why knowing bits and bytes is useful for a JavaScript developer</h2>
<p>It is all nice and interesting, but how do you actually apply this knowledge as a JavaScript developer in your work?</p>
<h3 id="heading-bitwise-operations">Bitwise operations</h3>
<p>Bitwise operators are common when it comes to working with:</p>
<ul>
<li><p>Cryptography</p>
</li>
<li><p>Different kinds of 2D and 3D graphics</p>
</li>
<li><p>High-performance operations when dealing with large datasets</p>
</li>
</ul>
<p>They're meant for specific tasks, and you won't use them that often. At the same time, sometimes implementing a feature without using bitwise operators is hardly possible.</p>
<h3 id="heading-understanding-memory-consumption-by-different-language-structures">Understanding memory consumption by different language structures</h3>
<p>Each structure in JavaScript language takes up a specific size in memory. Understanding the bit and byte concept allows you to write memory-efficient programs.</p>
<p>The actual size of any given structure depends on specific JavaScript engine implementations. We'll look at how V8, the most popular engine, allocates memory for some of the commonly used JavaScript structures.</p>
<p><strong>Strings</strong>. Every character in JavaScript is <a target="_blank" href="https://tc39.es/ecma262/multipage/ecmascript-data-types-and-values.html#sec-ecmascript-language-types-string-type">encoded in UTF-16</a>. It means that 2 bytes are required to store a single character. For example, the "hello" string consists of 5 characters and takes 10 bytes or 80 bits of memory.</p>
<p><strong>Numbers</strong>. The allocated memory depends on what <a target="_blank" href="https://web.dev/articles/speed-v8#numbers">type of number</a> you're dealing with. It's either a 31-bit signed integer or a 64-bit double-precision floating-point number.</p>
<p><strong>Booleans</strong>. For the sake of simplicity, a boolean takes 1 byte.</p>
<h3 id="heading-working-with-files-blobs-and-buffers">Working with files, blobs, and buffers</h3>
<p>Any file on your computer is just a set of bytes. Dealing with a file is the equivalent of dealing with a set of bytes.</p>
<p>It is one of the most common operations in JavaScript. We build applications where users can upload files, download files, and sometimes change files. Change their name and their encoding format, and compress or decompress them.</p>
<p>We usually use blobs and buffers for operations on files. Performing efficient operations over files requires a proper understanding of bits and bytes.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>A bit is the building block of all digital information. It is the smallest unit of information possible, represented by only two numbers, 0 and 1.</p>
<p>A byte is an abstraction created to work with a large number of bits. A single byte is a set of 8 bits.</p>
<p>For JavaScript developers, understanding bits and bytes isn't just theory. It has real, practical uses:</p>
<ul>
<li><p>Bitwise operations for tasks like cryptography</p>
</li>
<li><p>Optimizing how much memory your app uses</p>
</li>
<li><p>Working with files, blobs, and buffers efficiently</p>
</li>
</ul>
<p>In our next article, we'll explore different encodings, showing how bits and bytes come together in Unicode, the most widely used encoding scheme.</p>
]]></content:encoded></item><item><title><![CDATA[Numeric Systems in JavaScript: From Fundamentals to Application]]></title><description><![CDATA[Have you ever stumbled upon cryptic codes like 0x2A or 0b101010 while working with JavaScript? These aren't typos or glitches but alternative ways to represent numeric values. In other words, those are different numeric systems.
Why bother with diffe...]]></description><link>https://pavel-romanov.com/numeric-systems-in-javascript-from-fundamentals-to-application</link><guid isPermaLink="true">https://pavel-romanov.com/numeric-systems-in-javascript-from-fundamentals-to-application</guid><category><![CDATA[Node.js]]></category><category><![CDATA[JavaScript]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 23 Jun 2024 14:59:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1719154640320/1f2d74d2-2662-407b-aff4-0f50383c3332.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever stumbled upon cryptic codes like 0x2A or 0b101010 while working with JavaScript? These aren't typos or glitches but alternative ways to represent numeric values. In other words, those are different numeric systems.</p>
<p>Why bother with different numeric systems? Different numeric systems help us operate more effectively in areas like cryptography, 3D development, and binary manipulations.</p>
<p>This article dives into the details of these numeric systems, explaining their purpose, usage, and how they can be used in JavaScript.</p>
<h2 id="heading-concept-of-an-abstract-number">Concept of an abstract number</h2>
<p>Before discussing the details of each numeric system, let’s first focus on what numbers are and how we use them.</p>
<p>Imagine that you see three oranges.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXeA2cx-p0_SWDBCaCLuKAxy9AmaAqYDejIQvnrNKkumFesCDh_ORNIjevMTTuaiyCM-u_5DjPgolI_yWS_dnzxt5EBLgLil_peW5ylHnjf5xQ3ng0N21rXMjZPMPSHe_LdmgqWdxyNXptk-E12Gi5fuz1k?key=8-ZTSBwfivwgjBuGnbDVdw" alt class="image--center mx-auto" /></p>
<p>If someone asks you, “How many oranges are there?” Your answer is obvious: three.</p>
<p>Imagine a different scenario. You’re living in Spain and can only speak Spanish. If someone asks you, “How many oranges are there?” in Spanish, you’ll answer: tres.</p>
<p>The abstract concept of 'the number of oranges' remains consistent across languages, as the meaning is the same in both English and Spanish. However, this abstraction is represented differently in each language.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdkW7nGLHV2a9icV6ZORqfJdMyhZIdhgrQhCzWIikfLPGBBJlb0OWSrmNyzpYOrkV4l7lWOozBDWyOGIaMQqvSHoMH9YOnCEXSmYyEHQQzsUYiXx3-KPTpOwAIPytWpAll66iiIMeOxDe5US81Ij8NyaA?key=8-ZTSBwfivwgjBuGnbDVdw" alt class="image--center mx-auto" /></p>
<p>Different languages use different names for the same numbers. But at least we have the numbers that are common across all languages and always mean the exact same thing, right? Not really.</p>
<p>In addition to different languages, we have different numeric systems. A simple number 3 in a decimal system is represented differently in a binary system. In binary, it is 0011. Similar to English and Spanish, both 3 and 0011 represent the same abstract concept.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXejSkhxgNBk9EmWI3rlvpWKT430SZy1GVgyAHYPmQlzgtqj2PNvz_Jjsay5n0v8Y5HAQy8fUwjgqg1XPF183iDAiE29r_ZcCMyCgWgoRh94zyQwb8pkTGU3zGarfVuI8Q0U_8Pqzjg7H7JkmBm0FScO4sk?key=8-ZTSBwfivwgjBuGnbDVdw" alt /></p>
<p>People use different numeric systems for the same reason they use different words for the same number in different languages: in certain cases, one is simply more convenient to work with than another.</p>
<p>Notice that the previous two pictures don’t capture the essence of the abstract number behind them. A more appropriate representation of an abstract number would be something like this.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXeaBc2rjEf6t2OBh4kUEMUqdlcBD143InGYeUpch6DzLg6hAknVgeenmc0_a-9CNu68GSaQimk0XSlvKuB316JpmIr0FnDHd_-g1VIqfas62PKlnq3zDtxxXMwhXuji7am9xsY0PoJKqnaJog8lVg5gsqU?key=8-ZTSBwfivwgjBuGnbDVdw" alt /></p>
<h2 id="heading-numeric-systems">Numeric systems</h2>
<p>Numeric systems are like different human languages. We use them in certain situations to understand and to be understood.</p>
<p>For example, at the lowest hardware level, computers talk in binary. Using decimal just won’t get you anywhere, in the same way that English won’t help you communicate with a remote Amazonian tribe.</p>
<p>The good part? Numeric systems are not as complex as languages. It is quite easy to go from one numeric system to another, even using only pen and paper. With the power of a computer, you become a fluent speaker of different numeric systems.</p>
<p>There are a lot of different numeric systems, but the most popular are the following four:</p>
<ul>
<li><p>Binary</p>
</li>
<li><p>Octal</p>
</li>
<li><p>Decimal</p>
</li>
<li><p>Hexadecimal</p>
</li>
</ul>
<p>Likely, you won’t need to use any other numeric systems at all. That’s how popular these four systems are.</p>
<p>Notice, unlike human languages, which are usually tightly bound to a particular place where some group of people lives, numeric systems are all about the “base.” The base is simply how many digits a particular system has.</p>
<p>For example, the binary system has a base of 2 because it uses only two digits: 0 and 1. In decimal, we have ten digits, from 0 to 9. The higher the base, the more digits there are.</p>
<p>Another interesting thing to mention is that a system doesn’t contain a digit equal to or greater than its base. In binary, we don’t have 2, and octal doesn’t include 8.</p>
<p>Next, we’ll look at each of the four numeric systems in more detail.</p>
<h3 id="heading-binary-system">Binary system</h3>
<p>The binary numeric system is the lingua franca of computer hardware. It uses only two digits, 0 and 1, and surprisingly, that is enough to do the job.</p>
<p>Here is an example of how you count to 5 in binary:</p>
<p>0 0 0 1 - one</p>
<p>0 0 1 0 - two</p>
<p>0 0 1 1 - three</p>
<p>0 1 0 0 - four</p>
<p>0 1 0 1 - five</p>
<p>You might notice the strange zeros that go before the actual number. They are called leading zeros. We use them just to make numbers more readable. Putting any number of zeros at the start of the number does not change the actual value.</p>
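<p>You can reproduce this counting yourself with JavaScript’s <code>toString</code>, which accepts a radix. A small sketch (the <code>padStart</code> call only adds the leading zeros for readability):</p>
<pre><code class="lang-javascript">[1, 2, 3, 4, 5].forEach(function (n) {
  // toString(2) gives the binary form; padStart adds the leading zeros
  console.log(n.toString(2).padStart(4, '0'));
});

// Leading zeros change nothing about the value
console.log(parseInt('0101', 2) === parseInt('101', 2)); // true
</code></pre>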
<h3 id="heading-octal-system">Octal system</h3>
<p>The octal system was used in the past by early computers such as the PDP-8. Although there are not many use cases for it at this point, that doesn’t mean there are none. For example, the Unix file permissions mechanism is based on the octal numeric system.</p>
<p>Here is how you count from 6 to 10 using the octal system:</p>
<p>6 - six</p>
<p>7 - seven</p>
<p>10 - eight</p>
<p>11 - nine</p>
<p>12 - ten</p>
<p>Unlike binary, where we go to 10 straight after 1, it happens after 7 in octal. Remember, a numeric system doesn’t contain a digit equal to or greater than its base. That’s why the octal system doesn’t include the digit 8.</p>
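<p>The Unix file permissions mentioned above are easy to see in JavaScript, which supports octal literals. A small sketch, using the classic “rwxr-xr-x” permission:</p>
<pre><code class="lang-javascript">const mode = 0o755; // a typical Unix file permission, written as an octal literal

console.log(mode);             // 493, the same value in decimal
console.log(mode.toString(8)); // '755', back to octal notation
</code></pre>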
<h3 id="heading-decimal-system">Decimal system</h3>
<p>The decimal numeric system is the one we all use daily, and it is the most widely used system in human-to-human interaction. Unlike the other popular numeric systems, it didn’t emerge from human-to-computer interaction but from human-to-human interaction.</p>
<p>It is not the only system of its kind, though. Before it, people used systems like Roman numerals or the sexagesimal (base-60) system of the ancient Sumerians, traces of which we still use today (remember how many seconds are in a minute).</p>
<p>As the name suggests, it operates with ten numbers starting from 0 and up to 9.</p>
<h3 id="heading-hexadecimal-system">Hexadecimal system</h3>
<p>The hexadecimal numeric system (hex) is widely used in software engineering. Here are just a few examples of where you can encounter it:</p>
<ul>
<li><p>CSS colors. One of the most popular color notations in CSS is the hex notation. As an example, #FFFFFF represents white, and #000000 represents black.</p>
</li>
<li><p>Memory addresses.</p>
</li>
<li><p>UUID.</p>
</li>
<li><p>MAC address.</p>
</li>
<li><p>And, of course, error codes.</p>
</li>
</ul>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfIyWXY2mMyL0s8D9fzGRHhEWZu4nhh5iZmzwq37GccsH576D2HOPBrzYCaaY4T8iEi5AC9lYBfWI9FFtHdNjDpHN5WP5byah6vtBsOwQBqIhxvrvH4THiU8l9i_Uf3_ZOw6jNbePs70a2XYJw6cPtx1XI?key=8-ZTSBwfivwgjBuGnbDVdw" alt class="image--center mx-auto" /></p>
<p>The hex numeric system includes 16 digits from 0 to F. Here is how you count from 9 to 13 in hex:</p>
<p>9 - nine</p>
<p>A - ten</p>
<p>B - eleven</p>
<p>C - twelve</p>
<p>D - thirteen</p>
<p>Notice that in hex, every value from 0 to 15 has its own single digit, unlike decimal, where ten is already written with two digits as 10.</p>
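<p>JavaScript’s conversion helpers make the hex counting above easy to verify. A small sketch:</p>
<pre><code class="lang-javascript">console.log(parseInt('A', 16));  // 10
console.log(parseInt('D', 16));  // 13
console.log(parseInt('FF', 16)); // 255

console.log((255).toString(16)); // 'ff'
</code></pre>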
<h3 id="heading-working-with-different-numeric-systems-in-javascript">Working with different numeric systems in JavaScript</h3>
<p>After getting familiar with the most popular numeric systems, the next question is, “How do we use them in JavaScript?”</p>
<p>JavaScript provides specific prefixes to make the interpreter understand what numeric system we want to use. Let’s look at how we can write the number 13 in different systems using JavaScript:</p>
<ul>
<li><p>Binary: 0b1101</p>
</li>
<li><p>Octal: 0o15</p>
</li>
<li><p>Decimal: 13</p>
</li>
<li><p>Hexadecimal: 0xD</p>
</li>
</ul>
<p>Notice that for every numeric system except decimal, we add a prefix that starts with 0 and is followed by a letter identifying the system: b for binary, o for octal, and x for hexadecimal. The leading 0 tells the interpreter to treat the value as a number literal; without it, something like b1101 would be parsed as a variable name.</p>
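<p>All of these prefixed literals produce an ordinary number, so they can be mixed freely. A quick check that each one really is 13:</p>
<pre><code class="lang-javascript">console.log(0b1101 === 13); // true
console.log(0o15 === 13);   // true
console.log(0xD === 13);    // true

// They are all plain numbers, so arithmetic just works
console.log(0b1101 + 0o15 + 0xD); // 39
</code></pre>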
<p>If you want to go beyond these four, JavaScript supports bases from 2 up to 36. Be aware that there are no literals for non-standard bases (anything other than binary, octal, decimal, or hexadecimal), so such numbers exist only as strings, and you cannot do math on them directly. The only way to perform math operations on them is to convert to decimal and back. One way to convert a base-36 number to decimal is the <code>parseInt</code> function.</p>
<pre><code class="lang-javascript"><span class="hljs-built_in">parseInt</span>('ZAE', <span class="hljs-number">36</span>); <span class="hljs-comment">// 45734</span>
</code></pre>
<p>Here is an example of implementing functions to sum two base-36 numbers.</p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">base36ToDecimal</span>(<span class="hljs-params">base36Number</span>) </span>{
  <span class="hljs-keyword">return</span> <span class="hljs-built_in">parseInt</span>(base36Number, <span class="hljs-number">36</span>);
}

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">decimalToBase36</span>(<span class="hljs-params">decimalNumber</span>) </span>{
  <span class="hljs-keyword">return</span> decimalNumber.toString(<span class="hljs-number">36</span>).toUpperCase(); <span class="hljs-comment">// Ensure uppercase</span>
}

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">sumBase36</span>(<span class="hljs-params">num1, num2</span>) </span>{
  <span class="hljs-keyword">const</span> decimalSum = base36ToDecimal(num1) + base36ToDecimal(num2);
  <span class="hljs-keyword">return</span> decimalToBase36(decimalSum);
}
</code></pre>
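<p>Using these functions looks like this (the input values are just illustrative):</p>
<pre><code class="lang-javascript">function base36ToDecimal(base36Number) {
  return parseInt(base36Number, 36);
}

function decimalToBase36(decimalNumber) {
  return decimalNumber.toString(36).toUpperCase();
}

function sumBase36(num1, num2) {
  return decimalToBase36(base36ToDecimal(num1) + base36ToDecimal(num2));
}

console.log(sumBase36('Z', '1')); // '10', because 35 + 1 = 36, which is 10 in base 36
</code></pre>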
<h2 id="heading-conclusion">Conclusion</h2>
<p>Understanding different numeric systems is a valuable skill for any developer. It goes beyond how computers process numbers; it's about how we, as developers, can communicate and work with data more effectively.</p>
<p>Knowledge of different numeric systems is fundamental and opens a number of opportunities, such as handling specialized data formats, working with low-level programming, buffers, etc.</p>
<p>Next, we'll take a closer look at the building blocks of digital data—bits and bytes—and how you can directly manipulate them using JavaScript.</p>
]]></content:encoded></item><item><title><![CDATA[Optimizing Node.js: Identifying and Fixing Performance Problems with Clinic]]></title><description><![CDATA[Effective tooling is born from a clear intention. If we’re talking about profiling in Node.js, Clinic was intentionally designed to profile Node.js apps.
The experience is completely different from DevTools. While the developer tools provide a wealt...]]></description><link>https://pavel-romanov.com/optimizing-nodejs-identifying-and-fixing-performance-problems-with-clinic</link><guid isPermaLink="true">https://pavel-romanov.com/optimizing-nodejs-identifying-and-fixing-performance-problems-with-clinic</guid><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 16 Jun 2024 14:48:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718549259955/2d3e3398-16df-4062-a9ef-2e231d2133cf.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Effective tooling is born from a clear intention. If we’re talking about profiling in Node.js, <a target="_blank" href="https://clinicjs.org/">Clinic</a> was intentionally designed to profile Node.js apps.</p>
<p>The experience is completely different from <a target="_blank" href="https://pavel-romanov.com/how-to-profile-nodejs-apps-using-chrome-devtools">DevTools</a>. While the developer tools provide a wealth of information, it is not always presented in a comprehensible way, and it is not always clear where to start.</p>
<p>In this article, you’ll learn how to use Clinic to profile Node.js applications, focusing on three common problems during development: high CPU consumption, memory leaks, and unoptimized asynchronous operations.</p>
<h2 id="heading-setup">Setup</h2>
<p>To see profiling in action, we need some code. For this purpose, I created a <a target="_blank" href="https://github.com/pavel-romanov8/nodejs-profiling-examples">GitHub repository</a> with all the basic scenarios we run into during day-to-day development.</p>
<p>The repository contains an application that starts an HTTP server with three routes. Each route has one specific problem.</p>
<p>In this particular setup, those problems are:</p>
<ul>
<li><p>CPU-intensive task, which blocks the main thread.</p>
</li>
<li><p>Asynchronous operation with a waterfall problem (the execution goes one by one instead of parallel).</p>
</li>
<li><p>Memory leak.</p>
</li>
</ul>
<p>Each route has two implementations. One contains a problem that we should be able to spot with the DevTools, and the other is an optimized version with the same functionality.</p>
<h2 id="heading-profiling">Profiling</h2>
<p>Clinic is a dedicated profiling tool for Node.js, designed with a single goal in mind: to provide the best developer experience (DX) possible when profiling Node.js applications.</p>
<p>To start using Clinic, follow the <a target="_blank" href="https://clinicjs.org/documentation/">getting started guide</a>. Install it locally in your project with:</p>
<pre><code class="lang-bash">npm install clinic
</code></pre>
<p>Or, if you prefer to use Clinic from the CLI, install it globally:</p>
<pre><code class="lang-bash">npm install -g clinic
</code></pre>
<p>Clinic offers 4 profiling tools:</p>
<ul>
<li><p><a target="_blank" href="https://clinicjs.org/doctor/"><strong>Doctor</strong></a>: Provides a high-level overview of the application and its processes.</p>
</li>
<li><p><a target="_blank" href="https://clinicjs.org/bubbleprof/"><strong>Bubbleprof</strong></a>: Troubleshoots asynchronous issues.</p>
</li>
<li><p><a target="_blank" href="https://clinicjs.org/flame/"><strong>Flame</strong></a>: Visualizes CPU usage with flame graphs.</p>
</li>
<li><p><a target="_blank" href="https://clinicjs.org/heapprofiler/"><strong>Heap Profiler</strong></a>: Identifies memory leaks.</p>
</li>
</ul>
<p>We’ll use all four to discover three different types of problems.</p>
<h3 id="heading-cpu-intensive-endpoint">CPU-intensive endpoint</h3>
<p>We start with the CPU-intensive endpoint. Here are the implementations:</p>
<p><em>Solution with high CPU consumption</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciRecursive</span>(<span class="hljs-params">n</span>) </span>{
   <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
     <span class="hljs-keyword">return</span> n;
   }
   <span class="hljs-keyword">return</span> fibonacciRecursive(n - <span class="hljs-number">1</span>) + fibonacciRecursive(n - <span class="hljs-number">2</span>);
 }

 fibonacciRecursive(<span class="hljs-number">45</span>);
 cb();
}
</code></pre>
<p><em>Solution with low CPU consumption</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciIterative</span>(<span class="hljs-params">n</span>) </span>{
   <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
     <span class="hljs-keyword">return</span> n;
   }

   <span class="hljs-keyword">let</span> prev = <span class="hljs-number">0</span>, curr = <span class="hljs-number">1</span>;

   <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">2</span>; i &lt;= n; i++) {
     <span class="hljs-keyword">const</span> next = prev + curr;
     prev = curr;
     curr = next;
   }

   <span class="hljs-keyword">return</span> curr;
 }

 fibonacciIterative(<span class="hljs-number">45</span>)
 cb();
}
</code></pre>
<p>Both versions calculate the 45th Fibonacci number. The first implementation uses a recursive, CPU-intensive approach, while the second one employs an iterative, more efficient approach.</p>
<p>The best way to start a profiling session with Clinic is to use Doctor. Doctor provides an overview of the application and its processes and can redirect you to more specific tools like Bubbleprof if needed.</p>
<p>Here’s how to use Doctor for profiling:</p>
<pre><code class="lang-bash">clinic doctor -- node app.js
</code></pre>
<p>This command starts a Node.js application from the specified file (in this case, app.js). It behaves like a regular application with one exception: Doctor monitors the application and related resources.</p>
<p>You’ll see a similar message in the console:</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdwyKteRODiYsR1bce29kPaLLaBqizENCRxajsJV4JFl50oSG_Pn1lTrO_9Jr_xefskzlcrqQKCmYkezseMW4DNx1KkISC9Iv1WgAicVvdXdjOvZLw-5ccOxGR6lMX15dpGu5_kXz99wtquQdc-wcfY3RE?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>Notice that in order to generate the final profiling report, you have to terminate the running process.</p>
<p>After calling the CPU-intensive endpoint, terminate the process. There will be a dashboard similar to this one.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXd9q048tGhuv4Kbef8z6GwOb2KtF9TGTRTXJyDwle1lAKjnM0buRMso5mWqdIisXOkCvoB7hAS3TDM2yitGCeHJ6tnJp-hWTBMjUCCcLFHPCtPAruOCIr9PJ3XUEzeY8dVVHGERsOL8yq6BNkEMifa5EmI?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>It contains an alert at the very top of the page.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXc5PC1OfYh9EaSFwMXtxTYE6Jt5K8TrINae2_mLntpmnT41Wk2g9CEYA-1Mn_9h_-FHIco5N8ODs-x-eCfgV6dONfOJwZXHwNFPiWrZY5UE8kyNzh_AYqTqfv6qL0jwd0Jb1yEdAUqix0tubAIGtI-EAro?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>This message clearly shows that we have problems with the CPU and event loop. Indeed, the charts related to these two resources indicate exactly the same thing.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXe6CSKoSA2Wj7xhERGMi6ZarIEFw-R4u3yWLPis9vKKSkeTnQZUCk4k6nSDg_j-qtT4o5HgfP0l_mOqXltWp2INYFHg6wKTNOIbGGeD-CGw5kNssq45auMOKmDvyL0q-dmvCLKOYsqT2J4bce49inqPX24?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>At the very bottom, there is a recommendations section. You can click on it to see detailed, human-readable recommendations on what this problem is about and what you can do next.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfuyciwvtgH__cmGVQxSIk7_vE5_YvAb6RgiIxCzQPMbol5U-KCoqUkokAcYHa1jW1dzOfBDSSIOvijkRiYvSeV3ldHW-KMiUX3HB_6sNHnqHLjaDRK6H88A7-YiZbiPcW88tcOfN-_P-CK0CNEpnF5jA?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>Doctor says there are some CPU-related problems. It also redirects you to the Flame or Bubbleprof tools for further analysis. Let’s do as Doctor suggests and use Flame.</p>
<pre><code class="lang-bash">clinic flame -- node app.js
</code></pre>
<p>Sending a request to the same CPU endpoint and terminating the process generates the flame graph.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXcr5Mb7Hx_-ikXPdSY_EvUzKgtNgoOw_mIHpq1o_nVnrn1A8_5mH_KBi0TPYckNdGY-TAfV_2K3eacXo9h-lwENKZMYK3BbixbNLpMczAotft_z7C-10hzvcch-lVxxETJK2jnYrwyvifnJ3pT_bS3MjZI?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>From this graph, it is evident that <code>fibonacciRecursive</code> takes the most time.</p>
<p>After changing the endpoint implementation to a more effective one, we run the same Flame tool again. The results are drastically different.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdo8B_pm8_5zOEi7v6YL9cSZXhL4dDLL-Wocrb5CJwTxojp-HGqPUJwT-OLmYriNRBdw3wQwxZfe2L7AGV1a07lhl1RLU9LAsdu8C_dAtMGiWlfzDjXjfx14a9UTXjjPO__CDc7T4gQZgJtNmqmjcvANQ?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>Let’s see what Doctor says.</p>
<p>The picture is not even close in terms of CPU usage and event loop delay.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXePeQ5BW14MhNZ3nvbopTKXqPmi8e3d766wtz2BgoiGmeA7SyUmhw_qehOhouxCu15Jr0wYEM-J1RcGlJ62SK-1BZQHVo5jUoe7cRT2cIPM5HCxlEpVs_JJuNh7LoNQa33zS6cPKJ4IeJ6NZ26WDLCkdg?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<h3 id="heading-async-endpoint">Async endpoint</h3>
<p>Next, let's look at the async endpoint. Here are both implementations of the endpoint:</p>
<p><em>Solution with async operations waterfall</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateAsyncOperation</span>(<span class="hljs-params"></span>) </span>{
 <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> {
   <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {

     <span class="hljs-comment">// Simulate heavy async operation</span>
     <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">50000000</span>; i++) { }
     resolve();
   }, <span class="hljs-number">1000</span>);
 });
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 cb();
}
</code></pre>
<p><em>Solution without async operations waterfall</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateAsyncOperation</span>(<span class="hljs-params"></span>) </span>{
 <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> {
   <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {

     <span class="hljs-comment">// Simulate heavy async operation</span>
     <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">50000000</span>; i++) { }
     resolve();
   }, <span class="hljs-number">1000</span>);
 });
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> <span class="hljs-built_in">Promise</span>.all(<span class="hljs-keyword">new</span> <span class="hljs-built_in">Array</span>(<span class="hljs-number">3</span>).fill()
   .map(<span class="hljs-function">() =&gt;</span> generateAsyncOperation()));
 cb();
}
</code></pre>
<p>We’ll start profiling the async endpoint with Doctor.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXe5xvgjVq2AD6zSxZqzq1A2dEis_4rR-LyxEmHiIlhrfstgnSpb8wnVzN38RI5g3fStiB1H5xuowhCGg40Yv3m72GaWstijLNRYTVFCLO5ukGQ5ehY7B1VVEkbUcd6HOnIQPjvv6xPFs9_PwybztuRMip8?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>Although the graphs look fine, an error message suggests further investigation. Let’s delve into the details.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdIWqvCyAGMgkk8v5V54n1-vjt4BwyCAGabzzc16cVdEyu5Yoy82O2eytLOCIxIamK60ofk2kuy7-tii3S4ETU3Hc-n7O0-AFXFdlQ0z9ToPlTpsYz-GMzY6jsuYulVE5N3AVdPGmCfbWJm74g-dLOxpko?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>Doctor recommends using Bubbleprof to troubleshoot async issues further. Let’s follow that advice.</p>
<pre><code class="lang-bash">clinic bubbleprof -- node app.js
</code></pre>
<p>Running the same async endpoint generates a report showing three consecutive operations. Each purple line represents an async operation, with additional details available for exploration.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdHjemJPXGW6vwkirLUp5_2OZyLEo6E_59U7F-ibVYmuHhSOD6OKNAqXxoZZShFQBXHStRzxwSssvVDst6q5ebIvo56ypbUa8w4V6nq8Wa2pVkHilFPV9aGRZAPLA7uEZo37ubITZHIcWxKpWnwF8sYt2g?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXeLhGOQ8wal80An7xCQXbckUykxtnouq1ZiemT48YbPJXoqDZg3twqIl8TKEFtUsARV0uEOrjUF5w_yodAZXsG_CLorQznJ5vo6MPxIGasLRwofegNkrh5MgPPC9ITyhmmKegfSV7XUe8bHhh-9mrt_lko?key=ZALWQZs3M4rmT9XBqEqxOw" alt /></p>
<p>This visualization clearly illustrates the waterfall problem in async operations. Now, let's switch to the optimized route handler. We'll replace <code>runAsyncTask</code> with <code>runSmartAsyncTask</code> and rerun the workflow.</p>
<p>The new graph looks much better, showing only a single call to the async operation. This is exactly the result we want.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfCbwQB-xGPVcrcEiOWrMtdlzpDLwj7OO9L6R1jILawEAU7qx8J7nUZM6OB-OmhcIsz8RV-PKcs6P3ySyl5Kx0wdkE3nGOgaHi_ykGyq5tK6qr6Puu7H_DgUVAQ6mYFYh6YRXvCJqGHBnUAfBjQNXBeb0o?key=ZALWQZs3M4rmT9XBqEqxOw" alt /></p>
<p>All three operations now start at the same time, so the total duration is roughly that of a single operation instead of three.</p>
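<p>The timing difference behind these two graphs is easy to reproduce in isolation. A minimal sketch, with 100 ms delays standing in for the heavier operations from the repository:</p>
<pre><code class="lang-javascript">function delay(ms) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms);
  });
}

async function waterfall() {
  // Sequential awaits: roughly 300 ms in total
  await delay(100);
  await delay(100);
  await delay(100);
}

async function parallel() {
  // The three timers overlap: roughly 100 ms in total
  await Promise.all([delay(100), delay(100), delay(100)]);
}
</code></pre>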
<h3 id="heading-memory-leak-endpoint">Memory leak endpoint</h3>
<p>Here's the code for both scenarios:</p>
<p><em>Solution with memory leak</em></p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> memoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Map</span>();

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };

   memoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }

 cb();
}
</code></pre>
<p><em>Solution without memory leak</em></p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> smartMemoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">WeakMap</span>();

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };

   smartMemoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }

 cb();
}
</code></pre>
<p>In the first function, <code>runMemoryLeakTask</code>, we use a <code>Map</code> to store objects, which prevents them from being garbage collected, causing a memory leak.</p>
<p>In the second function, <code>runSmartMemoryLeakTask</code>, we use a <code>WeakMap</code>, which allows garbage collection of the keys when they are no longer referenced elsewhere.</p>
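<p>The difference is easy to illustrate with a small sketch. A <code>Map</code> keeps a strong reference to every key, while a <code>WeakMap</code> does not:</p>
<pre><code class="lang-javascript">const strong = new Map();
const weak = new WeakMap();

let person = { name: 'Person number 1' };
strong.set(person, 'data');
weak.set(person, 'data');

// Drop the only external reference to the object
person = null;

// The Map itself still references the key, so the object can never be collected
console.log(strong.size); // 1

// The WeakMap entry no longer keeps the object alive: once the garbage
// collector runs, the entry simply disappears along with the object
</code></pre>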
<p>We need a different approach for memory profiling because Clinic Doctor doesn’t provide enough information in this case. Instead, we’ll use the heap profiler right away.</p>
<p>Here’s how to run the heap profiler:</p>
<pre><code class="lang-bash">clinic heapprofiler -- node app.js
</code></pre>
<p>After profiling the unoptimized memory leak endpoint, we see the following:</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdY3BRXCSAY3klPOLxnhgi9n4IYKhfaS52VBXrpJK05dHinyczVP3TutFHO4pGCosAknuiEzv2llrdmH5LxGwAut8ibO6Z5yHzojJiNHqcJeRy-FcW0BUu7mPN6Id6aDcrdTQswaMXVIFPymuGxCsNVNA?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>The function with a memory leak consumes more than half of the allocated memory. This is easy to spot due to the large space it occupies in the profiler.</p>
<p>Now, let’s run the optimized function and see the difference.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdL7eZqp5gZAEfdmeqxFGkDftAOVCji7V8GtMiYxr3IFOjceWhf2zvJXc4XjcGOa_dz_VVMRyTVfYPTbvpDyCkckTJFYquubbBKPmf46UeDVrqXPx-74XCWd1Ya5qChqwG8QxZNfLJNMHIthZmrHmUHV1o?key=ZALWQZs3M4rmT9XBqEqxOw" alt class="image--center mx-auto" /></p>
<p>The result is clear: the memory leak has been resolved, as the large memory-consuming function is no longer present.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Clinic is a powerful tool for profiling Node.js applications, offering clear and human-readable diagnostics through Doctor.</p>
<p>Its dedicated tools for async, I/O, and memory profiling make it invaluable for addressing various performance issues. By using Clinic, you can efficiently identify and resolve problems, ensuring your applications run smoothly and effectively.</p>
]]></content:encoded></item><item><title><![CDATA[How to Profile Node.js Apps Using Chrome DevTools]]></title><description><![CDATA[I was confused when I heard you can use Chrome DevTools to profile Node.js applications. Why are we using browser tooling for Node.js applications? But it makes perfect sense, and it is one of the best options to do profiling.
Why? Because both Chrom...]]></description><link>https://pavel-romanov.com/how-to-profile-nodejs-apps-using-chrome-devtools</link><guid isPermaLink="true">https://pavel-romanov.com/how-to-profile-nodejs-apps-using-chrome-devtools</guid><category><![CDATA[Node.js]]></category><category><![CDATA[devtools]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 09 Jun 2024 07:49:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718032488928/5d997c11-d0d1-4e4b-86c6-6676e932b7d4.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I was confused when I heard you can use Chrome DevTools to profile Node.js applications. Why are we using browser tooling for Node.js applications? But it makes perfect sense, and it is one of the best options to do profiling.</p>
<p>Why? Because both Chrome and Node.js run on V8, and the DevTools are designed to work with V8: its memory heap, performance metrics, and more. From this perspective, it’s a natural fit.</p>
<p>In this article, you’ll learn how to use DevTools to profile Node.js applications, focusing on three common problems during development: high CPU consumption, memory leaks, and unoptimized asynchronous operations.</p>
<h2 id="heading-setup">Setup</h2>
<p>To see profiling in action, we need some code. For this purpose, I created a <a target="_blank" href="https://github.com/pavel-romanov8/nodejs-profiling-examples">GitHub repository</a> with all the basic scenarios we run into during day-to-day development.</p>
<p>The repository contains an application that starts an HTTP server with three routes. Each route has one specific problem.</p>
<p>In this particular setup, those problems are:</p>
<ul>
<li><p>CPU-intensive task, which blocks the main thread.</p>
</li>
<li><p>Asynchronous operation with a waterfall problem (the execution goes one by one instead of parallel).</p>
</li>
<li><p>Memory leak.</p>
</li>
</ul>
<p>Each route has two implementations. One contains a problem that we should be able to spot with the DevTools, and the other is an optimized version with the same functionality.</p>
<h2 id="heading-profiling">Profiling</h2>
<p>To start using DevTools with Node.js, we must run our server using the Node.js <a target="_blank" href="https://nodejs.org/api/inspector.html#cpu-profiler">inspector</a>.</p>
<pre><code class="lang-bash">node --inspect app.js
</code></pre>
<p>In short, the <code>--inspect</code> flag exposes the V8 inspector protocol, allowing external tools such as DevTools to connect to the running process.</p>
<p>After that, type <code>chrome://inspect</code> in your browser's search bar. You should see the following page:</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXe22-io2kVh9IJ4eOL2T579u2UvGoqGpPOGxTPOdCHYR7lheckpqrgZ0BGi-9lphazNyVxLzIId3OYh8s7_x1xuKgA8GB4juuEdNNax0UwI8aRs7MAhtpUYF8T4AdulbnVA1vi-qB3qsQfcV_YfblWAQvY?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>Click the “Open dedicated DevTools for Node” button. You’ll see the DevTools connected to the node process you’re running.</p>
<p>To measure the performance of the connected Node.js app, you go to the “Performance” tab and click the “Record” button.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXcqYo_nym8vYGfUThGydBfeVnkwE66Qut_8IOwaY8K2m7zcJz6epplj5D6AGujUSImso51moKpSYfQdkfmE29SIMIkCUrRRJ0cWw-udx8G7rZ-F8t525GXBLaB3rXz7u4-J8uFht-SI_qxN4RY1Bze00ZM?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>After that, you’ll see the dialog indicating profiling status. You can stop it at any time by clicking the “Stop” button.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXd8mYEaKomWGWlLctSAz9sqXJSydz2LnFetsQ4MKIS1aXTWOVaValPsnnBklBTfEjCTBG8Wp6LSeDJotQQEPlLat1udm1b5Mp9RKumkU2b013xdVzIbjwpKRIsJd75gBxUJz-5aK6Uio3q5dznANk2wChI?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>Now, let’s see how it works with the prepared endpoints.</p>
<h3 id="heading-cpu-intensive-endpoint">CPU-intensive endpoint</h3>
<p>We start with the CPU-intensive endpoint. Here is what both implementations of the endpoint look like.</p>
<p><em>Solution with high CPU consumption.</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciRecursive</span>(<span class="hljs-params">n</span>) </span>{
   <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
     <span class="hljs-keyword">return</span> n;
   }
   <span class="hljs-keyword">return</span> fibonacciRecursive(n - <span class="hljs-number">1</span>) + fibonacciRecursive(n - <span class="hljs-number">2</span>);
 }
 fibonacciRecursive(<span class="hljs-number">45</span>);
 cb();
}
</code></pre>
<p><em>Solution with low CPU consumption.</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartCpuIntensiveTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">fibonacciIterative</span>(<span class="hljs-params">n</span>) </span>{
   <span class="hljs-keyword">if</span> (n &lt;= <span class="hljs-number">1</span>) {
     <span class="hljs-keyword">return</span> n;
   }
   <span class="hljs-keyword">let</span> prev = <span class="hljs-number">0</span>, curr = <span class="hljs-number">1</span>;
   <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">2</span>; i &lt;= n; i++) {
     <span class="hljs-keyword">const</span> next = prev + curr;
     prev = curr;
     curr = next;
   }
   <span class="hljs-keyword">return</span> curr;
 }
 fibonacciIterative(<span class="hljs-number">45</span>);
 cb();
}
</code></pre>
<p>Both versions calculate the 45th Fibonacci number. The first implementation uses recursion, and the second one employs the iterative approach.</p>
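<p>The gap is visible even without the profiler. A quick sketch (both functions redefined here so the snippet is self-contained):</p>

```javascript
function fibonacciRecursive(n) {
  return n <= 1 ? n : fibonacciRecursive(n - 1) + fibonacciRecursive(n - 2);
}

function fibonacciIterative(n) {
  if (n <= 1) return n;
  let prev = 0, curr = 1;
  for (let i = 2; i <= n; i++) {
    const next = prev + curr;
    prev = curr;
    curr = next;
  }
  return curr;
}

// The iterative version performs O(n) additions, while the recursive one
// makes O(2^n) calls: that exponential call tree is exactly what shows up
// as a long solid block in the profiler.
console.log(fibonacciIterative(45)); // 1134903170
```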
<p>After running the CPU-intensive endpoint with the implementation that consumes a lot of CPU, you see the following picture.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdtWrRlmdYlmviiBXA5G2HQjUAqbhNhy21ZOdK_EFRfryZa8X5GJGQDOplChED8mJDgHBvTrxHyf43sEUmBwdUNuM5qcIDReu_e08QXuQVz0_NTDCA-Qd2uGTZLcEDh6KByDasZczCpZmzBwmePRb6xmA?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>The breakdown of activities.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXec5aSfaJZc75wKlXXTh5QLgNc0-PSLRFtW9n9t4vOHEJ2MI8Ydo9_rZ-PHKnuhhVniznDtffizDc5JoKTuru6Vox2hkBJNCXSszP8Y3ziXqIFBb5aIOx8V-4nV3ViDPFAyNsfuzGj6_w07MtRm626RIas?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>It is clear that the <code>fibonacciRecursive</code> function takes more than half of the execution time. In other words, for more than half of the time, the application is busy running a single function for a single request, blocking everything else. This should be optimized.</p>
<p>The picture radically differs when we use the improved version with the same functionality.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdqAeMldnxenMbsjYdmyyUPz2fMbPb0nCuDYVZ3aL6tkQtOTH5Z2YRnheVcVCx1MElpS-Qgp-RaFKQtuNTGfdalvOTq_1lZACfqdkINYDEWW7s7YqLxgXSn_AcCDt84Q9Uc1DuTP0NtG0LRKRLngrNNu9c?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>It is barely noticeable. In the activity tab, we have the following picture.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdLLZnN6ONK7dQNtBSCc4hX8_3EvnbaPEC6CMh2VRBUKGfrUweMNGVGKEM3piAwmWZ4auOgWbxLNmX1nFexAQeKms_7xRoqnO2GtQxuCPC_GUxFggedSs_k-GK-sgnfLiFzJ6lLihOwpWgKa2gopLMGXi8?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>The difference is enormous. We no longer block the main thread and consume way less CPU resources than before. It's definitely a win.</p>
<h3 id="heading-async-endpoint">Async endpoint</h3>
<p>Next on the list is the async endpoint. Here is what both implementations of the endpoint look like.</p>
<p><em>Solution with the waterfall effect.</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateAsyncOperation</span>(<span class="hljs-params"></span>) </span>{
 <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> {
   <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {
     <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">50000000</span>; i++) { }
     resolve();
   }, <span class="hljs-number">1000</span>);
 });
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 <span class="hljs-keyword">await</span> generateAsyncOperation();
 cb();
}
</code></pre>
<p><em>Solution with parallel execution.</em></p>
<pre><code class="lang-javascript"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">generateAsyncOperation</span>(<span class="hljs-params"></span>) </span>{
 <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Promise</span>(<span class="hljs-function"><span class="hljs-params">resolve</span> =&gt;</span> {
   <span class="hljs-built_in">setTimeout</span>(<span class="hljs-function">() =&gt;</span> {
     <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">50000000</span>; i++) { }
     resolve();
   }, <span class="hljs-number">1000</span>);
 });
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartAsyncTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">await</span> <span class="hljs-built_in">Promise</span>.all(
   <span class="hljs-keyword">new</span> <span class="hljs-built_in">Array</span>(<span class="hljs-number">3</span>).fill().map(<span class="hljs-function">() =&gt;</span> generateAsyncOperation())
 );
 cb();
}
</code></pre>
<p>With the waterfall implementation, you see three spikes at intervals of around 1000ms (one second). This means each function runs only after the previous one has finished.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXeHDe4X7Ee2wMQ7roRhN-5u8Np9osT0qJNDUwJVj_eapoGvnEbulKHtxTPsOX-VOQ4_oXZwhSy8keX-aAjZNJyjzucxetLg7ArJ93GAwvA_9V7ORA0P-1hzNCiyoaPc8CP6gvpMFhdu8hohowcyb9Z-vXc?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>Since those are all async operations, the main thread isn't blocked; what we care about here is the total execution time. Each async request took around 1 second, and because of the waterfall effect, the function took 3 seconds to finish.</p>
<p>With the improved version, we have the following picture.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXc64zhqvA9-yttzH1PzeaGrPUfnT7L88r7E5iBBbbC-XjsMAE0TbATGTzgdjEVK1qxIXAVFCtKWH7i4yA5TrTQg2G8LjxEHdTQ1NSTU55l_xwNzyV9pC1j9Ps3uh3FkYyWna47lM6wF9-N7_QRy3Z0g3XY?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>There was only one spike. Processing those three requests took the same time, but they are now running in parallel rather than one by one.</p>
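<p>The same effect can be reproduced with a tiny self-contained sketch, using 100ms delays instead of the article's 1-second ones so it finishes quickly:</p>

```javascript
// A timer-based stand-in for the async operations above.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function main() {
  // Waterfall: each await starts only after the previous one resolves.
  const t0 = performance.now();
  await delay(100);
  await delay(100);
  await delay(100);
  const sequential = performance.now() - t0; // ~300 ms

  // Parallel: all three timers start at the same time.
  const t1 = performance.now();
  await Promise.all([delay(100), delay(100), delay(100)]);
  const parallel = performance.now() - t1; // ~100 ms

  console.log({ sequential, parallel });
  return { sequential, parallel };
}

main();
```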
<h3 id="heading-memory-leak-endpoint">Memory leak endpoint</h3>
<p>Here is what the code for both cases looks like.</p>
<p><em>Solution with a memory leak.</em></p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> memoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">Map</span>();

<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };
   memoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }
 cb();
}
</code></pre>
<p><em>Solution without a memory leak.</em></p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> smartMemoryLeak = <span class="hljs-keyword">new</span> <span class="hljs-built_in">WeakMap</span>();

<span class="hljs-keyword">export</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runSmartMemoryLeakTask</span>(<span class="hljs-params">cb</span>) </span>{
 <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">10000</span>; i++) {
   <span class="hljs-keyword">const</span> person = {
     <span class="hljs-attr">name</span>: <span class="hljs-string">`Person number <span class="hljs-subst">${i}</span>`</span>,
     <span class="hljs-attr">age</span>: i,
   };
   smartMemoryLeak.set(person, <span class="hljs-string">`I am a person number <span class="hljs-subst">${i}</span>`</span>);
 }
 cb();
}
</code></pre>
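<p>Why does switching to <code>WeakMap</code> fix the leak? A <code>Map</code> holds strong references to its keys, so every <code>person</code> object stays reachable for the lifetime of the map. A <code>WeakMap</code> holds its keys weakly: once nothing else references <code>person</code>, the entry becomes eligible for garbage collection. A minimal illustration:</p>

```javascript
const strongRefs = new Map();
const weakRefs = new WeakMap();

{
  const person = { name: 'temporary' };
  strongRefs.set(person, 'kept alive: the Map strongly references the key');
  weakRefs.set(person, 'collectible: the WeakMap does not keep the key alive');
}

// After the block, `person` is unreachable from user code, but the Map
// entry still pins it in memory. The WeakMap entry can be reclaimed by GC.
console.log(strongRefs.size); // 1
// Note: WeakMap deliberately has no .size; weak entries can't be enumerated.
```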
<p>All previous measurements were made using the performance tab inside DevTools. The memory tab also allows us to measure memory consumption.</p>
<p>You can find it right next to the performance tab.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXdeXk8mjZ0GgJgxum2KYAezfM8YbdFgCDyVSPxUGpyvNpE5ci1O_EUs9lAo-TaZQ1LK72d1ElQF9XCIjWy1QvhwiL00-myAYK3YaOOmhAGPGR2XSfi2G_55PKh_vTR4c5JQCUltiU2nGYgVsW2cg6xjBQ?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>There are three profiling options to choose from.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfkt3d_djrq3piXpO1hozLTVbd0vEY0Gov-ae8R5XFjZjl5Z0WeQXSIYixXDOAIPmA2vvDF0WFTtEx4FgHawxo0VSYvl2cnHoCDPm5Rr8SWrOTTNFbZjUhZ1Z6ihmg1NYNKD_nPgzsCioODYJyOlKs-vw?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>We’ll choose the second option, “Allocation instrumentation on timeline.” That way, you’ll be able to see the memory consumption timeline, which is similar to the performance tab.</p>
<p>To start the profiling session, click the profile button at the top left corner.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXe3MJR9sS8sUSai0_82lQDzUs8PQsStApBY6TnEpqXU7nPLb7HiRe1DvEaYNqweCiWozARtbD3DnGiywMMkdtENPRTET3sMYMUIGnGWEOIPtVtE4XtftBmrk30vzkVrigcEIxGMEG7VuLNTunchn7VanKo?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>First, we’ll send four requests to the memory leak endpoint with unoptimized implementation and record them using the profiler. You can clearly see four memory spikes on the timeline afterward.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXcNuZ2sK32zJUgCGDa-Y6WFbHlZCcO8eg_wspmLJLWlhGi7i4Yfz7Z4nkfNiNq1tVd9gnudqXQkSrGv6fObFtUzvdQlSaBWrWueqJ-KZW1g45-NKLk-4R_dToOpJhK9aOfiTMleXmnWyHdsnaLW9PCQpyY?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>The total size of the program is 11.5MB.</p>
<p>Below the timeline, you’ll see a table representing what is taking up the space.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXeZbs6y8-m9OQRCW_BVKW588w3sBW3b5LuGlMU5v9zEc-Fcu2Ga12NxYMdYEuN8fyD1cpbvbCvWYsJH4yHiUv0TYrbuqwvHxxg7BHNothhSDRfK3DMBqEYbqiFwepe2rCtD6yAimt2c2xwJe7cYcxDHRAU?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>While it is clear we have a lot of things going on here, like ~40,000 objects, it isn’t clear where they originate from. To find this connection, we’ll go back to the timeline and click the following selection.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXfljB9WrcFqzABwutv-dLWn2eAAmGg6peP-EMkyUUuAPiZtqMORIO1etPynog7cmE60YwAfB85FVtWC9Cj0p_IGM94FDXAs88xkXvxSfIJxX9ZoQQhV50xgmC62AzvKHAfrVwKVq1MRCc2ku1DrD9ki7Aw?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>You’ll see the “Allocation” option in the list. Choose it. Now you see the root of the problem.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXcteZJgtSZhwjDHswr98JF5Z252Z_a3e4vkjvWo5Isznbe_PkyQvqN7ZpM2O4YK0XcIV_L624HFv3a5EriR3pQ23j4o_aAAADtgZ9jCItL7sH4Qx-yvfl3Gu-uxhm0_zQnFFEU92bQNbtSTW9-lyiy6bD4?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>It is the <code>runMemoryLeakTask</code> function that occupies most of the space. Let’s change it and run the optimized version instead with the same four requests. The result is impressive.</p>
<p><img src="https://lh7-us.googleusercontent.com/docsz/AD_4nXd_0ShiLhqhS0VodmRUDxNkiyDMLC8In2aknX1cC8Et5YBBELOhn8wGqvZIJPAZXT09m4n8QZ1gbVBIfICJTiCOR3ypaBGF5TmmVNiJsl-SiH6DFe8spsDNNd_3uHs4EZZSpOOLCZyR2u_VDt5CopZX86g?key=BzeRpvGpdlcwZjED5rPT0A" alt class="image--center mx-auto" /></p>
<p>You can barely see the requests except for the first one. The overall program size dropped from 11.5 MB to 4.8 MB, roughly 2.4x less memory consumption.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>The DevTools are amazing. They give you a wealth of information about CPU usage, detailed memory heap allocation, network communication, and much more.</p>
<p>DevTools gives you all the raw information about what is going on inside the application. But wouldn’t it be even better if a tool did the initial interpretation for you and produced human-readable recommendations on what to do next?</p>
<p>That is what <a target="_blank" href="http://clinic.js">Clinic.js</a> is all about. We’ll review it in the <a target="_blank" href="https://pavel-romanov.com/optimizing-nodejs-identifying-and-fixing-performance-problems-with-clinic?showSharer=true">upcoming article</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Node.js Performance Hooks: Mastering the Mental Model]]></title><description><![CDATA[Node.js includes a built-in module called performance hooks for precise performance measurement. But why use it when you can simply log timestamps and calculate the difference between two dates?
At least because it is precise. The module uses a monot...]]></description><link>https://pavel-romanov.com/nodejs-performance-hooks-mastering-the-mental-model</link><guid isPermaLink="true">https://pavel-romanov.com/nodejs-performance-hooks-mastering-the-mental-model</guid><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 26 May 2024 08:46:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1716712816941/dac110f5-d996-42b5-8d7d-f2f1deac1136.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Node.js includes a built-in module called performance hooks for precise performance measurement. But why use it when you can simply log timestamps and calculate the difference between two dates?</p>
<p>At least because it is <strong>precise</strong>. The module uses a monotonic clock that allows you, as a user, to make performance measurements and be sure that they are not corrupted.</p>
<p>At first, I struggled to understand how it works. It is an abstraction, and as with any abstraction, you need to put in extra effort to understand it. Existing materials didn’t help much with it.</p>
<p>The key to understanding performance hooks lies in understanding its underlying mental model. This article will provide an overview of the module's core concepts and a detailed explanation of how they relate to each other. By the end, you'll know how the performance hooks work.</p>
<h2 id="heading-mental-model">Mental model</h2>
<p>It’s important to understand the underlying mental model to use performance hooks well. This section will explain the basic concepts behind performance hooks. With this knowledge, you can use them more effectively.</p>
<h3 id="heading-clocks">Clocks</h3>
<p>Let's start by exploring the different types of clocks used in performance measurement. In this context, a "clock" is an abstract representation of how we perceive time.</p>
<p>The first type is the <strong>wall clock</strong>. The W3C <a target="_blank" href="https://w3c.github.io/hr-time/#wall-clock-unsafe-current-time">specification defines</a> it as follows:</p>
<blockquote>
<p><em>The wall clock's unsafe current time is always as close as possible to a user's notion of time. Since a computer sometimes runs slow or fast or loses track of time, its</em> <a target="_blank" href="https://w3c.github.io/hr-time/#dfn-wall-clock"><em>wall clock</em></a> <em>sometimes needs to be adjusted, which means the</em> <a target="_blank" href="https://w3c.github.io/hr-time/#wall-clock-unsafe-current-time"><em>unsafe current time</em></a> <em>can decrease, making it unreliable for performance measurement or recording the orders of events. The web platform shares a</em> <a target="_blank" href="https://w3c.github.io/hr-time/#dfn-wall-clock"><em>wall clock</em></a> <em>with [</em><a target="_blank" href="https://w3c.github.io/hr-time/#bib-ecma-262"><em>ECMA-262</em></a><em>]</em> <a target="_blank" href="https://tc39.es/ecma262/multipage/#sec-time-values-and-time-range"><em>time</em></a><em>.</em></p>
</blockquote>
<p>In essence, the wall clock aligns with a user's perception of time, including system time adjustments and time zones. However, it operates independently of any specific process or user. Even if a JavaScript program using the wall clock stops, it continues to run.</p>
<p>The second type is the <strong>monotonic clock</strong>. The <a target="_blank" href="https://w3c.github.io/hr-time/#dfn-monotonic-clock">documentation describes</a> it as:</p>
<blockquote>
<p><em>The monotonic clock's unsafe current time never decreases, so it can't be changed by system clock adjustments. The</em> <a target="_blank" href="https://w3c.github.io/hr-time/#dfn-monotonic-clock"><em>monotonic clock</em></a> <em>only exists within a single execution of the</em> <a target="_blank" href="https://infra.spec.whatwg.org/#user-agent"><em>user agent</em></a><em>, so it can't be used to compare events that might happen in different executions.</em></p>
</blockquote>
<p>Unlike the wall clock, the monotonic clock doesn't adjust to a current user. It exists only within a specific context, such as a single execution of a Node.js process.</p>
<p>The performance hooks module uses the monotonic clock. Why?</p>
<p>Because measurement accuracy is important, and the wall clock is prone to user-specific time changes, like system time adjustments, which can happen in the middle of a performance measurement.</p>
<p>Additionally, the monotonic clock offers higher precision than the wall clock, making it ideal for performance measurement.</p>
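<p>The difference between the two clocks is easy to observe in Node.js itself. A small sketch (assumes Node.js 16+, where <code>performance</code> is available as a global):</p>

```javascript
// Wall clock: milliseconds since the Unix epoch; can jump if system time changes.
const wall = Date.now();

// Monotonic clock: milliseconds since the current process started; never decreases.
const mono = performance.now();

console.log(wall > mono); // true: an epoch timestamp vs. process uptime

// Monotonicity in action: consecutive reads never go backwards.
const first = performance.now();
const second = performance.now();
console.log(second >= first); // true
```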
<h3 id="heading-performance-timeline">Performance Timeline</h3>
<p>The concept of a Performance Timeline often confuses people due to a lack of clear explanations. Let’s end it right here and clarify some things about the topic.</p>
<p>In this context, a timeline is simply a sequence of events occurring over a specific period of time.</p>
<p><img src="https://lh7-us.googleusercontent.com/jokaYjr2b8Q4Yt2LZMhaJPuNNtTX-do0TIv1aOFYvdl9cVd3CrVWOQoa3tjwgJgvCGNFGYyJK-zotbBlu6zVbIjpIhIdjEAQv1T_ik6Y3U6WPRXRGbVD7BplAmUvT8TQG6E8NExxNEzL2e5X3swZ_Q" alt /></p>
<p>It's called a "performance timeline" because these events are specifically related to performance. These performance-related events are known as performance entries, which we'll discuss later.</p>
<p>The performance timeline concept is a mental abstraction. No code backs it up. Some events within the timeline <em>can</em> be buffered (stored temporarily) for later analysis. However, not all of them can be buffered. Therefore, the buffered events are not a complete representation of the timeline, only a part of it.</p>
<h3 id="heading-performance-entries">Performance entries</h3>
<p>Performance entries represent the events that happen during program execution. Node.js provides the following types:</p>
<ul>
<li><p>Mark</p>
</li>
<li><p>Measure</p>
</li>
<li><p>Resource</p>
</li>
<li><p>Node</p>
</li>
</ul>
<p>The first three types (Mark, Measure, and Resource) are defined by the W3C specification, which Node.js aims to adhere to closely.</p>
<p>The fourth type, Node, is specific to Node.js. It's an abstract type that combines <code>net</code>, <code>dns</code>, <code>gc</code>, <code>http</code>, <code>http2</code>, and <code>function</code>.</p>
<p>These performance entry types are grouped under the abstract Node type because they share two characteristics:</p>
<ul>
<li><p>They are all created only after some action has finished. For example, when we do a DNS lookup, the performance entry is created only after the lookup completes.</p>
</li>
<li><p>They are only available inside Node.js, not in the browser.</p>
</li>
</ul>
<p>The Mark and Measure types are also called user timings because the user decides when to create an entry. For example, you can create a Mark performance entry right in the function execution process, not strictly before or after.</p>
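<p>A quick sketch of both user timing types together (assumes Node.js 16+, where <code>performance</code> is available as a global):</p>

```javascript
// A mark is a named point in time; a measure is the duration between two marks.
performance.mark('work-start');
for (let i = 0; i < 1e6; i++) {} // simulate some work
performance.mark('work-end');

performance.measure('work', 'work-start', 'work-end');

const [measure] = performance.getEntriesByName('work');
console.log(measure.entryType); // 'measure'
console.log(measure.duration >= 0); // true
```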
<p>The fetch function is the <strong>only</strong> one responsible for creating Resource entry types. This type is special because it is compatible with W3C specifications and can be used in web browsers, but it is not as flexible as user timings.</p>
<h3 id="heading-performance-observer">Performance observer</h3>
<p>We’ve discussed the performance timeline and performance entries. The next logical step is to see the performance data. The measurements. That is where performance observer comes into play.</p>
<p>It enables you to collect and work with the performance entries that the program generates without much sweat.</p>
<p>To start using performance observer, you need first to configure it:</p>
<ul>
<li><p>Create a performance observer and provide a callback function. The function is called whenever an entry you want to observe is created.</p>
</li>
<li><p>Call the <code>observe</code> method and provide configuration options as the function arguments.</p>
</li>
</ul>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { PerformanceObserver, performance } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:perf_hooks'</span>;

<span class="hljs-keyword">const</span> obs = <span class="hljs-keyword">new</span> PerformanceObserver(<span class="hljs-function"><span class="hljs-params">list</span> =&gt;</span> {
 <span class="hljs-comment">// Process the list of performance entries.</span>
 <span class="hljs-comment">// The list contains the test performance mark entry.</span>
});

<span class="hljs-comment">// Configuration of the observe method</span>
<span class="hljs-comment">// where we want to monitor the mark entries</span>
obs.observe({ <span class="hljs-attr">entryTypes</span>: [<span class="hljs-string">'mark'</span>] });

<span class="hljs-comment">// Callback of the performance observable is triggered</span>
<span class="hljs-comment">// because of the observe function configuration.</span>
performance.mark(<span class="hljs-string">'test'</span>);
</code></pre>
<p>Performance entries that you’re not interested in don’t trigger the performance observer callback.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { PerformanceObserver, performance } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:perf_hooks'</span>;

<span class="hljs-keyword">const</span> obs = <span class="hljs-keyword">new</span> PerformanceObserver(<span class="hljs-function"><span class="hljs-params">list</span> =&gt;</span> {
  <span class="hljs-comment">// Process the list of performance entries</span>
});

<span class="hljs-comment">// Observe function configuration</span>
obs.observe({ <span class="hljs-attr">entryTypes</span>: [<span class="hljs-string">'function'</span>] });

<span class="hljs-comment">// The callback is not triggered because the entry type</span>
<span class="hljs-comment">// doesn’t match the observe function configuration</span>
performance.mark(<span class="hljs-string">'test'</span>);
</code></pre>
<p>Overall, this approach is a flexible way of observing different performance entry types.</p>
<h3 id="heading-when-to-start-observing">When to start observing?</h3>
<p>The next important topic is when to start observing entries with the performance observer. From this point on, it gets deep, so be ready.</p>
<p>There are only two options: after and before creating a performance entry. I strongly recommend creating a performance entry <em>after</em> the call of the <code>observe</code> method.</p>
<p>The reason is simple: it makes code predictable.</p>
<p>Consider the following example where we create a performance entry before calling the <code>observe</code> method:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { PerformanceObserver, performance } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:perf_hooks'</span>;

<span class="hljs-keyword">const</span> obs = <span class="hljs-keyword">new</span> PerformanceObserver(<span class="hljs-function"><span class="hljs-params">list</span> =&gt;</span> {
 <span class="hljs-built_in">console</span>.log(list);
});

performance.mark(<span class="hljs-string">'performance-mark'</span>);

obs.observe({ <span class="hljs-attr">entryTypes</span>: [<span class="hljs-string">'mark'</span>] });
</code></pre>
<p>The expected behavior is to see all related performance entries in the console. However, you won't see anything there, despite creating a matching performance entry type (the performance observer is configured for Mark entries, and we’ve created exactly one).</p>
<p>Why? Because of the specific way the performance hooks work. Let me explain.</p>
<p>The performance hooks module manages several “global” (file-scoped) variables, including <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L118">the set of observers</a>. An observer is added to this set not when we call the PerformanceObserver constructor but <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L318">when we call the</a> <code>observe</code> method.</p>
<p>The observer’s callback is triggered whenever a new performance entry is created. In our case, <code>performance.mark('performance-mark')</code> <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/usertiming.js#L159">creates</a> the Mark performance entry and <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L396">calls</a> all <strong>existing</strong> observers interested in this type of performance entry.</p>
<p>In summary, the performance observer's callback wasn’t invoked simply because the observer didn’t exist when the performance entry was created.</p>
<p>Here is a picture to better illustrate the process:</p>
<p><img src="https://lh7-us.googleusercontent.com/W6Kkz5xpNzbsyqdnpjuwNqConRNv-J0pnKVKTBx8Hf94XHOqm3biYwKzyqdHYosdKPxFmzC_G9ad9-43zeJgE2aB-h8N1L1yu-Pe5Q_oGlNA3AhukExLhh3bvr_umiE82wqpYMIFmnwiLhVbKn7NhA" alt /></p>
<h3 id="heading-what-are-buffers">What are buffers?</h3>
<p>Another important concept is buffers. Buffers enable you to get the historical sequence of performance entries that the program creates.</p>
<p>Don’t be afraid of the fancy word “buffer.” In reality, those are just arrays.</p>
<p>You should be aware of two types of buffers: local and global.</p>
<p>The <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L244">local buffer</a> is only related to the performance observer. This local buffer stores a sequence of events related to this particular observer. If the observer isn’t interested in some event types, <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L343">they aren’t buffered</a>.</p>
<p>Important note: those performance entries are buffered only for a short period, between the creation of a performance entry and the invocation of the observer callback. After that, the buffer is cleared. This lets us receive two events at the same time in one callback call instead of two separate ones:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> obs = <span class="hljs-keyword">new</span> PerformanceObserver(<span class="hljs-function">(<span class="hljs-params">list</span>) =&gt;</span> {
 <span class="hljs-comment">// The list contains two performance mark entries.</span>
 <span class="hljs-built_in">console</span>.log(list);
});

obs.observe({ <span class="hljs-attr">entryTypes</span>: [<span class="hljs-string">'mark'</span>] });

performance.mark(<span class="hljs-string">'performance-mark-1'</span>);
performance.mark(<span class="hljs-string">'performance-mark-2'</span>);
</code></pre>
<p>When it comes to global buffers, there are <a target="_blank" href="https://github.com/nodejs/node/blob/main/lib/internal/perf/observe.js#L103">three of them</a>:</p>
<ul>
<li><p>Performance mark entries buffer.</p>
</li>
<li><p>Performance measure entries buffer.</p>
</li>
<li><p>Performance resource entries buffer.</p>
</li>
</ul>
<p>Let’s look at the same example as with the local buffer but slightly modify it.</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> { PerformanceObserver, performance } <span class="hljs-keyword">from</span> <span class="hljs-string">'node:perf_hooks'</span>;

performance.mark(<span class="hljs-string">'performance-mark-1'</span>);

<span class="hljs-keyword">const</span> obs = <span class="hljs-keyword">new</span> PerformanceObserver(<span class="hljs-function">(<span class="hljs-params">list</span>) =&gt;</span> {
 <span class="hljs-comment">// prints only performance mark #2</span>
 <span class="hljs-built_in">console</span>.log(list);

 <span class="hljs-comment">// prints both performance marks because it works with global buffers</span>
 <span class="hljs-built_in">console</span>.log(performance.getEntries());
});

obs.observe({ <span class="hljs-attr">entryTypes</span>: [<span class="hljs-string">'mark'</span>] });

performance.mark(<span class="hljs-string">'performance-mark-2'</span>);
</code></pre>
<p>You’ll see only one performance mark entry in the first console log because only one mark was created after the observer started observing.</p>
<p>The <code>performance.getEntries</code> function reads data directly from the global buffers. This means we’ll see two performance entries in the second console log, even though one of them was created before the observer.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>After reading this article, you should have a solid understanding of the core concepts related to the performance hooks module such as:</p>
<ul>
<li><p>Monotonic and wall clocks</p>
</li>
<li><p>Performance timeline</p>
</li>
<li><p>Performance entries</p>
</li>
<li><p>Performance observer</p>
</li>
<li><p>When to start observing entries</p>
</li>
<li><p>Performance hooks buffers</p>
</li>
</ul>
<p>These concepts give you a solid mental model and cover some of the API specifics.</p>
<p>Now, you shouldn’t have any problems using performance hooks and building your own abstractions upon them.</p>
]]></content:encoded></item><item><title><![CDATA[Think twice before using PM2 — a critical look at the popular tool]]></title><description><![CDATA[If you work with Node.js, you've probably heard of PM2. It's a popular process manager for Node.js applications and comes with several extra features, like load balancing and monitoring.
But just because it's popular doesn't mean it's the best choice...]]></description><link>https://pavel-romanov.com/think-twice-before-using-pm2-a-critical-look-at-the-popular-tool</link><guid isPermaLink="true">https://pavel-romanov.com/think-twice-before-using-pm2-a-critical-look-at-the-popular-tool</guid><category><![CDATA[Node.js]]></category><category><![CDATA[pm2]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sat, 11 May 2024 09:51:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1715420738880/76ab7354-4b86-4705-8bd1-775ec23362e3.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you work with Node.js, you've probably heard of PM2. It's a popular process manager for Node.js applications and comes with several extra features, like load balancing and monitoring.</p>
<p>But just because it's popular doesn't mean it's the best choice for every project. In fact, there are some good reasons why you might want to think twice before using it.</p>
<p>This article takes a closer look at PM2 and the problems it can cause. We'll explore situations where other tools, like Docker, might be a better fit.</p>
<p>We'll discuss PM2's downsides, including its proprietary features and licensing issues. By the end, you'll have a clear picture of whether PM2 is right for your Node.js projects.</p>
<h2 id="heading-problems">Problems</h2>
<p>Before you jump on the PM2 bandwagon, it's crucial to understand the potential trade-offs and challenges it presents.</p>
<h3 id="heading-overselling-the-cluster-feature">Overselling the cluster feature</h3>
<p>Most of the online material mentions things like “PM2 has a built-in load balancer” or “You can run the application in cluster mode and handle higher traffic volumes.”</p>
<p>The problem is that people talk about these features without context. The context is that PM2 relies on the Node.js cluster module to make those things possible, and I have already covered <a target="_blank" href="https://pavel-romanov.com/nodejs-cluster-module-you-probably-dont-need-it">why you probably don’t need the cluster module</a>.</p>
<p>In short, it doesn’t make much sense to have all of those features inside a single Node server, especially given modern CI/CD practices and deployment strategies.</p>
<p>While this is not a problem with PM2 itself so much as with the community around it, the fact remains that you’re more likely to encounter clusters in applications that use PM2 than in those that don’t.</p>
<h3 id="heading-proprietary-monitoring">Proprietary monitoring</h3>
<p>One of the features that PM2 itself sells hard is monitoring. Monitoring is a good thing to have, and it shows you how your application and server are doing. The more data you have, the better decisions you can make. People build multi-million dollar businesses on it; take <a target="_blank" href="https://www.datadoghq.com/">Datadog</a> or <a target="_blank" href="https://sentry.io/welcome/">Sentry</a> as an example.</p>
<p>The problem with PM2 monitoring is vendor lock-in. You cannot take the data from what PM2 has collected and move it to a third-party monitoring service — at least not that easily. It all comes down to money. PM2 itself provides an advanced monitoring service, but to have access to it, <a target="_blank" href="https://pm2.io/pricing">you have to pay</a>.</p>
<p>This doesn’t mean you cannot use other monitoring services. However, if you count monitoring as one of PM2’s cool features, you should think again.</p>
<h3 id="heading-attempts-to-solve-too-many-problems">Attempts to solve too many problems</h3>
<p>It feels like PM2 is trying to solve the problems it shouldn’t be solving but is doing so because “why not?”</p>
<p>It leads to poor results. Those problems are not solved well enough, and you suffer from the consequences.</p>
<p>To name a few:</p>
<ul>
<li><p><strong>Memory management:</strong> The process manager tries to manage the memory of the running processes by restarting a process based on the consumed memory. However, it has a 30-second window in which it checks memory consumption. Even if the process goes beyond the limit, it might take up to 30 seconds to shut it down, and you cannot do much about it.</p>
</li>
<li><p><strong>Persistent application:</strong> PM2 can restart the process automatically whenever the machine restarts. It relies on the <code>systemd</code> daemon, which is specific to Linux, so you can run into problems with it on Windows. You also have to hassle around and update the <code>systemd</code> service whenever you want to change the Node.js version.</p>
</li>
<li><p><strong>Deployment:</strong> Doesn’t it sound odd that the process manager is somehow involved in the deployment process? This limits you to a bare metal deployment, which means no containers.</p>
</li>
</ul>
<p>There is more to the list, but you get the idea.</p>
<h3 id="heading-hard-to-justify-usage-with-containers">Hard to justify usage with containers</h3>
<p>If you use containers in your Node.js application, PM2 becomes redundant.</p>
<p>One of the most popular solutions for working with containers, Docker, provides many features similar to PM2’s. The difference? They are <strong>a lot</strong> more reliable.</p>
<p>In contrast to the PM2 problems:</p>
<ul>
<li><p><strong>Superior resource management:</strong> You can limit memory usage, CPU, and GPU. Those limits are more flexible and reliable. For example, if a container reaches the limit of consumed memory, you’ll know it almost immediately. There is no fixed 30-second window.</p>
</li>
<li><p><strong>Reliable application persistence:</strong> It is possible to configure restart policies for Docker containers. Whenever the host machine is restarted, the Docker daemon spins up and restarts all containers configured for it. The best part is that it works seamlessly on Unix-based OSes as well as Windows. Also, you don’t need to do manual work whenever you change the Node.js version. The container will do it for you.</p>
</li>
<li><p><strong>Deployment:</strong> Containers make it extremely easy to deploy new applications because of their isolated environment. However, they do not attempt to deploy themselves as PM2 does.</p>
</li>
</ul>
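<p>As a rough illustration, the Docker equivalents of the features above boil down to a few flags (the container name, image, and limits are made up for illustration):</p>

```bash
# Limit memory and CPU, and restart the container automatically
# (including after host reboots), unless it was stopped manually.
docker run -d \
  --name my-app \
  --memory=300m \
  --cpus=1 \
  --restart=unless-stopped \
  my-app-image
```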
<h3 id="heading-licensing">Licensing</h3>
<p>Another major issue is the license under which PM2 is distributed: the GNU AGPL-3.0. While the license might not say much to you, here are some of its implications:</p>
<ul>
<li><p>If your project uses a library licensed under the GNU AGPL-3.0 and you distribute it over the internet, you are forced to distribute your project under the same license.</p>
</li>
<li><p>Some licenses are not compatible with the GNU AGPL-3.0. If you’re using more than one library (and I assume you do), you must ensure there are no conflicting licenses.</p>
</li>
<li><p>Projects distributed over a network, like web applications, must allow users to see and download their source code.</p>
</li>
</ul>
<p>As you can see, those are not the best conditions, especially if we compare them with MIT.</p>
<p>That’s the exact reason why <a target="_blank" href="https://opensource.google/documentation/reference/using/agpl-policy/">Google restricts</a> any usage of code under this license.</p>
<h2 id="heading-when-can-it-be-actually-useful">When can it be actually useful?</h2>
<p>For PM2 to be actually useful in your case, a couple of conditions that we’ve discussed previously have to hold:</p>
<ul>
<li><p>You comply with the license and do everything it asks for.</p>
</li>
<li><p>You’re not using Docker or any other containerization software.</p>
</li>
</ul>
<p>The main benefit of PM2 is simplicity. That is exactly why it tries to solve so many problems while remaining a simple process manager.</p>
<p>You don't need to learn how to deploy your application, configure a load balancer, configure daemons for automatic restarts, or set up a monitoring system manually.</p>
<p>Just use the tool.</p>
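<p>In that scenario, the typical workflow is only a few commands (the app name is a made-up example; the commands themselves are from PM2’s documented CLI, to the best of my knowledge):</p>

```bash
pm2 start index.js --name my-app   # run the app under PM2
pm2 save                           # persist the current process list
pm2 startup                        # generate a script to resurrect it on boot
pm2 logs my-app                    # tail the logs
```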
<h2 id="heading-conclusion">Conclusion</h2>
<p>While there are certainly cases where you might see yourself using PM2, especially because of its simplicity, it comes with too many implications.</p>
<p>It tries to go beyond being a simple process manager while keeping the same user-friendly approach. The result is mediocre implementations of those extra features, especially compared with solutions purpose-built to solve those problems, like containers for resource management.</p>
<p>Another huge pain point is licensing. Honestly, it is hard to imagine people willingly accepting the license conditions. I assume that most users are simply not aware of the implications it brings.</p>
]]></content:encoded></item><item><title><![CDATA[5 Node Version Managers Compared – Which is Right for You?]]></title><description><![CDATA[Imagine you joined a Node.js project and want to bootstrap it to see how it goes, but you see an error. What is the problem? After spending some time, you find out that Node.js version you’re using on your machine is not the one that the project requ...]]></description><link>https://pavel-romanov.com/5-node-version-managers-compared-which-is-right-for-you</link><guid isPermaLink="true">https://pavel-romanov.com/5-node-version-managers-compared-which-is-right-for-you</guid><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 05 May 2024 14:22:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714918854470/b6f897a2-ef30-486f-9981-eb4f6c83abe8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine you joined a Node.js project and want to bootstrap it to see how it goes, but you see an error. What is the problem? After spending some time, you find out that Node.js version you’re using on your machine is not the one that the project requires.</p>
<p>This is quite a common and annoying situation. I have been there myself. To avoid such troubles, smart people developed tooling called a “node version manager”: a shell utility that lets you switch between Node.js versions easily.</p>
<p>This article will focus on the Node.js version managers market. You’ll see how they differ and which one you should consider using.</p>
<h2 id="heading-criteria">Criteria</h2>
<p>To make comparison easier, we’ll introduce the following criteria:</p>
<ul>
<li><p><strong>Cross-platform:</strong> Is the manager cross-platform?</p>
</li>
<li><p><strong>Upfront setup:</strong> How much work must you do for the initial installation?</p>
</li>
<li><p><strong>Node version sources:</strong> From what sources can the Node.js version be parsed?</p>
</li>
<li><p><strong>Daily usage:</strong> How easy and seamless is using the version manager daily?</p>
</li>
</ul>
<h2 id="heading-contestants">Contestants</h2>
<ul>
<li><p><a target="_blank" href="https://github.com/nvm-sh/nvm">nvm</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/tj/n">n</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/Schniz/fnm">fnm</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/volta-cli/volta">volta</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/pnpm/pnpm">pnpm</a></p>
</li>
</ul>
<h2 id="heading-comparison">Comparison</h2>
<p>With defined criteria, we can now look at each contestant in more detail.</p>
<h3 id="heading-node-version-manager-nvm">Node Version Manager (NVM)</h3>
<p>It is the most popular solution for node version management (at least by GitHub stars: 75.2k). The reason is its early appearance: it was one of the first, if not the first, Node.js version managers, and it gained huge popularity in the community.</p>
<p>Is it cross-platform? Not really. It doesn’t have full Windows support. It works in some cases like <a target="_blank" href="https://gitforwindows.org/">GitBash</a> (MSYS), <a target="_blank" href="https://cygwin.com/">Cygwin</a>, and WSL (Windows Subsystem for Linux). There is a separate package for Windows called <a target="_blank" href="https://github.com/coreybutler/nvm-windows">nvm-windows</a>, but it is not NVM itself.</p>
<p>Another limitation is that it supports POSIX shells only, such as bash or zsh, which leaves users of other shells, like <a target="_blank" href="https://fishshell.com/">Fish</a>, without support.</p>
<p>The most straightforward way to install NVM is to run the following command.</p>
<pre><code class="lang-bash">curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
</code></pre>
<p>Here is how you use NVM to switch between different Node.js versions.</p>
<pre><code class="lang-bash">user@machine:~/project node -v
v21.7.2
user@machine:~/project cat .nvmrc
18.19.1
user@machine:~/project nvm use
user@machine:~/project node -v
v18.19.1
</code></pre>
<p>NVM can understand which version of Node.js to use through the <code>.nvmrc</code> file. You must either create one before switching between versions or explicitly declare the Node.js version you want to switch to, e.g. <code>nvm use 18.10</code>.</p>
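<p>Creating the file is a one-liner (the version here is just an example):</p>

```bash
# pin the project's Node.js version for NVM
echo "18.19.1" > .nvmrc
nvm use   # reads .nvmrc from the current directory
```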
<p>Notice that running the <code>nvm use</code> command sets the Node.js version for the <strong>current shell</strong>. What does this mean? Even if you leave the project folder and navigate to another project, the Node.js version will stay the same until you rerun <code>nvm use</code>.</p>
<p>It adds more friction to your workflow and creates a greater cognitive load because you must always be aware of what Node.js version your current shell uses and what is required for a particular project.</p>
<p>It is still better than manually managing all possible Node.js versions, but far from seamless integration.</p>
<h3 id="heading-n">N</h3>
<p>N is another popular Node.js version manager (18.5k GitHub stars).</p>
<p>It is not cross-platform and has even more limitations than NVM. It does not work in native shells on Microsoft Windows (like PowerShell), Git for Windows Bash, or with the Cygwin DLL.</p>
<p>N can be installed directly from NPM: run <code>npm install -g n</code>. It can also be installed with Homebrew on macOS or via an install script.</p>
<pre><code class="lang-bash">curl -L https://bit.ly/n-install | bash
</code></pre>
<p>One of the big benefits of using N is its ability to detect Node versions directly from the “engines” section of <code>package.json</code>. Given the following structure:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"project"</span>,
  <span class="hljs-attr">"version"</span>: <span class="hljs-string">"1.0.0"</span>,
  <span class="hljs-attr">"main"</span>: <span class="hljs-string">"index.js"</span>,
  <span class="hljs-attr">"engines"</span>: {
    <span class="hljs-attr">"node"</span>: <span class="hljs-string">"18.17.0"</span>
  },
  <span class="hljs-attr">"scripts"</span>: {
    <span class="hljs-attr">"test"</span>: <span class="hljs-string">"echo \"Error: no test specified\" &amp;&amp; exit 1"</span>
  },
  <span class="hljs-attr">"keywords"</span>: [],
  <span class="hljs-attr">"license"</span>: <span class="hljs-string">"ISC"</span>
}
</code></pre>
<p>N will install “18.17.0” as you specified in the engines section.</p>
<p>However, N suffers from a similar problem to NVM. If you want to use the exact Node.js version for different projects, you have to keep track of it yourself.</p>
<p>Moreover, N takes this problem to a whole new level. Using N means managing a “global” Node.js version. Even after closing a shell, you’re left with the Node version you used for the latest project — not the best experience.</p>
<h3 id="heading-fast-node-manager-fnm">Fast Node Manager (FNM)</h3>
<p>FNM is a node version manager written in Rust. It is nearly as popular as N (15.2k GitHub stars).</p>
<p>FNM is the first cross-platform node version manager on the list. It runs on Windows without the need to install any other packages.</p>
<p>The installation process is clear and intuitive.</p>
<pre><code class="lang-bash"><span class="hljs-comment"># On macOS using Homebrew</span>
brew install fnm

<span class="hljs-comment"># Using the Rust package manager cargo</span>
cargo install fnm

<span class="hljs-comment"># On Windows using winget</span>
winget install Schniz.fnm
</code></pre>
<p>Or using bash script for Unix-based OS.</p>
<pre><code class="lang-bash">curl -fsSL https://fnm.vercel.app/install | bash
</code></pre>
<p>FNM manages the Node.js version per shell and doesn’t build its main workflow around global versioning like N does. It has a “default” version, which is global and acts as a fallback when a project doesn’t specify a Node.js version.</p>
<p>The other cool feature of FNM is the auto-switching of Node.js version based on the folder you’re in, but you have to do some configuration for it.</p>
<p>Auto-switching works in the following way: If you go from one project that uses the 18.17.0 version to a different one with 20.12.1, FNM automatically switches Node.js version after you navigate into the new project folder.</p>
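<p>The configuration in question is a shell hook. As far as I know from FNM’s documentation, enabling auto-switching looks something like this for zsh or bash (other shells differ):</p>

```bash
# ~/.zshrc or ~/.bashrc — initialize fnm and switch versions on cd
eval "$(fnm env --use-on-cd)"
```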
<pre><code class="lang-bash">user@machine:~ node -v
v21.7.2
user@machine:~ <span class="hljs-built_in">cd</span> project-1
Using Node.js v18.17.0
user@machine:~/project-1 cat .node-version
18.17.0
user@machine:~/project-1 <span class="hljs-built_in">cd</span> ..
user@machine:~ node -v
v18.17.0
user@machine:~ <span class="hljs-built_in">cd</span> project-2
Using Node.js v20.12.1
user@machine:~/project-2 cat .node-version
v20.12.1
</code></pre>
<p>We switched between two projects, and the Node version automatically changed based on the <code>.node-version</code> file inside those projects.</p>
<p>You must have the required Node.js versions installed on the machine for auto-switching to work properly.</p>
<p>The other thing to remember is that it can detect Node versions only from extra files you create in a project: <code>.node-version</code> or <code>.nvmrc</code>.</p>
<h3 id="heading-volta">Volta</h3>
<p>Volta is a rising star in the world of version managers (10k stars on GitHub).</p>
<p>It is written in Rust, and it is cross-platform.</p>
<p>The installation process is seamless for Unix-based systems.</p>
<pre><code class="lang-bash">curl https://get.volta.sh | bash
</code></pre>
<p>Windows has a separate installer.</p>
<p>When you configure Volta's Node.js version, you do not need to create extra files: Volta stores its configuration in <code>package.json</code>, under a dedicated <code>volta</code> key.</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"project"</span>,
  <span class="hljs-attr">"version"</span>: <span class="hljs-string">"1.0.0"</span>,
  <span class="hljs-attr">"main"</span>: <span class="hljs-string">"index.js"</span>,
  <span class="hljs-attr">"engines"</span>: {
    <span class="hljs-attr">"node"</span>: <span class="hljs-string">"18.17.0"</span>
  },
  <span class="hljs-attr">"volta"</span>: {
    <span class="hljs-attr">"node"</span>: <span class="hljs-string">"18.17.0"</span>
  }
}
</code></pre>
<p>The benefit of such a configuration is that the <code>engines</code> section is right next to the Volta configuration. This allows you to keep them in sync effortlessly. When placed in a separate file, it is easy to forget to sync those versions.</p>
<p>Another huge feature is the management of a toolchain. What does it mean?</p>
<p>Imagine that you’re using Yarn as a package manager. Other Node.js version managers can manage only Node.js versions. At the same time, Yarn version can change from project to project.</p>
<p>That is where Volta shines. You can dynamically switch not only the Node.js version but the Yarn version as well. Just add a Yarn version under the “volta” configuration section.</p>
<pre><code class="lang-json">  <span class="hljs-string">"volta"</span>: {
    <span class="hljs-attr">"node"</span>: <span class="hljs-string">"18.17.0"</span>,
    <span class="hljs-attr">"yarn"</span>: <span class="hljs-string">"1.22.22"</span>
  }
</code></pre>
<p>Whenever you run the install command, Volta ensures that the Node.js and Yarn versions match the declared ones. Isn’t it magical?</p>
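<p>You don’t even have to edit <code>package.json</code> by hand: Volta’s <code>pin</code> command writes that section for you (the versions are examples):</p>

```bash
volta pin node@18.17.0
volta pin yarn@1.22.22
```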
<h3 id="heading-pnpm">PNPM</h3>
<p>Don’t be surprised. PNPM is usually perceived as an alternative to package managers like NPM and Yarn. However, unlike those, PNPM can also manage the Node.js version itself.</p>
<p>PNPM is cross-platform and provides the same Node.js version management experience throughout all platforms.</p>
<p>However, there are four downsides to using PNPM as a Node version manager.</p>
<p>The first one is that PNPM is not a Node version manager at its core. It is a package manager that can manage the Node.js version. You can’t easily use it with other package managers like NPM or Yarn.</p>
<p>The second one is that Node.js installed using PNPM is not shipped with Corepack. Here is the note from <a target="_blank" href="https://pnpm.io/cli/env">the documentation</a>:</p>
<p><em>“pnpm env does not include the binaries for Corepack. If you want to use Corepack to install other package managers, you need to install it separately (e.g. <code>pnpm add -g corepack</code>).”</em></p>
<p>The third one is that PNPM can only manage the Node.js version globally. You can’t configure it per shell. If you try to run it without the <code>--global</code> flag, you get the following error message:</p>
<p><strong><em>"pnpm env use &lt;version&gt;" can only be used with the "--global" option currently</em></strong></p>
<p>The fourth one is that PNPM doesn’t switch the Node.js version dynamically as you navigate from project to project. This means you have to track it all yourself and ensure it matches the version required for a project.</p>
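<p>For completeness, this is what the global-only workflow looks like (the version is an example):</p>

```bash
pnpm env use --global 18.17.0   # installs and activates Node.js 18.17.0
pnpm env list                   # shows Node.js versions installed by pnpm
```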
<h2 id="heading-conclusion">Conclusion</h2>
<p>Node.js version managers have come a long way. NVM was the first and most popular solution for quite a long time, and it remains the most popular today.</p>
<p>But the ecosystem is evolving. Over time, different tools, such as N, FNM, and Volta, emerged. Each has pros and cons.</p>
<p>At this point, Volta seems to be the most feature-rich and complete Node.js version manager. It is cross-platform, provides a seamless day-to-day experience, and takes care of the other tools you use on a project.</p>
]]></content:encoded></item><item><title><![CDATA[The Ultimate Guide to Cron Jobs in Node.js]]></title><description><![CDATA[One of the most common features of each application is task scheduling. For example, an application needs to send email reports about some operations once every 8 hours to a certain number of users.
While it is a common task, people in the Node.js co...]]></description><link>https://pavel-romanov.com/the-ultimate-guide-to-cron-jobs-in-nodejs</link><guid isPermaLink="true">https://pavel-romanov.com/the-ultimate-guide-to-cron-jobs-in-nodejs</guid><category><![CDATA[Node.js]]></category><category><![CDATA[cronjob]]></category><category><![CDATA[bullmq]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Tue, 30 Apr 2024 12:15:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714479234049/ba24b819-d6ce-427e-967f-f99d1a5ba153.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>One of the most common features of each application is task scheduling. For example, an application needs to send email reports about some operations once every 8 hours to a certain number of users.</p>
<p>While it is a common task, people in the Node.js community still get confused about how to implement it, what options we have to create and manage those tasks, and the pros and cons of each option.</p>
<p>In this article, we’ll answer all those questions, giving you a clear picture of the tools you have at your disposal.</p>
<h2 id="heading-terminology">Terminology</h2>
<p>The world of task scheduling uses different terms that basically mean the same thing: jobs or tasks that the application performs on a schedule.</p>
<p>Many people call those cron jobs. However, this term can cause more confusion, especially if you’re just starting your journey on the backend.</p>
<p>For example, cron is a UNIX-based scheduler. What does it have to do with your application if you're running it on a Windows server?</p>
<p>A better term for it would be “scheduled tasks.” The name itself is crystal clear; you don’t have to second-guess what it stands for.</p>
<p>This term will be used throughout the rest of the article.</p>
<h2 id="heading-factors-to-consider-before-choosing-a-task-scheduling-approach">Factors to consider before choosing a task scheduling approach</h2>
<p>Before exploring the different tools and approaches, it's important to understand your application's specific requirements. Considering the following factors will help you to make an informed decision.</p>
<ul>
<li><p>Application and infrastructure scale</p>
</li>
<li><p>High-frequency tasks</p>
</li>
<li><p>Long-running tasks</p>
</li>
<li><p>Task stacking</p>
</li>
</ul>
<p>Let's explore each of these factors in more detail.</p>
<h3 id="heading-application-and-infrastructure-scale">Application and infrastructure scale</h3>
<p>You have to understand the scale of your application and underlying infrastructure. It’ll allow you to make a reasonable decision on what approach to choose.</p>
<p>To name a few cases:</p>
<ul>
<li><p>A single instance of application is running on a single server</p>
</li>
<li><p>Multiple instances of application are running on a single server</p>
</li>
<li><p>Multiple instances of applications are running on multiple servers</p>
</li>
</ul>
<p>It is not limited only to those 3. You might have a different setup and infrastructure configuration. The point is that you have to understand it before deciding on the task scheduling approach.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714477369297/1828742e-568b-4847-9323-5a6293e2a8c6.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-high-frequency-tasks">High-frequency tasks</h3>
<p>If you need to run tasks at intervals shorter than 1 minute, your options will be more limited. Some scheduling solutions don't support high-frequency tasks out of the box at all or require additional workarounds.</p>
<p>Consider the minimum interval between tasks that your application requires, and ensure that the chosen solution fits those needs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714477405120/19d2d4a7-7077-4dea-8d7e-c3d023181202.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-long-running-tasks">Long-running tasks</h3>
<p>Long-running tasks introduce their own set of challenges. These tasks can exacerbate issues like memory leaks, which may not be apparent in applications without long-running processes. When implementing long-running tasks, consider the following challenges:</p>
<ul>
<li><p><strong>Debugging complexity</strong>: Long-running tasks can be harder to debug due to their extended runtime and potential interactions with other parts of the system.</p>
</li>
<li><p><strong>Maintenance and updates:</strong> Careful planning is required when performing maintenance or updates on applications with long-running tasks to ensure minimal disruption and proper handling of in-progress tasks.</p>
</li>
<li><p><strong>Resource management</strong>: Long-running tasks consume system resources over an extended period. Proper resource management, such as memory and CPU usage, is crucial to avoid performance degradation and memory leaks.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714477452079/352dd3b6-8f06-4853-8e1e-a1a3d865a107.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-tasks-stacking">Tasks stacking</h3>
<p>Task stacking occurs when a new task is started before the previous one has been completed. This can happen when the scheduled interval is shorter than the task's execution time.</p>
<p>Task stacking can lead to resource contention, performance degradation, and unexpected behavior. When selecting a scheduling solution, consider how it handles task stacking and whether it provides mechanisms to prevent or manage such situations.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714477487651/1b3787c0-89ef-462f-8366-07fbfb488692.jpeg" alt class="image--center mx-auto" /></p>
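<p>One common way to deal with stacking (a hypothetical sketch, not taken from any particular library) is an in-flight guard: a tick is simply skipped while the previous run is still executing.</p>

```javascript
// Hypothetical guard against task stacking: skip a tick if the
// previous run hasn't finished yet.
let running = false;

async function safeRun(task) {
  if (running) return false; // previous run still in progress: skip this tick
  running = true;
  try {
    await task();
    return true;
  } finally {
    running = false;
  }
}
```

A scheduler would call <code>safeRun(task)</code> on every tick; overlapping ticks return <code>false</code> instead of piling up concurrent runs.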
<h2 id="heading-tasks-scheduling-approaches">Tasks scheduling approaches</h2>
<p>While all scheduled tasks share the common characteristics of having a specified execution time and associated code, they differ in how and where they are scheduled and managed. In this section, we'll explore the various approaches to scheduling and managing tasks, including:</p>
<ul>
<li><p>UNIX-based scheduling with cron</p>
</li>
<li><p>Runtime scheduling</p>
</li>
<li><p>Runtime scheduling with persistence</p>
</li>
<li><p>Cloud-based scheduling solutions</p>
</li>
</ul>
<p>Each approach has its own pros and cons, which we'll discuss in detail.</p>
<h3 id="heading-unix-based-scheduling-with-cron">UNIX-based scheduling with cron</h3>
<p>Cron is a long-standing and widely used scheduler in UNIX-based systems. It allows you to schedule tasks, known as “cron jobs,” to run at specific time intervals.</p>
<p>The interface through which you schedule tasks to cron is called crontab (cron table). Crontab uses a <a target="_blank" href="https://en.wikipedia.org/wiki/Cron">specific syntax</a> to define the schedule.</p>
<p>Cron runs a cron daemon, a single process responsible for managing all the scheduled tasks. The daemon checks the schedule once per minute. When a task is due to run, cron creates a separate process for that task and executes it.</p>
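<p>For example, the 8-hour email report from the introduction could be scheduled with a crontab entry like this (the script path is made up for illustration):</p>

```bash
# minute  hour  day-of-month  month  day-of-week  command
0 */8 * * * /usr/bin/node /app/send-reports.js
```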
<p><strong>Pros:</strong></p>
<ul>
<li><p><strong><em>Flexibility</em>:</strong> Cron is a versatile tool for running various types of scripts and programs on a scheduled basis.</p>
</li>
<li><p><strong><em>Process isolation</em>:</strong> Each cron job runs in its own separate process, providing a level of isolation and minimizing the impact of one task on others.</p>
</li>
<li><p><strong><em>Reliability</em>:</strong> Cron has been around since 1975 and has proven to be a reliable and stable solution for task scheduling.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p><strong><em>Single machine limitation</em>:</strong> By default, cron is limited to a single machine, which can make it harder to scale for larger workloads.</p>
</li>
<li><p><strong><em>Task stacking</em>:</strong> If a task takes longer to execute than its scheduled interval, multiple instances of the task may start running concurrently, leading to resource contention. Cron doesn't have built-in mechanisms to prevent this.</p>
</li>
<li><p><strong><em>Limited fault tolerance</em>:</strong> Cron doesn't provide built-in error handling functionality. If a task fails, cron won’t automatically retry it. You need to implement your own error handling and retry mechanisms.</p>
</li>
</ul>
<h3 id="heading-runtime-scheduling">Runtime scheduling</h3>
<p>Runtime scheduling refers to scheduling tasks directly within the application code. This approach is facilitated by various ready-to-use libraries in the Node.js ecosystem, such as <a target="_blank" href="https://www.npmjs.com/package/node-cron">node-cron</a>, <a target="_blank" href="https://www.npmjs.com/package/cron">cron</a>, <a target="_blank" href="https://www.npmjs.com/package/croner">croner</a>, and others. While these libraries may have different implementation details, they all operate on the same fundamental principle.</p>
<p>Under the hood, these libraries typically use <code>setTimeout</code> or <code>setInterval</code> functions to schedule tasks. When your application starts, the scheduled jobs are initialized and set to run at their specified intervals.</p>
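<p>The principle can be sketched in a few lines (a simplified illustration, not any particular library&rsquo;s actual code &mdash; <code>msUntilNextInterval</code> and <code>scheduleEvery</code> are hypothetical helpers):</p>

```javascript
// Minimal sketch of how runtime schedulers work under the hood:
// compute the delay until the next due time, then re-arm a timer.
function msUntilNextInterval(intervalMs, now = Date.now()) {
  // Time remaining until the next interval boundary.
  return intervalMs - (now % intervalMs);
}

function scheduleEvery(intervalMs, task) {
  return setTimeout(() => {
    task();
    scheduleEvery(intervalMs, task); // re-arm after each run
  }, msUntilNextInterval(intervalMs));
}

// Example: aligned to minute boundaries, like a "* * * * *" cron expression.
// scheduleEvery(60_000, () => console.log('tick'));
```

<p>Because the timers live inside the process, they disappear together with it &mdash; which is exactly where the trade-offs below come from.</p>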
<p><strong>Pros:</strong></p>
<ul>
<li><p><strong><em>Simplicity</em>:</strong> Runtime scheduling is easy to set up and get running, as it doesn't require any additional infrastructure or external dependencies.</p>
</li>
<li><p><strong><em>Quick development</em>:</strong> For simple use cases or prototypes, runtime scheduling allows you to implement task scheduling quickly without the overhead of more complex solutions.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p><strong><em>Resource contention</em>:</strong> Not all solutions execute tasks in a separate process/thread. This means that if you run CPU-intensive logic inside a task, you have to be aware of particular implementation details to ensure that those tasks won’t block the main thread.</p>
</li>
<li><p><strong><em>Deployment challenges</em>:</strong> When deploying a new version of the application, all scheduled tasks are rescheduled because the timers run inside the same process as the application code (unless you separate them into a dedicated, independent process). This results in delayed task execution.</p>
</li>
<li><p><strong><em>Scalability limitations</em>:</strong> As the application scales beyond a single instance, managing runtime-scheduled tasks becomes increasingly difficult. Each application instance runs the same code and schedules the same tasks, leading to duplication and conflicts.</p>
</li>
<li><p><strong><em>Task stacking</em>:</strong> If a scheduled task takes longer to execute than its specified interval, multiple instances of the task may start running concurrently. This can result in resource contention and unexpected behavior, especially for long-running tasks.</p>
</li>
</ul>
<p>Runtime scheduling can be a good fit for simple applications or prototypes where you want to test and visualize scheduled jobs quickly. However, it may not be the most suitable approach for production-grade applications that require scalability, reliability, and efficient resource utilization.</p>
<p>In this case, both task data and timers reside within the application.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714478016202/b5a76dac-064f-4fe1-874d-e8e89b7a8404.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-runtime-scheduling-with-persistence">Runtime scheduling with persistence</h3>
<p>Runtime scheduling with persistence builds upon the basic runtime scheduling approach and introduces a persistence mechanism to store and manage scheduled tasks.</p>
<p>By incorporating a persistence layer, the application can store information about scheduled tasks in a database or a persistent storage system. This allows the application to track and recover tasks even after a restart or a new deployment, ensuring that tasks are executed on time.</p>
<p>Two notable libraries in the Node.js ecosystem provide runtime scheduling with persistence: Bull (or BullMQ) and Agenda.</p>
<p>Bull is a popular library that uses Redis, an in-memory data store, as its persistence mechanism.</p>
<p>Agenda is another library that provides runtime scheduling with persistence using MongoDB. It's important to note that Agenda may not be actively maintained, with the last major release dating back to November 2022 (as of May 2024).</p>
<p>Unlike Agenda, Bull is actively maintained. The downside of Bull is that it relies on Redis, an in-memory data store: if the Redis server goes down without persistence configured, the jobs are lost.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p><strong><em>Persistence</em>:</strong> Runtime scheduling with persistence ensures that scheduled tasks are not lost during application restarts or deployments. The persistence mechanism allows you to recover and resume tasks from where they left off.</p>
</li>
<li><p><strong><em>Scalability</em>:</strong> Abstracting task management from the main application and storing tasks in a separate persistence layer makes it easier to scale the application in the future. Multiple application instances can share the same persistence layer and coordinate task execution.</p>
</li>
<li><p><strong><em>No task stacking</em>:</strong> When using Bull, the queue-based structure ensures that tasks are executed in a reliable and orderly manner. New tasks are only started after the previous ones are completed, preventing all problems related to task stacking.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p><strong><em>Complexity</em>:</strong> Implementing runtime scheduling with persistence requires additional setup, configuration, and learning compared to basic runtime scheduling. You need to set up and manage the persistence layer (e.g., Redis or MongoDB) and integrate it with your application.</p>
</li>
<li><p><strong><em>Dependency on external systems</em>:</strong> Relying on external systems like Redis or MongoDB introduces additional points of failure. If the persistence layer experiences issues or downtime, it can affect the task scheduling workflow.</p>
</li>
</ul>
<p>Runtime scheduling with persistence offers a more robust and reliable approach compared to basic runtime scheduling. It addresses the issue of losing scheduled tasks during restarts and deployments and provides better scalability options. However, it also introduces additional complexity and dependencies on external systems.</p>
<p>With this approach, we move task information to the persistence layer. The application is now only responsible for running timers and executing tasks themselves (in case we still run everything in a single application instance).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714478321191/972ee834-a7d5-4367-91dd-382b5933fed1.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-cloud-based-scheduling-solutions">Cloud-based scheduling solutions</h3>
<p>The other option is to move to the cloud. In this case, you don’t need to use any of the previous solutions to create, schedule, and persist tasks.</p>
<p>Popular cloud providers, such as Amazon Web Services (AWS) and Google Cloud Platform (GCP), offer dedicated services for scheduling tasks. For example, AWS provides the EventBridge Scheduler, while GCP offers the Cloud Scheduler.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p><strong><em>Decoupled architecture</em>:</strong> Cloud-based scheduling solutions decouple the scheduling logic from your application code. This separation of concerns makes it easier to scale and maintain your application independently of the scheduling infrastructure.</p>
</li>
<li><p><strong><em>Scalability and reliability</em>:</strong> Cloud providers offer highly scalable and reliable scheduling services. They handle the underlying infrastructure, ensuring that tasks are executed on time and with high availability. You don't need to worry about managing the scheduling infrastructure yourself.</p>
</li>
<li><p><strong><em>Flexibility and integration</em>:</strong> Cloud-based scheduling solutions often provide flexible scheduling options, such as cron-based scheduling or more advanced scheduling patterns. They also integrate well with other cloud services, allowing you to build complex workflows and data pipelines.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p><strong><em>Learning curve</em>:</strong> Cloud-based scheduling solutions require an understanding of a specific cloud platform and its services.</p>
</li>
<li><p><strong><em>Cost considerations</em>:</strong> While cloud-based scheduling solutions offer scalability and convenience, they come with associated costs. As your usage grows, the cost of using these services may increase. It's important to carefully evaluate the pricing models and estimate the long-term costs based on your application's needs.</p>
</li>
<li><p><strong><em>Complexity in code sharing</em>:</strong> If your application relies on code sharing between the main application and the scheduled tasks, using a cloud-based scheduling solution makes it increasingly harder to share the code. You may need to package and deploy your task code separately from your main application, which can require additional configuration and deployment processes.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714478674998/d671e114-e16d-486c-b976-47f52ebf7225.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In the world of Node.js task scheduling, there is no one-size-fits-all solution.</p>
<p>When deciding on a task scheduling approach for your application, it's critical to evaluate your specific needs and trade-offs carefully. Consider the scalability and reliability requirements, the complexity of setup and maintenance, and the potential costs involved.</p>
<p>Be open to exploring different options, adapting as your needs evolve, and finding the solution that best fits your application's goals and constraints.</p>
]]></content:encoded></item><item><title><![CDATA[Resource management in Node.js: the good, the bad and the worst]]></title><description><![CDATA[In the previous article on resource management in Node.js, we covered the options available to manage resources. However, the previous article only provides a general overview.
In this article, we’ll see the pros and cons of using each of them. Spoil...]]></description><link>https://pavel-romanov.com/resource-management-in-nodejs-the-good-the-bad-and-the-worst</link><guid isPermaLink="true">https://pavel-romanov.com/resource-management-in-nodejs-the-good-the-bad-and-the-worst</guid><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Pavel Romanov]]></dc:creator><pubDate>Sun, 21 Apr 2024 12:15:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713701588101/b09af24a-0fa1-4682-b9dc-b7779a2cc74e.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In <a target="_blank" href="https://pavel-romanov.com/node-resource-management">the previous article</a> on resource management in Node.js, we covered the options available to manage resources. However, the previous article only provides a general overview.</p>
<p>In this article, we’ll see the pros and cons of using each of them. Spoiler: some of the options don’t make much sense to use.</p>
<h2 id="heading-nodejs-cli-options">Node.js CLI options</h2>
<p>The Node.js CLI has two options for managing heap sizes: <code>--max-old-space-size</code> and <code>--max-semi-space-size</code>.</p>
<h3 id="heading-trade-offs">Trade-offs</h3>
<p>Those options cannot regulate everything. Here are just a few cases where CLI options won’t work for you.</p>
<p><strong>Spawned process.</strong> Whenever you spawn a new process via <code>child_process.spawn</code>, you’re creating a new V8 instance alongside it. The child doesn’t automatically inherit the options from the parent process. Sure, you can pass them manually, but then you have to track each process and its memory consumption in the system.</p>
<p><strong>Addons.</strong> Addons are not directly related to JavaScript. In fact, they are C++ libraries accessible through JavaScript. This means the memory allocated inside those libraries is not subject to V8's heap restrictions.</p>
<p><strong>I/O operations.</strong> I/O operations are handled by the libuv library and it means we’re facing C++ again. And not only that. When reading a large file, the result is passed as a buffer into the callback:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">import</span> fs <span class="hljs-keyword">from</span> <span class="hljs-string">'node:fs'</span>;

<span class="hljs-comment">// The data is Buffer object.</span>
fs.readFile(<span class="hljs-string">'large-file.txt'</span>, <span class="hljs-function">(<span class="hljs-params">err, data</span>) =&gt;</span> {
  <span class="hljs-keyword">if</span> (err) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error reading file:'</span>, err);
  } <span class="hljs-keyword">else</span> {
    <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'File contents:'</span>, data.toString());
  }
});
</code></pre>
<p>Even when we have the buffer instance inside of JavaScript, the memory allocated for this buffer resides <a target="_blank" href="https://github.com/nodejs/node/issues/3012">outside of the V8 heap</a> and is, therefore, unaffected by the options.</p>
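<p>A quick way to observe this is to compare <code>process.memoryUsage()</code> before and after allocating a large buffer:</p>

```javascript
// Buffer memory lives outside the V8 heap: allocating a large Buffer
// grows the `arrayBuffers`/`external` counters, not `heapUsed`.
const before = process.memoryUsage();
const buf = Buffer.alloc(50 * 1024 * 1024); // 50 MB allocated off-heap

const after = process.memoryUsage();
console.log('arrayBuffers delta (MB):',
  ((after.arrayBuffers - before.arrayBuffers) / 1048576).toFixed(1));
console.log('buffer length:', buf.length); // keep the buffer referenced
```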
<h3 id="heading-when-to-use">When to use</h3>
<p>Use these options when you want to apply limits to the JavaScript-related memory consumption of a single process.</p>
<p>They also make garbage collection more efficient: as memory consumption approaches the limit you provided, the garbage collector runs more frequently, resulting in lower memory usage.</p>
<h2 id="heading-pm2-process-manager">PM2 process manager</h2>
<p>PM2 is a popular JavaScript process manager. It has a feature to restart a certain process based on the memory it consumes. You can achieve this by specifying the <code>max_memory_restart</code> option.</p>
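<p>For illustration, a minimal PM2 ecosystem file using this option (the app name and entry point are hypothetical):</p>

```javascript
// ecosystem.config.js — restart the app once it exceeds 300 MB.
module.exports = {
  apps: [
    {
      name: 'api',           // hypothetical app name
      script: './server.js', // hypothetical entry point
      max_memory_restart: '300M',
    },
  ],
};
```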
<h3 id="heading-trade-offs-1">Trade-offs</h3>
<p>The first and most significant trade-off is how PM2 monitors memory consumption: a separate worker process checks memory usage every 30 seconds.</p>
<p>This means <strong>it takes up to 30 seconds</strong> for the process that ran out of memory to be detected and restarted. These 30 seconds can cost a lot, up to the whole server crash.</p>
<p>The other is inherited problems that come with directly using PM2, such as:</p>
<ul>
<li><p>Overselling clustering that leads to cumbersome workflow.</p>
</li>
<li><p>Attempts to go beyond a simple process manager that deliver poorly compared to the alternatives.</p>
</li>
<li><p>Licensing issues.</p>
</li>
</ul>
<p>I’ll soon publish an article that goes deep into the PM2 problems.</p>
<h3 id="heading-when-to-use-1">When to use</h3>
<p>To be honest, I don’t see any good reason to use PM2 at this point except only one specific use-case that I’ll describe in the upcoming article.</p>
<h2 id="heading-user-limits">User limits</h2>
<p>User limits are a way to restrict resource usage per user in a UNIX-based system.</p>
<h3 id="heading-trade-offs-2">Trade-offs</h3>
<p>It is hard to scale. This type of limitation is especially challenging to scale if we want fine-grained control over specific processes or groups of processes.</p>
<p>Managing a large number of users, each with its own limits, quickly becomes complex.</p>
<h3 id="heading-when-to-use-2">When to use</h3>
<p>It is an excellent option to use as a second line of defense. For example, if you have some way of managing resource consumption of a process or process group but want to be sure that if something goes wrong, you have a backup plan.</p>
<h2 id="heading-control-groups">Control groups</h2>
<p>Control groups (cgroups) manage resource consumption at the system level and are similar to user limits in their focus on resource management.</p>
<h3 id="heading-trade-offs-3">Trade-offs</h3>
<p>The only major problem I see with control groups is the configuration process. You might rightly say “skill issues.” However, the lack of clear interfaces through which we can configure them, like a single configuration file that resides next to the application code, makes them hard to manage.</p>
<p>One more issue could be a lack of complete isolation. With control groups, you’re still pretty much working on the same machine, with the shared file system, network, and any other resources.</p>
<h3 id="heading-when-to-use-3">When to use</h3>
<p>If you have enough skill and understanding of how they work, you can use them whenever you want. They are flexible enough to deliver on most of the tasks related to resource management.</p>
<h2 id="heading-containers">Containers</h2>
<p>A container is an abstraction that goes further than control groups. It allocates resources for a group of processes and almost completely isolates them in terms of file system, network, and other resources.</p>
<h3 id="heading-trade-offs-4">Trade-offs</h3>
<p>The main trade-off of containers is the abstraction itself. It adds more complexity to the whole workflow.</p>
<p>Such complexity results in:</p>
<ul>
<li><p>Higher resource usage. Creating a container requires more resources than making a simple control group.</p>
</li>
<li><p>Performance overhead. The extra resource consumption can translate into performance problems for high-performance applications.</p>
</li>
<li><p>Learning curve.</p>
</li>
</ul>
<h3 id="heading-when-to-use-4">When to use</h3>
<p>Despite container trade-offs in particular cases, it is still the best solution for resource management we have so far. Here are just a few benefits that you get by using them:</p>
<ul>
<li><p>Granular control. Containers strictly limit resource usage (using control groups under the hood). Each container is isolated from the others, making it harder to break things.</p>
</li>
<li><p>Better tooling. Tooling around containers allows you to configure them for specific needs easily. Moreover, if you’re using an IDE with tools like Docker, it will suggest commands you can use and highlight the ones you can’t, making the experience even better.</p>
</li>
<li><p>Great isolation. Containers provide strong isolation for the applications running inside them.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>There are many options for resource management of Node.js applications. It all comes down to understanding the trade-offs of each approach and your specific needs.</p>
<p>In general, I would stick with containers whenever possible. Even if you don't know much about them, it is a great opportunity to learn.</p>
]]></content:encoded></item></channel></rss>