Serializing promises and other asynchronous wonders of ECMAScript

Published on Wednesday, September 20th 2017 at 21:15
Last updated on Thursday, March 21st 2024 at 15:58

We all know it: ECMAScript is ascending to the Olympus of development at an impressive pace. Many could argue that the rise has reached its peak and that a fall is therefore inevitable, but that’s just how everything works. This language, together with all its ancillary technologies, is - hands down - the most used development stack in the world, by quite a long shot. You can see it for yourself just by looking at this survey from Q1/2017, where JavaScript beats everything else in popularity both on GitHub and on Stack Overflow.

With the growing tide of success, more and more people are using the language for the most diverse purposes. Everyone and their grandmas know JS, even if only a tiny bit. Nonetheless, JS ships with a quite peculiar programming paradigm due to its intrinsically asynchronous nature. Its greatest strength is also the largest source of concern for the majority of its users. While humans appreciate asynchronicity, they do not reason with such a concept in mind all the time, which makes it exceptionally easy to make trivial but harmful mistakes. In particular, one of the most common issues I deal with every day at my workplace is the promise serialization problem.

From callbacks to promises to async/await

I am aware there are a lot of articles [1, 2] around that talk about the differences between callbacks, promises and async/await. So I’ll just do a quick roundup to refresh your memory on the topic and then proceed to the matter at hand, i.e. serialization of asynchronous tasks.

In 2009 we had callbacks. When an asynchronous task is invoked, the implementer may decide to provide a callback parameter in the function signature. A callback is simply a function that will be called by the library method once the task has been completed. So, you would typically have code similar to this:

Example of callbacks from a video crawler

request(videoUrl, (error, response, body) => {
    if (error)
        return console.error('Cannot download YouTube video description:', error);

    const $ = cheerio.load(body);
    const $watchDescription = $('#watch-description');
    const videoDescription = $watchDescription.text();

    if (!videoDescription)
        return console.error('Cannot find video description on YouTube: maybe the page structure changed?');

    let match;

    while (match = TRACK_REGEXP.exec(videoDescription)) {
        let position = match[1];
        let title = match[3].replace(/\s+/, ' ');
        videoTracks.push({ position, title });
    }

    console.log('Found', videoTracks.length, 'tracks inside this video');
});

Of course, the more logical dependencies are in place among the various bits of your code, the more you will need those tasks to be serialized, i.e. to be executed in a specific order. Serialization in the callback world is achieved through invocation nesting, which is just a fancier and more refined term for the notorious callback hell. Indeed, you have to wait for an activity to terminate before hopping onto the next one; this implies that the code tasked with starting a logically dependent action must reside within the asynchronous callback. You may have to repeat the same process over and over again, for as long as the dependency chain goes, ending up with an ever-deepening pyramid of nested callbacks.
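To give an idea of what that nesting looks like, here is a minimal sketch. The three fetch functions are hypothetical stand-ins I made up for this illustration (they just fake some latency with setTimeout); only the shape of the code matters.

```javascript
'use strict';

// Hypothetical asynchronous steps using Node-style (error, result) callbacks.
const fetchUser = (id, callback) => setTimeout(() => callback(null, { id, name: 'Ada' }), 10);
const fetchPosts = (user, callback) => setTimeout(() => callback(null, [`${user.name}'s first post`]), 10);
const fetchComments = (post, callback) => setTimeout(() => callback(null, [`Comment on "${post}"`]), 10);

// Each step depends on the previous one, so it has to live inside its callback.
const loadComments = (userId, done) => {
    fetchUser(userId, (error, user) => {
        if (error)
            return done(error);

        fetchPosts(user, (error, posts) => {
            if (error)
                return done(error);

            fetchComments(posts[0], (error, comments) => {
                if (error)
                    return done(error);

                done(null, comments);
            });
        });
    });
};

loadComments(42, (error, comments) => console.log(error || comments));
```

Three dependent steps already cost three levels of indentation, and every level repeats the same error check.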

Around 2011, people noticed these patterns were disturbing. Developers were losing their minds over the terrible nightmares that callback hell inflicted upon them. That was the time promises began to get popular, thanks to the CommonJS Promises/A proposal and Kris Kowal’s Q implementation.

Every time you indent a piece of JavaScript, God kills a kitten.

Promises changed the way developers look at asynchronicity, from the implementation’s perspective. A promise is an object that wraps a task implementation in an asynchronous envelope: when it is created, the task itself starts and it promises you that its code will be executed and eventually come to a halt, either with a success or with a failure. Having a handle to the promise object allows you to dictate what needs to be done in case of resolution or rejection.

How a promise is usually created

const promise = new Promise((resolve, reject) => {
    try {
        // Do some hard work...
        const results = doAllTheThings();

        // If the activity succeeded, call:
        resolve(results);
    } catch (error) {
        // If something went wrong, call:
        reject(error);
    }
});

Developers are thus uncertain fortunetellers that try to look beyond the event horizon of the foreseeable future and express what will need to occur in each possible state. Isn’t it a thrilling experience??

Example of promises from the API of a data monitoring tool

app.post('/:namespace/:type/', (req, res) => {
    const object = req.body;
    if (!object)
        // The status code must be set before the response body is sent
        return res.status(400).send({ success: false, message: 'Invalid or missing body' });

    const dist = new ObjectDistribution(req.params.namespace, req.params.type);
    dist.add(object);

    // explain() returns a promise
    dist.explain(object)
        .then((explanation) => {
            const baseline = dist.baseline();
            return { baseline, explanation };
        })
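        .then((bundle) => {
            const analysisResult = checkThresholds(bundle.baseline, object, bundle.explanation);
            res.send({ analysisResult, explanation: bundle.explanation });
        })
        // Without a catch(), a rejection would leave the request hanging
        .catch((error) => res.status(500).send({ success: false, message: error.message }));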
});

I’ll go ahead and skip the generators era because I think that’s essentially a hack: bending a tool used for generating sequences into the cornerstone of asynchronous behaviours… it is not what generators were meant for, so let’s fast-forward to 2017. With the introduction of async/await in the ECMAScript 2017 standard we have a new, nicer way to approach the problem. Well, to be honest, it is not an entirely novel approach, because async/await is just syntactic sugar over existing promises… but I, for one, love SUGAR! And I am confident you love it too, eh? I saw you eating that whole cake last week, don’t pretend you didn’t! The convoluted and bothersome way of handling asynchronous execution now becomes a lot more similar to classical, synchronous, imperative programming, with the new C#-inspired syntax allowing us to write something that closely resembles the familiar, warm and welcoming aegis of sequential code!

Example of async/await from the same tool

async percentageOf (value) {
    const total = await this.total();
    if (total === 0)
        return 0;

    value = this._normalizeValue(value);

    const count = await this.counterStorage.get(this._buildMetricKey(value));
    return count / total;
}

In the example above, both this.total() and counterStorage.get() return promises that resolve when the proper result is obtained: they are wrappers on lower-level Redis calls, which work asynchronously by design. Despite requiring the function body to pause twice and wait for results, I didn’t have to introduce unnecessary scope pollution with then chains or callbacks. The code looks clean and neat, as if the access to external storage services were just a plain old method invocation, except of course for the await keyword.

In order to use async/await, you just need to memorize three rules:

await can only be used inside an async function, i.e. a function (either classic or fat-arrow) declared using the async keyword; since ES2022 it is also allowed at the top level of a module;
await takes a single expression, normally a promise, and suspends the current function until that promise is either resolved or rejected (any other value is simply treated as an already resolved promise). In the first case, it returns the value the promise was resolved with. Otherwise, it throws a normal error that can be caught with a simple try block;
async functions always return a promise. If you return something that isn’t a promise, the ES runtime will wrap it into a promise that resolves with that value. If you don’t return anything, it will simply yield a promise that resolves with undefined.
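The three rules can be seen at work in a small sketch. The functions answer, failing and main are names I made up for this illustration; they are not part of the tool shown above.

```javascript
'use strict';

// Rule 3: an async function always returns a promise, even for plain values.
const answer = async () => 42;

// A hypothetical task that always fails, to exercise the rejection path.
const failing = async () => { throw new Error('nope'); };

const main = async () => {
    // Rules 1 and 2: await is used inside an async function and unwraps the promise.
    const value = await answer();

    let message;
    try {
        await failing();
    } catch (error) {
        // A rejection surfaces as an ordinary exception.
        message = error.message;
    }

    return { value, message };
};

const outcome = main();
console.log(answer() instanceof Promise); // true
outcome.then((result) => console.log(result)); // { value: 42, message: 'nope' }
```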

That is why you have to await an async function (or chain a then() on it) whenever you need its result: calling it only hands you a promise, never the value itself!

Serializing promises

Have you ever wondered where the hell the code goes when it gets swallowed by a promise? After all, ES is single-threaded… and it is also asynchronous! How does that work out? The answer is as simple as it is surprising: procrastination. That’s essentially the core of ES: no wonder that developers are so damn lazy all the time! Both the browser and Node.js maintain an event queue, with a single consumer thread that is constantly looping over it and checking if there is something new to execute. Seen this way, promises and callbacks are essentially “code to be postponed” for later execution. Depending on the technology that you are currently using, the ES “scheduling” service may differ: browsers usually have several different threads dedicated to specific operations, generally I/O, web workers, rendering, etc… Node too has a thread pool with various working units, mainly for I/O. The event queue is slightly different in the two cases, with variations in how and when timeouts, immediates and I/O are handled. However, the concept is essentially the same.

Callbacks are thus stored in the heap and, when the associated asynchronous operation completes outside of the boundary of the ES runtime, the proper callback is unearthed from memory and put in the event queue. The consumer thread then notices that there’s a new guest at the head of the queue, takes it off and finally places it on the stack, where normal execution resumes in the classical fashion. The same goes for timers.
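A tiny experiment makes the postponing visible. This is just a sketch of mine: it relies on the fact that synchronous code always runs to completion first, and that current engines run promise callbacks before timer callbacks.

```javascript
'use strict';

const order = [];

// Queued as a timer: runs on a later turn of the event loop.
setTimeout(() => order.push('timeout'), 0);

// Queued as a promise callback: runs as soon as the current code has finished.
Promise.resolve().then(() => order.push('promise'));

// Plain synchronous code: runs immediately.
order.push('sync');

// Inspect the order once everything above has had time to run.
const settled = new Promise((resolve) => setTimeout(() => resolve(order), 10));
settled.then((result) => console.log(result)); // [ 'sync', 'promise', 'timeout' ]
```

Even with a delay of zero, the timer callback is the last one to run: nothing queued can cut in front of code that is already executing.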

Now, given this scheduling model, it is apparent that determining when a promise will resolve (or when a callback will be invoked) is far from trivial. As a matter of fact, more often than not, you will encounter odd side effects due to objects in function closures mutating their inner state while the code was suspended, waiting for a result. Discerning between immediate and lazy evaluation may also take a good deal of thought. So here are several common patterns used to serialize asynchronous bits of code.

Note: for the following examples, I will provide code compliant with Node.js 6+ and its promise implementation; the async/await examples additionally need Node.js 7.6 or later.

Promise chains

While it is easy to chain several promises with then() when their number is known a priori, I have noticed several times that people struggle when they deal with a collection of promises of arbitrary length. In this scenario, the most immediate thing a developer could do, albeit not the most elegant one, is to chain them one after the other by calling then() and storing its return value into a temporary variable that will be used as a fixed point in the loop. Let me illustrate it with an example:

Utility class we will use to generate promises

'use strict';

/**
 * Creates and returns a promise that wraps a delayed execution of f after time t.
 * @param f {Function} A function that will be called after the given time has elapsed.
 * @param t {Number} A delay expressed in milliseconds after which the given function will be called.
 */
const delay = (f, t) => new Promise((resolve, reject) => {
    // Assume for simplicity that f and t are of the right type and contain valid objects
    const task = () => {
        try {
            resolve(f());
        } catch (e) {
            reject(e);
        }
    };

    setTimeout(task, Math.max(1, t));
});

/**
 * Generator of simple tasks taking a random time to complete.
 */
class TaskGenerator {
    /**
     * Initializes the task generator with a fixed maximum count of created tasks.
     * @param maxCount {Number}
     */
    constructor (maxCount) {
        this.counter = 0;
        this.maxCount = maxCount;
    }

    /**
     * Generates a sequence of random tasks expressed as promises. Each task returns as result
     * the sequential index it was created from and takes a short random time to complete.
     * @return {Iterable.<Promise>}
     */
    *generate () {
        this.counter = 0;

        // Fun fact: this.counter++ makes the syntax highlighter crash lol
        while (++this.counter <= this.maxCount) {
            // This is a very important line!! Read my comments below for further details
            const frozenCounter = this.counter;
            const randomTime = Math.floor(Math.random() * 1000);
            yield delay(() => frozenCounter, randomTime);
        }
    }
}

module.exports = TaskGenerator;

Example 0: normal promises

'use strict';

const TaskGenerator = require('./task-generator');
const taskGenerator = new TaskGenerator(5);

// The following code generates 5 tasks and executes a console.log after each one of them completes.
// Unfortunately, although they are generated in the correct order, they do not complete the same way!
// So you will see a different sequence of printed numbers each time you run this code!
for (const p of taskGenerator.generate())
    p.then((result) => console.log(result));

// This is just a sort of "milestone" we will use as reference to distinguish various types of serialization
console.log(6);

If you run the code above, you will see the numbers from 1 to 5 appear in a different order each time because, when promises are yielded from the generator, their inner task starts right away; since each of them gets its own independent then(), each one takes its own course and they terminate chaotically. The final 6, on the other hand, is always printed first: console.log(6) runs synchronously, while then() callbacks are only ever invoked after the current code has finished running.

Also, notice that in generate() we are saving the counter in a local constant! We have to do this because if we just referred to this.counter within the function passed to delay(), then by the time that function is executed it would always yield the same value (maxCount + 1, which is what the counter is left with once the loop has ended), since the for loop drains the generator long before the first timer fires. These mistakes are the hardest to spot despite their triviality! Instead, by using a new constant, we encase the current value into a closure whose scope will persist unaltered through time until the target function is executed! Black magic huh?
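Here is a sketch that puts the two variants side by side. LeakyCounter and its generateShared() method are hypothetical names I introduced for the comparison, and delay() is a trimmed-down version of the helper above.

```javascript
'use strict';

// Trimmed-down delay(): resolves with f() after t milliseconds.
const delay = (f, t) => new Promise((resolve) => setTimeout(() => resolve(f()), t));

class LeakyCounter {
    constructor (maxCount) {
        this.counter = 0;
        this.maxCount = maxCount;
    }

    // Hypothetical broken variant: the callback reads this.counter when it runs.
    *generateShared () {
        this.counter = 0;
        while (++this.counter <= this.maxCount)
            yield delay(() => this.counter, 5);
    }

    // Variant from the article: the value is frozen in a per-iteration constant.
    *generateFrozen () {
        this.counter = 0;
        while (++this.counter <= this.maxCount) {
            const frozenCounter = this.counter;
            yield delay(() => frozenCounter, 5);
        }
    }
}

const compare = async () => {
    const shared = await Promise.all(new LeakyCounter(3).generateShared());
    const frozen = await Promise.all(new LeakyCounter(3).generateFrozen());
    return { shared, frozen };
};

const comparison = compare();
comparison.then(({ shared, frozen }) => {
    console.log(shared); // [ 4, 4, 4 ]
    console.log(frozen); // [ 1, 2, 3 ]
});
```

With maxCount set to 3, the shared variant reports 4 everywhere: the generator has been fully consumed, and the counter incremented one last time, before any timer fires.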

Now, a simple solution to this problem would be to reformulate the loop like this:

Example 1: chained promises

'use strict';

const TaskGenerator = require('./task-generator');
const taskGenerator = new TaskGenerator(5);

let current = null;

for (const p of taskGenerator.generate())
    current = (current ? current.then(() => p) : p).then((result) => console.log(result));

console.log(6);

The code above simply stores a reference to the last executed promise in the current variable. In the next iteration we can chain the next promise with the last one instead of just letting it run free. As a result, every time you run the program, you will always see the same sequence of numbers printed in ascending order from 1 to 5. Beware that we are not waiting for a promise to finish before issuing another one. They are spawned exactly in the same way as before. What is different here is essentially the order in which we decide to use their return values, which is bound to always follow the same pattern.

The final 6 is, again, always printed first, because it is logged synchronously before any of the then() callbacks gets a chance to run. If you wanted it to appear last, you would have to chain it onto current as well. Regardless, all the other numbers will always be printed in the correct order.
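If you also need each task to start only after the previous one has finished, one option is to chain functions that create promises, rather than promises that are already running. This is a sketch of that alternative, not the approach used in Example 1; wait() and the factories array are hypothetical names.

```javascript
'use strict';

// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const wait = (value, ms) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Task factories: nothing starts until the function is called.
const factories = [
    () => wait(1, 30),
    () => wait(2, 10),
    () => wait(3, 20),
];

const order = [];

// Each factory is invoked only after the previous promise has resolved.
const chain = factories.reduce(
    (previous, factory) => previous
        .then(factory)
        .then((result) => { order.push(result); }),
    Promise.resolve()
);

chain.then(() => console.log(order)); // [ 1, 2, 3 ]
```

The price is total running time: the delays add up instead of overlapping.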

Promise.all

This is a function from the promise standard, provided in all implementations. It accepts an iterable of promises and returns a promise that resolves only when all of them have resolved; alternatively, it rejects as soon as one of them rejects. With respect to the previous example, invoking Promise.all is similar to the second loop (Example 1), as every promise completes on its own and the results are put back in order by all itself.

Example 2: Promise.all

'use strict';

const TaskGenerator = require('./task-generator');
const taskGenerator = new TaskGenerator(5);

Promise.all(taskGenerator.generate())
    .then(results => console.log(results));

console.log(6);

The output will look like this:

6
[ 1, 2, 3, 4, 5 ]

In fact, Promise.all will resolve with an array containing the result of each of the given promises. Also notice that the array contains those values in the same order as the promises were iterated over, not in the order they completed. Our mystical 6 is still in the same spot, and I guess you already know why by now.
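Both properties, ordered results and rejection on the first failure, can be checked with a short sketch. The helpers resolveAfter() and rejectAfter() are hypothetical and exist only for this illustration.

```javascript
'use strict';

// Hypothetical helpers for this sketch.
const resolveAfter = (value, ms) => new Promise((resolve) => setTimeout(() => resolve(value), ms));
const rejectAfter = (message, ms) => new Promise((_, reject) => setTimeout(() => reject(new Error(message)), ms));

// Results keep the input order, even though 'slow' settles last.
const ordered = Promise.all([resolveAfter('slow', 30), resolveAfter('fast', 5)]);

// A single rejection rejects the whole aggregate.
const failed = Promise.all([resolveAfter('ok', 5), rejectAfter('boom', 10)])
    .catch((error) => error.message);

ordered.then((values) => console.log(values)); // [ 'slow', 'fast' ]
failed.then((message) => console.log(message)); // boom
```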

await

Now let’s try the new syntactic sugar of ES2017 with the famous pair async/await:

Example 3: async/await

'use strict';

const TaskGenerator = require('./task-generator');
const taskGenerator = new TaskGenerator(5);

const doWork = async () => {
    for (const p of taskGenerator.generate()) {
        const result = await p;
        console.log(result);
    }

    console.log(6);
};

doWork();

Finally! We made it! Thanks to the power of await we are able to neatly pause the execution until each promise resolves. This code is now able to print all the numbers in the correct order while still being asynchronous, all with the most elegant and nice-looking syntax! It looks too good to be true, right? Indeed, it’s too good. There’s one huge drawback to this code, which is also its main distinguishing trait: it halts for each promise. While in the other examples we were able to let the promises run “concurrently” (forgive me for the improper terminology here), in the latter case we have to wait that random time at every iteration, since we cannot go on with the loop (and thus generate a new promise) until we obtain the results of the current one.
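The difference is easy to observe by logging when each task starts and ends. In this sketch, task() and tasks() are hypothetical stand-ins for TaskGenerator, with fixed delays instead of random ones so that the outcome is repeatable.

```javascript
'use strict';

const log = [];

// Hypothetical task factory: records when the task starts and when it ends.
const task = (name, ms) => new Promise((resolve) => {
    log.push(`start ${name}`);
    setTimeout(() => { log.push(`end ${name}`); resolve(name); }, ms);
});

// Lazy generator, like generate(): a task starts only when it is requested.
function* tasks () {
    yield task('a', 20);
    yield task('b', 10);
}

// Awaiting inside the loop: 'b' is not even created until 'a' is done.
const sequential = async () => {
    for (const p of tasks())
        await p;
};

// Promise.all drains the generator at once, so both tasks overlap.
const concurrent = () => Promise.all(tasks());

const run = async () => {
    await sequential();
    const sequentialLog = log.splice(0); // take a copy and empty the log
    await concurrent();
    return { sequentialLog, concurrentLog: log.slice() };
};

const outcome = run();
outcome.then(({ sequentialLog, concurrentLog }) => {
    console.log(sequentialLog); // [ 'start a', 'end a', 'start b', 'end b' ]
    console.log(concurrentLog); // [ 'start a', 'start b', 'end b', 'end a' ]
});
```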

Fear not, my dear friends! I have a potent remedy! By combining Promise.all with await we can obtain the best of both worlds:

Example 4: async/await + Promise.all

'use strict';

const TaskGenerator = require('./task-generator');
const taskGenerator = new TaskGenerator(5);

const doWork = async () => {
    const results = await Promise.all(taskGenerator.generate());
    console.log(results);
    console.log(6);
};

doWork();

Other serialization methods

Apart from Promise.all, the standard promise library also exposes another utility function called Promise.race, which is the complementary operation of the former. Race returns a promise that resolves/rejects as soon as the first promise in its arguments resolves/rejects. As you can see, given its greedy nature, it isn’t as useful for serialization tasks, so I left it as a side note.
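For completeness, here is what race looks like in practice. It is a minimal sketch, with resolveAfter() being a hypothetical helper of mine.

```javascript
'use strict';

// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const resolveAfter = (value, ms) => new Promise((resolve) => setTimeout(() => resolve(value), ms));

// The aggregate settles with whichever promise settles first.
const winner = Promise.race([resolveAfter('tortoise', 30), resolveAfter('hare', 5)]);

winner.then((value) => console.log(value)); // hare
```

Note that the losing promise is not cancelled: its timer still runs to completion, and its result is simply ignored.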

Besides the standard promise library, there are community-maintained libraries like bluebird, which offers a great deal of flexibility and an astonishing number of ready-to-use operations. Discussing bluebird in depth is outside the scope of this article, but it is important to know that alternatives exist and that they may offer better-suited solutions to specific problems, depending on the purpose of your code.