Asynchronous programming – breaking the illusions of blocking code

Over the past couple of weeks, I’ve been forced to confront the asynchronous programming style. Not out of choice, but because it was the only interface provided by a third-party library that needed to be integrated. Now before I begin, let me stress that asynchronous programming is not the same thing as parallel programming. If this were parallel programming, I could try wrapping the offending code in some special context and most problems would go away. No, asynchronous code (at least the kind I ran into) is actually single-threaded execution pretending to be parallel.

Let’s start off with a simple code snippet to understand this. Have a look at the pseudo-code below:

for (i = 0; i < 3; i++) {
    print("current loop=" + i);
}

The output of the above program would look like this:

current loop=0
current loop=1
current loop=2

For the most part, the flow of execution is straightforward enough, isn’t it? Now, remember that third-party library I needed to work with? It exposed an asynchronous call whose result had to be used inside the for-loop. Naively, I made the following attempt at integration:

for (i = 0; i < 3; i++) {
    async_print("current loop=" + i);
}

The result baffled me. Here’s what I got:

current loop=3
current loop=3
current loop=3

Which didn’t make any goddamn sense! … at first. I spent a considerable amount of time searching for answers. In the end, the answer came. The problem was that I didn’t quite understand the flow of program execution.

Coming from a synchronous programming background, I assumed that asynchronous calls were some fancy way of doing parallel programming. They aren’t. I also assumed that the async function call would be blocking (i.e. the program would wait until the function/method finished executing). That wasn’t correct either.

You see, what happens with asynchronous statements (at least the ones I’ve encountered) is something that might be called “deferred execution”. When an asynchronous statement is reached, the control flow does not execute it right away. Instead, the statement is put on a queue to be executed later. Once the current context of program statements has finished, the items in the queue are taken out in FIFO order and executed.
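To make the queueing idea concrete, here is a minimal sketch in Python. It is a toy model, not the third-party library I was dealing with: the names pending, defer and run_pending are made up for illustration, and the queue is just a deque that gets drained once the current code has finished.

```python
from collections import deque

# Toy model of a deferred-execution queue (hypothetical names, not a real API).
pending = deque()
log = []

def defer(callback):
    """Put the callback on the queue instead of running it now."""
    pending.append(callback)

def run_pending():
    """Drain the queue in FIFO order."""
    while pending:
        pending.popleft()()

def main():
    log.append("start")
    defer(lambda: log.append("deferred A"))
    defer(lambda: log.append("deferred B"))
    log.append("end")  # runs before either deferred call

main()         # the "current context" runs to completion first
run_pending()  # only then are the queued calls serviced
print(log)     # ['start', 'end', 'deferred A', 'deferred B']
```

Note that "end" is logged before either deferred call, even though the deferred calls appear earlier in the source.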

So, how does this explain the earlier for-loop? On each iteration, when the async call is made, it’s put on the queue rather than executed. Now, when is the loop over? When the value of i is equal to 3. Only after that do the queued async calls get serviced. So, even though the call is made three times, all three calls share the same variable i, and by the time they run its value is 3, which produces our strange result.
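The same behaviour can be reproduced with a real event loop. The sketch below uses Python’s asyncio and its call_soon() method, which queues a callback to run on a later pass of the loop. This is my own stand-in for async_print(), chosen for illustration; I am assuming the library behaved roughly like this. A while loop is used so that i ends up at 3, just like the C-style for-loop above.

```python
import asyncio

def demo_shared_variable():
    output = []

    async def main():
        loop = asyncio.get_running_loop()
        i = 0
        while i < 3:
            # The lambda does not copy i. It looks i up when it finally runs.
            loop.call_soon(lambda: output.append(f"current loop={i}"))
            i += 1
        # Yield to the event loop so the queued callbacks can run.
        await asyncio.sleep(0)

    asyncio.run(main())
    return output

print(demo_shared_variable())
# ['current loop=3', 'current loop=3', 'current loop=3']
```

All three callbacks run after the loop has finished, and all three read the same variable, so each one sees 3.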

I’m probably missing out some important details, but from an understanding point of view, this know-how was critical in helping me solve my problem. The solution I resorted to was recursion. Here’s what the ‘fixed’ pseudo-code looks like:

i = 0;
function (i)
{
   async_print("current loop=" + i);
   i++;
   if (i < 3)
      call function (i);
}
call function (i);

Things might seem a bit stretched out, but the key concept to understand is that every call to function() now happens in its own context, with its own i. This means that even though the control flow initially skips past the async_print() call, when it later comes back to service that call, it returns to the context in which the call was made. This gives us the following output:

current loop=0
current loop=1
current loop=2
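For comparison, here is a sketch of the recursive approach in Python, again using asyncio’s call_soon() as an assumed stand-in for async_print(). The helper name step is hypothetical. One deliberate difference from the pseudo-code: the sketch passes i + 1 to the next call instead of incrementing i in place, because a Python closure would otherwise see the incremented value.

```python
import asyncio

def demo_recursion():
    output = []

    async def main():
        loop = asyncio.get_running_loop()

        def step(i):
            # i is a parameter, so every call to step() has its own i,
            # and it is never reassigned after the callback is created.
            loop.call_soon(lambda: output.append(f"current loop={i}"))
            if i + 1 < 3:
                step(i + 1)

        step(0)
        # Yield to the event loop so the queued callbacks can run.
        await asyncio.sleep(0)

    asyncio.run(main())
    return output

print(demo_recursion())
# ['current loop=0', 'current loop=1', 'current loop=2']
```

Each queued callback now refers to the i of the call that created it, so the values come out in the expected order.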

Hmm… this would probably be better if I could demonstrate some actual code instead of this pseudo-stuff. Let me see if I can rig up something …
