Introduction to handle_continue in Elixir (and when to use it)

The handle_continue/2 callback prevents race conditions and allows for faster, asynchronous initialization.

Let’s start by looking at the problems that handle_continue solves. If you don’t care about the problems and just want the code, you can skip to the end or checkout (pun intended) the GitHub repo.

Here is a short, all-in-one example that shows an application which starts three instances of a process, each of which loads different data when it starts up:

An application with three processes which are started synchronously, one after another.
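
A minimal sketch of what such an application might look like. The registered names (:server_a and so on), the fetch_data/1 helper and the 3 second delay are assumptions made for illustration, with Process.sleep/1 standing in for the fake HTTP request:

```elixir
defmodule MyServer do
  use GenServer

  def start_link({name, delay_ms}) do
    GenServer.start_link(__MODULE__, delay_ms, name: name)
  end

  @impl true
  def init(delay_ms) do
    # Slow work done directly in init/1: the supervisor cannot start
    # the next child until this function returns.
    {:ok, fetch_data(delay_ms)}
  end

  # Hypothetical stand-in for a slow HTTP request.
  defp fetch_data(delay_ms) do
    Process.sleep(delay_ms)
    %{count: 0}
  end
end

defmodule Demo do
  # Starts three MyServer children under one supervisor.
  def start(delay_ms \\ 3_000) do
    children =
      for name <- [:server_a, :server_b, :server_c] do
        # The children share a module, so each needs its own child id.
        Supervisor.child_spec({MyServer, {name, delay_ms}}, id: name)
      end

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
```

With these assumed values, Demo.start() does not return until all three init/1 callbacks have run one after another.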

The supervisor starts its children in order, and each child's start function does not return until its init/1 callback has finished. Startup is therefore synchronous, one child after another. Since we are performing a (fake) HTTP request to fetch data for each process's state, this is slow, and it gets slower with every child process we add:

The child processes starting one-after-another

Since the processes don’t depend on each other, it would be nice if we could start them up all at the same time, instead of waiting ~9 seconds for them all to initialize sequentially.

A common “trick” that people use for asynchronously initializing a process is to have that process send itself a message using self/0 (which returns the process’ pid) and then either Kernel.send/2, Process.send/3, or Process.send_after/4.

Let’s modify our init/1 callback to defer the HTTP call and perform it asynchronously, so that the init/1 function can return faster, and the supervisor can move on to the next child sooner:

An example of performing asynchronous initialization of a process. Complete code here.
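
A minimal sketch of this trick, under the same assumptions as before (a hypothetical {name, delay_ms} argument and Process.sleep/1 in place of the HTTP request). The message shape {:fetch_data, delay_ms} is also made up for illustration:

```elixir
defmodule MyServer do
  use GenServer

  def start_link({name, delay_ms}) do
    GenServer.start_link(__MODULE__, delay_ms, name: name)
  end

  @impl true
  def init(delay_ms) do
    # Queue a message to ourselves and return right away.
    # The state is nil until that message has been handled.
    send(self(), {:fetch_data, delay_ms})
    {:ok, nil}
  end

  @impl true
  def handle_info({:fetch_data, delay_ms}, _state) do
    # Hypothetical stand-in for the slow HTTP request.
    Process.sleep(delay_ms)
    {:noreply, %{count: 0}}
  end
end
```

Because init/1 returns immediately, start_link/1 returns long before the data has been fetched.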

Now when we start our application, everything is initialized a lot faster because the HTTP calls are no longer being performed in the init/1 callback:

A faster overall startup time, since the init callbacks are now delegating their HTTP calls

This seems great: we have decreased our startup (or restart) time by taking slow code out of our init/1 callback, and everything looks okay.

But there is a problem; let's take a look at another example.

We will introduce a new process, Spammer, which is constantly trying to send messages to the MyServer processes. In this example it is using GenServer.cast/2 to represent any other messages that might be sent in a real application. The MyServer processes will process these messages via a new handle_cast/2 callback:

New Spammer process which sends messages to the other processes. Complete code here.
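
A rough sketch of how these two pieces could fit together. The target names, the 10 ms interval and the use of Map.update!/3 are assumptions; the only details taken from the text are that Spammer uses GenServer.cast/2 and that MyServer gains a handle_cast/2 callback which expects its state to be a map:

```elixir
defmodule Spammer do
  use GenServer

  # Hypothetical registered names of the MyServer processes.
  @targets [:server_a, :server_b, :server_c]
  @interval_ms 10

  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  @impl true
  def init(:ok) do
    schedule()
    {:ok, :ok}
  end

  @impl true
  def handle_info(:spam, state) do
    # cast/2 returns :ok even when the name is not registered yet.
    for name <- @targets, do: GenServer.cast(name, :increment)
    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :spam, @interval_ms)
end

defmodule MyServer do
  use GenServer

  def start_link({name, delay_ms}) do
    GenServer.start_link(__MODULE__, delay_ms, name: name)
  end

  @impl true
  def init(delay_ms) do
    send(self(), {:fetch_data, delay_ms})
    {:ok, nil}
  end

  @impl true
  def handle_info({:fetch_data, delay_ms}, _state) do
    Process.sleep(delay_ms)
    {:noreply, %{count: 0}}
  end

  @impl true
  def handle_cast(:increment, data) do
    # Raises if data is still nil, i.e. if the fetch has not run yet.
    {:noreply, Map.update!(data, :count, &(&1 + 1))}
  end
end
```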

In the application’s start/2 function we set up the supervisor. The Spammer child takes no arguments, so we just specify the module name. We are starting the Spammer before the MyServer processes because this illustrates what could happen in crash-restart situations.

The application’s start function. Complete code here.
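
A sketch of what such a start/2 function might look like. The module name MyApp.Application, the supervisor name and the child arguments are assumptions, and the two child modules are reduced to bare stand-ins so that the snippet compiles on its own:

```elixir
# Bare stand-ins for the real modules, only here to make the sketch compile.
defmodule Spammer do
  use GenServer
  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  @impl true
  def init(:ok), do: {:ok, :ok}
end

defmodule MyServer do
  use GenServer
  def start_link({name, _delay_ms}), do: GenServer.start_link(__MODULE__, nil, name: name)
  @impl true
  def init(state), do: {:ok, state}
end

defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # No arguments, so the bare module name is enough.
      # Listed first, so it is already running when the servers start.
      Spammer,
      # The servers share a module, so each gets a unique child id.
      Supervisor.child_spec({MyServer, {:server_a, 3_000}}, id: :server_a),
      Supervisor.child_spec({MyServer, {:server_b, 3_000}}, id: :server_b),
      Supervisor.child_spec({MyServer, {:server_c, 3_000}}, id: :server_c)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```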

When we run this, we get an error:

Race condition! The increment message arrived before the data was fetched

Looking at the logs, we can see that the increment message arrived before the data was fetched, and the process crashed because we were expecting data to be a map, but it was still nil.

Now is a good time to highlight something that we have just demonstrated: sending yourself a message in the init/1 callback does not mean that it will be the first message in the mailbox.
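
The ordering can be shown in isolation. The sketch below is a contrived illustration rather than code from the example application: the MailboxOrder module and its handshake are hypothetical, and they force another process to deliver a message while init/1 is still running, which is the window that a named process exposes in a real system:

```elixir
defmodule MailboxOrder do
  use GenServer

  # Returns the messages in the order in which the server handled them.
  def demo do
    intruder =
      spawn(fn ->
        receive do
          {:in_init, server, ref} ->
            # Deliver a message while the server is still inside init/1 ...
            send(server, :intruder)
            # ... and only then let init/1 carry on.
            send(server, {:go, ref})
        end
      end)

    {:ok, pid} = GenServer.start_link(__MODULE__, intruder)
    Enum.reverse(:sys.get_state(pid))
  end

  @impl true
  def init(intruder) do
    ref = make_ref()
    send(intruder, {:in_init, self(), ref})

    # Selective receive: take only the go signal and leave :intruder queued.
    receive do
      {:go, ^ref} -> :ok
    end

    # The "initialize myself" message is queued behind :intruder.
    send(self(), :initialize)
    {:ok, []}
  end

  @impl true
  def handle_info(msg, seen), do: {:noreply, [msg | seen]}
end
```

Here MailboxOrder.demo() returns [:intruder, :initialize]: the message sent from init/1 is handled second.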

This means that it is pretty easy to introduce a race condition when, for example, you are sending messages by name (and not pid). This can happen on startup (as we just demonstrated) but could also happen anytime the MyServer process is restarted.

Additionally, even if you could significantly improve the performance of your HTTP request, or remove it altogether, you would still get this race condition, because the Spammer can get its message into the mailbox before the one that init/1 sends.

Now that we have seen some problems, let's look at some solutions.

One solution would be to not use named processes, and to use a Registry instead. The processes could self-register, asynchronously, after they had fetched the data that they needed, and it would be impossible to send them a message before then.
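
A sketch of how that could look, assuming a unique-key Registry started elsewhere under the hypothetical name MyApp.Registry. The RegisteredServer module and its increment/1 helper are likewise invented for illustration:

```elixir
defmodule RegisteredServer do
  use GenServer

  # Hypothetical registry, started elsewhere with
  # Registry.start_link(keys: :unique, name: MyApp.Registry)
  @registry MyApp.Registry

  # Note: no :name option, so nobody can reach the process by name yet.
  def start_link({key, delay_ms}) do
    GenServer.start_link(__MODULE__, {key, delay_ms})
  end

  # Callers look the process up; before registration there is nothing to find.
  def increment(key) do
    case Registry.lookup(@registry, key) do
      [{pid, _value}] -> GenServer.cast(pid, :increment)
      [] -> {:error, :not_ready}
    end
  end

  @impl true
  def init({key, delay_ms}) do
    send(self(), {:fetch_data, key, delay_ms})
    {:ok, nil}
  end

  @impl true
  def handle_info({:fetch_data, key, delay_ms}, _state) do
    # Hypothetical stand-in for the slow HTTP request.
    Process.sleep(delay_ms)
    # Become discoverable only once the data is in place.
    {:ok, _owner} = Registry.register(@registry, key, nil)
    {:noreply, %{count: 0}}
  end

  @impl true
  def handle_cast(:increment, data) do
    {:noreply, Map.update!(data, :count, &(&1 + 1))}
  end
end
```

The trade-off is that callers now have to handle the "not ready yet" case themselves.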

For the problems we looked at, an easier solution is to use the handle_continue/2 callback, which was introduced in OTP 21. When init/1 returns a {:continue, term} instruction, the callback runs immediately after init/1, before the process handles any message from its mailbox. Messages can still arrive while it runs, but they wait in the mailbox until it has finished. This means that we can still have our asynchronous startup without having to worry about other messages being processed first.

Here is an example of using handle_continue:

An example of using handle_continue for performing asynchronous initialization of a process. Complete code here.
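
A minimal sketch of MyServer rewritten this way. As before, the {name, delay_ms} argument, the {:fetch_data, delay_ms} term and Process.sleep/1 as a stand-in for the HTTP request are assumptions made for illustration:

```elixir
defmodule MyServer do
  use GenServer

  def start_link({name, delay_ms}) do
    GenServer.start_link(__MODULE__, delay_ms, name: name)
  end

  @impl true
  def init(delay_ms) do
    # Return immediately, asking for handle_continue/2 to run next,
    # ahead of anything that is waiting in the mailbox.
    {:ok, nil, {:continue, {:fetch_data, delay_ms}}}
  end

  @impl true
  def handle_continue({:fetch_data, delay_ms}, _state) do
    # Hypothetical stand-in for the slow HTTP request.
    Process.sleep(delay_ms)
    {:noreply, %{count: 0}}
  end

  @impl true
  def handle_cast(:increment, data) do
    # By the time a cast is handled, data is always a map.
    {:noreply, Map.update!(data, :count, &(&1 + 1))}
  end
end
```

Casts sent while the fetch is still running simply wait in the mailbox and are applied afterwards.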

When we start the application supervisor with the Spammer child and the MyServer children now, we will no longer receive an increment message before the data is loaded. As soon as one of the MyServer processes finishes its handle_continue/2, it will start processing the :increment messages:

Using handle_continue, the messages are now processed in the correct order

In summary:

  • Child processes are started one-after-another, and doing slow initialization in a process’ init/1 callback will make the whole supervision tree slow to start and restart
  • Making a process send itself a message for initialization speeds up startup time (and restart time), but is prone to race conditions
  • Having a process send itself a message in init/1 does not guarantee that it will be the first message in the mailbox
  • Using handle_continue allows for asynchronous startup (and restart), and guarantees that the process won’t start processing any other messages first
  • Complete code examples on GitHub

Software engineer at PagerDuty, working with Scala and Elixir.
