Using a GenServer to handle asynchronous and concurrent tasks
February 01, 2019
In most cases I have found inter-process communication to be an unnecessary overhead for the work I have been doing. Although Elixir is known for this (along with Erlang), it really depends on what you’re trying to achieve and processes shouldn’t be spawned just for the fun of it. I have recently come across a scenario where I thought having a separate process be responsible for performing concurrent and asynchronous jobs would be the best way to approach the problem. In this article I will explain the problem and the solution.
Requirements
The goal of this work was to asynchronously handle requests to move static assets from one provider to another. This means downloading the original to a temporary file on the server, then uploading it to the new provider and saving results in a database.
- A GraphQL mutation needs to trigger this asynchronous job and not block the response.
- When the job completes, either successfully or with a failure, we should report it or handle it in some way.
- Multiple requests will come through concurrently, meaning the process shouldn’t be blocked from handling another request because one is still running.
- A request may trigger one or many jobs.
The process of finding a solution
There are many different options for structuring the supervision tree of an Elixir application. When and where to spawn processes, and which type of process suits your use case, is often a guessing game until you've used them all extensively.
My first thought was to use a DynamicSupervisor (or a Task.Supervisor) and create new supervised processes on demand, whenever work needed to be done.
This didn’t really work how I thought it would because the main process would still block until all the tasks were finished before responding to the initial request.
The next solution I tried was to send messages to a GenServer and have it do the work, so that the main process could return a response almost immediately. This got most of the way to solving the problem, but a GenServer can only handle one message at a time, so while this solution provides the asynchronous behaviour, it loses the benefit of concurrency.
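To make the one-message-at-a-time behaviour concrete, here is a minimal sketch. `SlowWorker`, the `:job` and `:count` messages, and the 100 ms sleep are all made up for illustration and are not part of the original code. Three casts are sent back to back, and the call that follows has to wait behind all of them.

```elixir
defmodule SlowWorker do
  use GenServer

  # Hypothetical module: the state is just a counter of finished jobs.
  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, 0)

  @impl true
  def init(count), do: {:ok, count}

  # Each job blocks the server process for 100 ms.
  @impl true
  def handle_cast(:job, count) do
    Process.sleep(100)
    {:noreply, count + 1}
  end

  @impl true
  def handle_call(:count, _from, count), do: {:reply, count, count}
end

{:ok, pid} = SlowWorker.start_link()
started = System.monotonic_time(:millisecond)

# The casts return immediately, so the caller is not blocked...
for _ <- 1..3, do: GenServer.cast(pid, :job)

# ...but the server works through its mailbox in order, so this call
# is only answered after all three jobs have run one after another.
count = GenServer.call(pid, :count)
elapsed = System.monotonic_time(:millisecond) - started

IO.puts("#{count} jobs took #{elapsed} ms")
```

The jobs run back to back rather than side by side, which is the lost concurrency described above.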
The solution I ended up going with (and it seems to work so far) wasn't far from the GenServer solution. The only difference is that when we schedule a job, the GenServer spawns a Task with Task.async/1 instead of doing the work itself. The benefit is that the Task sends a message back to the caller when it finishes, even if you never call Task.await/2.
As it is a GenServer that is spawning these tasks, it can handle generic messages sent to it quite easily with the handle_info/2 callback. This is where the GenServer handles success or failure states of each task, and processing each result synchronously is not a problem in this case.
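As a rough illustration of what lands in the caller's mailbox, the sketch below calls Task.async/1 from a plain process and reads the messages with receive instead of handle_info/2. The anonymous function and its return value are invented for the example. The reply is tagged with the task's monitor reference, and because the task is monitored, a :DOWN message follows when the task process exits.

```elixir
task = Task.async(fn -> {:ok, 21 * 2} end)

# Task.async/1 monitors the task; the reply is tagged with that reference.
ref = task.ref

reply =
  receive do
    {^ref, result} -> result
  after
    1_000 -> :timeout
  end

# The monitor also delivers a :DOWN message once the task process exits.
down =
  receive do
    {:DOWN, ^ref, :process, _pid, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect({reply, down})
```

Inside a GenServer both messages arrive through handle_info/2, so a catch-all clause (or an explicit :DOWN clause) is needed alongside the clauses that match the results.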
Here's a snippet of the GenServer that spawns these Task processes.
defmodule TaskRunner do
  use GenServer
  require Logger

  @me __MODULE__

  def start_link(opts) do
    GenServer.start_link(@me, opts, name: @me)
  end

  def init(opts), do: {:ok, opts}

  # Public API: schedule a zero-arity function to run in its own Task
  def run(fun) do
    GenServer.cast(@me, {:run, fun})
  end

  def handle_cast({:run, fun}, state) do
    Task.async(fun) # sends a message back to the TaskRunner when completed
    {:noreply, state}
  end

  # handle_info/2 receives generic messages from the Task processes
  def handle_info({_task, {:ok, result}}, state) do
    Logger.info("#{inspect(result)} Job Done.")
    {:noreply, state}
  end

  def handle_info({_task, {:error, reason}}, state) do
    Logger.error("Failed to complete job: #{inspect(reason)}")
    {:noreply, state}
  end

  # Catch-all, e.g. for the :DOWN message that follows each finished task
  def handle_info(_, state), do: {:noreply, state}
end
What's interesting about this code is that it may actually be reimplementing something that already exists in Elixir that I haven't quite got my head around yet. Either way, I don't have a problem with doing it this way as long as it works! Wrapping the spawning of a Task in a GenServer simply provides the ability to "schedule" tasks (as each message is processed sequentially), while handling the result from each task individually.
In theory, if we were to send a bunch of messages that get "queued" for processing in the GenServer's mailbox, a problem may arise: if the application terminates, the GenServer will lose all of its messages and those tasks will be lost. At this point, however, I would prefer to see how much of a problem this turns out to be, as there would be various factors to consider.
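One candidate for the "something that already exists" might be Task.Supervisor.async_nolink/2. That is a guess, not a confirmed equivalent. The sketch below is a variation on the same idea under that assumption: `SupervisedRunner`, `SupervisedRunner.Tasks` and the `results/0` helper are hypothetical names introduced only for this example. Because the task is not linked to the GenServer, a crashing job shows up as a :DOWN message rather than taking the runner down with it.

```elixir
defmodule SupervisedRunner do
  use GenServer

  # Hypothetical name for the Task.Supervisor that owns the tasks.
  @sup SupervisedRunner.Tasks

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def run(fun), do: GenServer.cast(__MODULE__, {:run, fun})

  # Synchronous helper, used here only to inspect what has been collected.
  def results, do: GenServer.call(__MODULE__, :results)

  @impl true
  def init(_opts), do: {:ok, %{results: []}}

  @impl true
  def handle_cast({:run, fun}, state) do
    # async_nolink/2 monitors the task but does not link it to this server.
    Task.Supervisor.async_nolink(@sup, fun)
    {:noreply, state}
  end

  @impl true
  def handle_call(:results, _from, state) do
    {:reply, Enum.reverse(state.results), state}
  end

  @impl true
  def handle_info({ref, result}, state) when is_reference(ref) do
    # The task replied; drop the monitor and flush its pending :DOWN message.
    Process.demonitor(ref, [:flush])
    {:noreply, %{state | results: [{:done, result} | state.results]}}
  end

  def handle_info({:DOWN, _ref, :process, _pid, reason}, state) do
    # The task exited without replying, i.e. the job crashed.
    {:noreply, %{state | results: [{:crashed, reason} | state.results]}}
  end
end

{:ok, _} = Task.Supervisor.start_link(name: SupervisedRunner.Tasks)
{:ok, _} = SupervisedRunner.start_link()

SupervisedRunner.run(fn -> {:ok, :moved} end)
SupervisedRunner.run(fn -> exit(:boom) end)

# Give both tasks a moment to finish before reading the collected results.
Process.sleep(200)
results = SupervisedRunner.results()
IO.inspect(results)
```

In a real application both processes would sit in the supervision tree rather than being started by hand, and the order of the two results is not guaranteed because the tasks run concurrently.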
I’m still not sure if this is going to be the best way to architect this asynchronous, concurrent behaviour, but in the few cases where I’ve thought an OTP approach makes sense I have often found many different ways to solve this kind of problem - which is both a good and bad part of Elixir.