The Layer Index and Asynchronous Task Queues in Django
Greetings!
I’ve been working full-time on the Yocto Project for a couple weeks now, and I’d like to talk about what I’ve been up to so far.
First, what is Yocto? It’s a collection of FOSS projects for building Linux distributions for embedded systems, like RasberryPi boards. The distributions you create using tools like bitbake can then be imaged onto hardware, allowing you to (for example) quickly prepare routers with the necessary drivers and networking software to create a mesh network. Outside of my internship, I hope to develop a production pipeline for inexpensive mesh nodes, somewhat like PirateBox, and tools like bitbake will be a big part of that effort.
My mentor, Libertad Cruz, showed me a handful of bugs on a project called layerindex-web which indexes layers and recipes on top of OE-Core. For example, there’s the meta-rasberrypi layer which provides support for developing distributions for RasberryPi boards. A recipe might be a distribution you can further customize, or instructions for including certain software in a distribution.
layerindex-web is a Django application that allows users to submit layers. The application uses git repo addresses to infer information about a layer, including its recipes and branches. An interesting feature of the site is that maintainers have editing rights over the things they maintain, but a maintainer doesn’t need an account on the site to be listed as a maintainer. Thus, non-maintainers can submit layers to the index, and the maintainer can create an account later in order to make any necessary edits, should they be so inclined. All layers require admin sign-off before they are published to the index.
The first issue I chose to tackle requested the implementation of an asynchronous task runner, so that tasks like sending emails or fetching layer information don’t block page rendering. In some cases, this means requests might time out for taking too long, or if they error out, the page would fail to load. I chose to use RabbitMQ and Celery to do the job. RabbitMQ is a message broker, like a specialized kind of database, which runs outside of Django. Celery is a distributed task queue written in Python. It runs inside Django, and communicates with RabbitMQ to manage the queue and execute tasks asynchronously.
Enough background. Here’s how I implemented an asynchronous task queue in Python 3.5 and Django 1.8:
Installing RabbitMQ
On systems with apt-get like Ubuntu, do this:
| 
 | 
 | 
This installs the RabbitMQ server, and starts the server daemon with some default settings. See Configuring RabbitMQ for more information about configurating settings. Within development and QA, the default settings might be sufficient for you. They are for me.
If you use Docker or Vagrant or Chef or some other kind of automated provisioning tool, you’ll need to add instructions to include RabbitMQ in your build. You can find more information, including recipes, in RabbitMQ’s installation guides.
Then, you’ll need to include connection information in your application’s settings.py. Here’s how I did it:
| 
 | 
 | 
Those reflect RabbitMQ’s default settings. The broker is the location of the message broker (RabbitMQ in this case), the backend is the protocol used to send messages to the broker. Celery will know what to do with these settings.
Using Celery
Use pip to install Celery, and add it to your requirements.txt file:
| 
 | 
 | 
Now create a tasks.py file in your project. The name is arbitrary, I just find the name appropriate for, y’know, writing tasks. Paste something like this as a starting point:
| 
 | 
 | 
From another file, such as views.py, you can import the tasks object and use it in your views. For example:
| 
 | 
 | 
That view processes a submitted layer, validating the relevant forms and saving new objects before emailing admins. When you call tasks.send_email, Celery communicates with RabbitMQ to create a task for sending the emails. Then, as it consumes tasks from the queue, it sends the emails. This allows you to redirect the user that submitted the layer appropriately (ex: to a “thank you” page) without waiting for the emails to send. If you sent them synchronously, then any error while sending (such as a defunct email address) would show up for the user as a 500 error. Sending them asynchronously, you could instead log the error or take corrective action without holding up the user.
Conclusion
That’s it! You’ve set up an asynchronous task queue in Django. I hope you found these instructions helpful!