Unrelated-to-this-talk message: the next djangocon.eu will be in Zürich, organized by Divio.
Why use a task queue? If a user clicks something, they have to wait for the result. Adding a comment means checking for spam. Uploading an image means generating thumbnails. Ordering a product means processing payments and sending emails.
You can use a cronjob for recurring tasks. But then you have to wait for the cronjob to come along, and sometimes the cronjob won't have anything to do when it runs.
A task queue can help! You can decouple information producers from consumers. Asynchronous processing can mean a speed increase, and scalability can improve. Celery is such a task queue, written in Python.
In Celery, clients connect to a broker, which hands tasks to workers, so both clients and workers are scalable. Celery supports synchronous, asynchronous and scheduled tasks. As a broker it recommends RabbitMQ. RabbitMQ is written in Erlang and has clustering support if you need it.
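For reference, here's a minimal sketch of what defining and calling a task looks like (my own example, not code from the talk, using a later Celery API than what was current then; the broker URL assumes a local RabbitMQ with default credentials):

    from celery import Celery

    # Broker URL is an assumption: local RabbitMQ, default guest account.
    app = Celery('tasks', broker='amqp://guest@localhost//')

    @app.task
    def generate_thumbnail(image_id):
        # Imagine the actual image resizing happening here.
        return image_id

    # The client drops the task on the queue and returns immediately;
    # a worker picks it up whenever one is available.
    result = generate_thumbnail.delay(42)
    # result.get() would block until the worker finishes
    # (that needs a result store to be configured).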
Some of Celery's features:

- Serialization to pickle, JSON or custom code.
- Tasks can be grouped into task sets and subtasks.
- You can retry failed tasks (a retry sketch follows below this list).
- Routing (on some brokers), so you can configure which queues get used for what.
- Results, when finished, can be stored in various "result stores".
- Setting up logging for your tasks is simple.
- And… it integrates well with Python web frameworks (Django, Flask, Pylons).
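A small sketch of the retry feature (again my example, not the speaker's; call_spam_service is a hypothetical helper standing in for some flaky external service):

    from celery import Celery

    app = Celery('tasks', broker='amqp://guest@localhost//')

    # bind=True gives the task access to `self`, so it can call self.retry().
    @app.task(bind=True, max_retries=3, default_retry_delay=10)
    def check_for_spam(self, comment_id):
        try:
            call_spam_service(comment_id)  # hypothetical external call
        except ConnectionError as exc:
            # Put the task back on the queue; Celery gives up
            # after max_retries attempts.
            raise self.retry(exc=exc)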
Celery originally worked only with Django (and its ORM); it has since switched to SQLAlchemy, which made it usable outside of Django. Django integration is now provided through django-celery.
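Roughly, the django-celery setup lives in your settings.py (a reconstruction based on the django-celery documentation of that era, not code shown in the talk; the broker values are assumptions):

    # settings.py
    import djcelery
    djcelery.setup_loader()

    INSTALLED_APPS += ('djcelery',)

    # Old-style broker settings for a local RabbitMQ (values are assumptions).
    BROKER_HOST = 'localhost'
    BROKER_PORT = 5672
    BROKER_USER = 'guest'
    BROKER_PASSWORD = 'guest'
    BROKER_VHOST = '/'

A worker is then started with python manage.py celeryd.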
(He then showed some code examples. I’m not typing that over :-) )
A link: http://pypi.python.org/pypi/celerymon for monitoring your Celery queue.