Taming multiple databases with Django - Marek Stępniowski

Tags: django, djangocon

Marek works at SetJam: “We came to Django for the views, but stayed for the ORM”. Django’s ORM is pretty much in the sweet spot. SQLAlchemy is less nice in comparison: you have to learn a language that is neither SQL nor Pythonic.

At SetJam, they have what they call a backend and a frontend. The backend collects data and stores it in the database; the frontend spits it out, mostly via feeds.

They started out with one single big database. Many backend servers would write to it and the frontend server would read from it, which made it hard to optimize.

Next they added a database slave for reading. That was before Django’s multi-db support, so they had if/elses in their settings files based on environment variables.

Once Django gained multi-db support, they could properly configure two databases and refer to them in the code as 'DEFAULT' and 'SLAVE'.
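To make the two-database setup concrete, here is a minimal sketch of what such a settings fragment could look like. The environment variable names, the database name and the PostgreSQL engine are assumptions for illustration; the talk did not show the real values. Django itself requires the first alias to be called "default" (lowercase).

```python
import os


def database_config(name, host):
    """Build one entry for Django's DATABASES setting (engine is an assumption)."""
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": name,
        "HOST": host,
    }


# DB_MASTER_HOST and DB_SLAVE_HOST are hypothetical environment variables.
DATABASES = {
    # Django requires an alias named "default"; here it is the master.
    "default": database_config("setjam", os.environ.get("DB_MASTER_HOST", "localhost")),
    # Read-only replica of the master.
    "slave": database_config("setjam", os.environ.get("DB_SLAVE_HOST", "localhost")),
}
```

With both aliases defined in one place, the environment only decides which hosts are used, not whether a second database exists.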

Later on they split up the database even more. What goes where is handled by two custom database routers: a “MasterSlaveRouter” for the master/slave distinction and an “AppRouter” for shuffling some apps’ data to certain databases.
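The two routers could look roughly like the sketch below. Only the class names come from the talk; the bodies, the "stats" app label and the "stats_db" alias are assumptions. Django routers are plain classes: a method that returns None has no opinion, and Django then asks the next router listed in the DATABASE_ROUTERS setting.

```python
class AppRouter:
    """Send all models of selected apps to a dedicated database."""

    # Hypothetical mapping of app label to database alias.
    app_databases = {"stats": "stats_db"}

    def db_for_read(self, model, **hints):
        # None means "no opinion": Django moves on to the next router.
        return self.app_databases.get(model._meta.app_label)

    def db_for_write(self, model, **hints):
        return self.app_databases.get(model._meta.app_label)


class MasterSlaveRouter:
    """Read from the slave, write to the master."""

    def db_for_read(self, model, **hints):
        return "slave"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Master and slave hold the same data, so relations are allowed.
        return True
```

In the settings, AppRouter would be listed before MasterSlaveRouter, so the app-specific choice wins and everything else falls through to the master/slave split.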

Tip: look at https://github.com/jbalogh/django-multidb-router, especially for the handy decorators (@use_master, for instance) it provides.
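The idea behind such a decorator can be sketched in plain Python. This is a toy illustration, not django-multidb-router's actual code: force_master, is_pinned and PinningRouter are hypothetical names. A thread-local flag pins reads to the master for the duration of one function call, which helps when code has to read back what it has just written.

```python
import functools
import threading

_state = threading.local()


def is_pinned():
    """Return True while the current thread is pinned to the master."""
    return getattr(_state, "pinned", False)


def force_master(func):
    """Hypothetical decorator: route all reads inside func to the master."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        previous = is_pinned()
        _state.pinned = True
        try:
            return func(*args, **kwargs)
        finally:
            # Restore the earlier state so nested calls behave correctly.
            _state.pinned = previous

    return wrapper


class PinningRouter:
    """Router that honours the thread-local pin."""

    def db_for_read(self, model, **hints):
        return "default" if is_pinned() else "slave"

    def db_for_write(self, model, **hints):
        return "default"
```

Keeping the flag thread-local means one request that is pinned to the master does not affect reads in other threads.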

At one point they had problems with Django’s transaction decorators: used as-is, they only apply to the default database. They had to call the underlying code themselves and pass it the right database.

Similarly, South needs manual help when working with multiple databases. South’s ticket #370 is still open after three years. He hopes he can get a fix into the new south-in-the-django-core code.

He showed a code example that looked pretty OK. Then he showed what needs fixing to get it to work reliably with multiple databases.

Multidb is awesome, but…

  • It needs more documentation.

  • It needs full support for multidb in schema migrations.

  • It needs better debugging tools (whiny transaction decorators).

  • Attributes like _for_write should be clearer. They’re pretty important, but the leading underscore makes them look unimportant. (Comment: a core dev discussed this with him during the questions; he thought this wasn’t necessary.)


Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
