Djangocon: the future of postgresql in django - Marc Tamlyn

Tags: djangocon, django

(One of the summaries of a talk at 2014 djangocon.eu)

Marc Tamlyn is well-known for his succesful kickstarter campaign to improve postgresql support in django.

He asked for a show of hands: some 90% of the attendees use postgresql. It is the most popular database for django projects. Most of the core team favours it. It has a wide feature set and it is a proper active (non-oracle-owned) open source project.

Wide feature set… but not all of the features are supported by django. You can use several add-ons, but they’re not that nicely integrated. So he proposed a kickstarter project for creating django.contrib.postgres. 14k (in UK pounds) was raised by the kickstarter campaign! Wow. He thanked everyone that contributed (a couple of big contributions by companies by name).

The core of the project is to add support for a couple of data types, like array, json and hstore.

  • Array can list just about any type. Nested structure is OK. No foreign keys, though. This one is almost ready for django 1.8.

  • HStore is a key-value store. The most requested feature! It is quite similar to django-hstore, which is already a very good add-on. They’ll help him with getting HStore into django itself.

  • Json. There are at least 5 third party add-ons. It is quite messy, though. The native json data type is quite limited, basically it is simply stored as a string. Postgres 9.4, not yet released, plans jsonb: a binary json. Much nicer. This’ll likely be something for django 1.9 and it’ll only support jsonb.

  • Enum is an enumarated data type. Basically “choices on steriods”. Choices enforced at the database level. The field will proved python constants (you probably have choices you store as an integer field in the database with some python constants that map the integer to a useful string).

  • UUID: for when IDs are not sequential. Technically interesting: probably the UUID creation will be delegated to the database, which means it’ll need to be an autofield.

  • Interval: a time delta.

  • Range: start_something and end_something. You can add proper constraints (end is not before start). And you can enforce that ranges don’t overlap. And querying if something is inside a range are much easier and quicker.

How to get all this working? Django 1.7 supports custom lookups. Lookups and transforms. Currently they’re only supported in .filter(). Lookups are more elaborate, transforms are just for things like lowercasing something.

Postgres has a couple of special indexes you can use. Django will support them. They’re good for certain quite specific cases (a hash index is quicker only for equal-to queries, though in the question round it turned out that the hash index is being deprecated in postgres). You normally won’t need them unless you have a special case.

Expression indexes are much handier. You could make an index that first lowercases everything, for instance. Partial indexes: indexes only on certain items. Like “index this column only when sold==False”. Really useful when you only need to index a small portion of your table. Adding such an index will happen in the Meta of the model.

What will also be made: database views. Table-like object that you can query like it’s a table. They’re good for aggregates that might otherwise take a huge amount of time.

What might we do in the future? Possible features that are outside of the current kickstarter project:

  • Proper constraints. Constraints that are more in the database than in the python code.

  • Server side cursors. They can be handy for large data sets, but they’re quite complicated to implement. Talk to the core team if you’re interested.

  • Prepared statements. A bit the same story as the cursors above.

  • Native table inheritance. Postgres actually has its own build-in version of table inheritance. It is possible to build, but he doesn’t think it is a good idea.

  • A “clean” GIS implementation. django.contrib.gis is full of (necessary) hacks. But the changes for this kickstarter make it possible to make the postgres gis backend hack-free.

In the whole, the work done for this kickstarter should make other database extension work easier. Look forward to nice new projects in this area!

 
vanrees.org logo

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.

Weblog feeds

Most of my website content is in my weblog. You can keep up to date by subscribing to the automatic feeds (for instance with Google reader):