Djangocon: representing hierarchies in relational databases - Jacob Rief

Tags: djangocon, django

(One of my summaries of a talk at the 2018 european djangocon.)

Jacob works with django since 2011 and he created and maintains a couple of django apps. One of the projects he took over maintenance of is django-treebeard.

django-treebeard can be used to create hierarchical structures in your relational database. A tree structure. Like the structure of a company. Or nested categories in a shopping website.

There are four common solutions to get a hierarchical relation in your database:

  • Adjecent list. Basically you add a parent foreign key to your model. Traversing the tree has bad performance, as you have to recurse through every “level” to find all parents or children of a node.

    It gets worse if you also want to filter. You can

  • Materialized path tree. You give every node a special “path” field. The first element is A, its children are AA, AB, AC. One level down AAA, AAB, ABA, ABB, ABC, etc.

    Now if you want to find the children of ‘AAB’, you search for all items with a path that starts with ‘AAB’. (You have to exclude the current node, though). As long as you have a good index, the performance is good.

    If you want to find all parents, you need to create a list of paths in python and query on that. For ‘AAB’, you look for ‘AA’ and ‘A’. Quite OK.

  • Nested sets tree. Each node gets two numbers. Everything in between the numbers is “below” the node. So if you add a node under an existing node, the “after” number is increased by two and the new node gets the two now-free numbers.

    Search performance is good. Parents have a minimum number that is smaller than our own. Children have numbers “between” our min/max number. Query speed is good.

    Adding a node is expensive, as you have to increase numbers on a lot of nodes!

  • Adjacent list tree with common table expressions. Newer databases all support this. The advantage over the non-table-expression solution (the first of the four options) is that it all happens in SQL, so the speed issues are mostly gone.

Every one of the four approaches have drawbacks and advantages.

There are three django apps that implement trees:

  • django-mptt: nested sets. Django CMS used this, but has now switched to django-treebeard.
  • django-treebeard: implements the first three.
  • django-tree: the fourth, but postgres only (and in alpha).

Note: there is a django ticket #28919 which has a feature request for support of the common table expressions of the fourth option.

Photo explanation: I’m experimenting with the color and strength of the LEDs.

blog comments powered by Disqus logo

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.

Weblog feeds

Most of my website content is in my weblog. You can keep up to date by subscribing to the automatic feeds (for instance with Google reader):