Djangocon: making smarter queries with advanced ORM resources - Mariana Bedran Lesche

Tags: djangocon, django

(One of my summaries of a talk at the 2018 european djangocon.)

There are details in django’s ORM/model/query documentation that you often skip over. But there are very useful things in there.

Recently she had to work on an old PHP project where she had to write SQL by hand. And some of those raw SQL statements, she couldn’t get working right away with django. So she went back to the harder sections of the documentation.

When should you use the advanced methods? For instance when the regular optimizations don’t give enough results:

  • Good indexes on your tables.
  • Paginated requests.
  • select_related, prefetch_related are there.
  • You limit the output with values, values_list, only and defer.
  • You use a database cache.
  • You use the assertNumQueries on tests to figure out if too many queries are made.

Mariana comes from Brazil, where they have a law that says that all non-secret goverment information should be made available. There’s the https://brasil.io project that tries to make the data available in useful datasets (as the source data isn’t always in a useful format). That’s where she got her sample data from: info on companies and their owners. Owners can be regular persons, other companies or foreign owners. The second dataset is of money spend by public officials, which is relatable to companies.

To get a good performance out of the relational data, select_related and prefetch_related were essential to limit the number of queries. But when querying the officials, you’d get lots of data. So she wanted to filter on just a specific month. For that, she first did the filtering and then used a Prefetch object to do the prefetching.

You can also filter on annotations (I couldn’t write down the sample query in time, though). For that, Q() objects came in handy. With that, you can create queries programmatically. She showed some timing results and the difference was striking: doing filtering in the database can make your code much quicker (it was a factor of 100 or so).

Programmatically generating Q() objects is helped a lot if you use python 3.6’s “f strings”. See this blog entry for a nice introduction on what they are.

Some quick, probably faulty, sample code:

filters = []
for year in [2017, 2018]:
    q = Q(f'expenses_in_{year}__gt'=100)
    filters.append(q)
Something.objects.filter(*filters)
https://abload.de/img/screenshot2018-03-09ay8qci.png

Photo explanation: open door, forklift, shelf in the back: a nice detail

blog comments powered by Disqus
 
vanrees.org logo

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.

Weblog feeds

Most of my website content is in my weblog. You can keep up to date by subscribing to the automatic feeds (for instance with Google reader):