Python Leiden meetup: PostgreSQL + Python in 2026 – Aleksandr Dinu

Tags: python, pun

(One of my summaries of the Python Leiden meetup in Leiden, NL).

He’s going to revisit common gotchas of Python ORM usage. Plus some PostgreSQL-specific tricks.

ORMs (object-relational mappers) define tables, columns, etc. using Python concepts: classes, attributes and methods. In your software, you work with objects instead of rows. They can also help with database schema management (migrations and so on). It looks like this:

from django.db import models

class Question(models.Model):
    question = models.CharField(...)
    answer = models.CharField(...)

You often have Python “context managers” for database sessions.
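
As an illustration, a minimal sketch with SQLAlchemy (an assumption of mine: the talk didn’t tie this point to a specific ORM, and the connection URL and query are made up):

from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

engine = create_engine("postgresql+psycopg://localhost/mydb")  # hypothetical URL

# The session is closed when the "with" block ends; session.begin()
# commits on success and rolls back when an exception is raised.
with Session(engine) as session, session.begin():
    session.execute(text("UPDATE question SET answer = 'unknown' WHERE answer IS NULL"))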

ORMs are handy, but you must be aware of what you’re fetching:

# Bad: grabs all objects and then takes the length in Python:
questions_count = len(Question.objects.all())
# Good: let the database do the counting,
# the code runs the equivalent of "SELECT COUNT(*)":
questions_count = Question.objects.all().count()

Relational databases allow 1:M and N:M relations; you use them with JOIN in SQL. If you use an ORM, make sure you let the database follow the relations. If you first grab the first set of objects and then fetch the related objects one by one in Python, your code will be much slower.
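
A minimal Django sketch of the difference (the Choice model with a foreign key to Question is a hypothetical example of mine, not from the talk); the slow variant is often called the “N+1 queries” problem:

# Assumes the Question model and the models import from above.
# Hypothetical related model, for illustration only:
class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    text = models.CharField(max_length=200)

# Bad: one query for the choices plus one extra query per choice
# when the related question is accessed:
for choice in Choice.objects.all():
    print(choice.question.question)

# Good: a single query with a JOIN, done by the database:
for choice in Choice.objects.select_related("question"):
    print(choice.question.question)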

“Migrations” generated by your ORM to move from one version of your schema to the next are really handy. But not all SQL concepts can be expressed in an ORM: custom types, stored procedures. You have to handle those yourself. You can also get undesired behaviour, as specific database versions can take a long time rebuilding after a change.
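
For the “handle those yourself” part, a minimal Django sketch of shipping such SQL in a hand-written migration (the app name, dependency and type name are hypothetical):

from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [("polls", "0001_initial")]  # hypothetical app and migration

    operations = [
        # Raw SQL for something the ORM cannot express, here a custom enum type:
        migrations.RunSQL(
            sql="CREATE TYPE question_status AS ENUM ('draft', 'published');",
            reverse_sql="DROP TYPE question_status;",
        ),
    ]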

Migrations are nice, but they can lead to other problems from a database maintainer’s point of view, like the performance suddenly dropping. And optimising is hard, as you often don’t know which server is connecting how often, nor what is being queried. Some solutions for PostgreSQL:

  • log_line_prefix = '%a %u %d' to show who is connecting to which database.

  • log_min_duration_statement = 1000 logs every query taking more than 1000ms.

  • log_lock_waits = on for feedback on blocking operations (like migrations).

  • Handy: feedback on the number of queries being done, as simple programming errors can translate into lots of small queries instead of one faster, bigger one (see the sketch after this list).
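
On the Python side, one way to get such feedback is Django’s query-count assertion in tests (my own sketch, not from the talk; the module path is hypothetical):

from django.test import TestCase

from .models import Question

class QuestionQueryCountTest(TestCase):
    def test_listing_questions_uses_one_query(self):
        # Fails when the code fires lots of small queries instead of one bigger one.
        with self.assertNumQueries(1):
            list(Question.objects.all())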

If you’ve found a slow query, run it again with EXPLAIN (ANALYZE, BUFFERS) in front of it. BUFFERS tells you how many 8kB pages the server touches for your query (and whether those were memory or disk pages). This is so useful that they made it the default in PostgreSQL 18.
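
A minimal sketch of running that from Python with psycopg (version 3 assumed; the connection string and query are made up):

import psycopg

with psycopg.connect("dbname=mydb") as conn:  # hypothetical connection string
    with conn.cursor() as cur:
        cur.execute(
            "EXPLAIN (ANALYZE, BUFFERS) "
            "SELECT count(*) FROM question WHERE answer IS NULL"
        )
        # Every row is one line of the query plan, including buffer usage.
        for (line,) in cur.fetchall():
            print(line)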

Some tools:

  • RegreSQL: performance regression testing. You feed it a list of queries that you worry about. It stores how those queries are executed, compares that with the new version of your code, and warns you when one of those queries suddenly takes a lot more time.

  • Squawk: tells you (in CI, like GitHub Actions) which migrations are backward-incompatible or might take a long time.

  • You can look at one of the branching tools, aimed at getting access to production databases for testing: for instance running your migration against a “branch”/copy of production. Several tricks are used under the hood, like filesystem layers. “pg_branch” and “pgcow” are examples. Several DB-as-a-service products also provide it (Databricks Lakebase, Neon, Heroku, Postgres.ai).

 