Reinout van Rees’ weblog¶
Utrecht (NL) Python meetup summaries¶
2026-05-21
I made summaries at the 4th PyUtrecht meetup (in Nieuwegein, at Qstars this time).
Qstars IT and open source - Derk Weijers¶
Qstars IT hosted the meeting. It is an infra/programming/consultancy/training company that uses lots of Python.
They also love open source and try to sponsor where possible.
One of the things they are going to open source (next week) is a “cable thermal model”, a calculation method to determine the temperature of underground electricity cables. The Netherlands has a lot of net congestion… So if you can get better grid usage by calculating the real temperature of cables instead of using an estimated temperature, you might be able to increase the load on the cable without hitting the maximum temperature. The model is coupled with “measurement tiles” that actually monitor the temperature.
They built it for one of the three big electricity companies in the Netherlands and got permission to open source it so that the other companies can also use it. They hope it will have real impact.
He explained an open source project he started personally: “the space devs”. Integrating rocket launch data and providing an API. Now it has five core developers (and got an invitation to the biggest space conference, two years ago!)
Some benefits from writing open source:
You build your own portfolio.
You can try new technologies. Always nice to have the skill to learn new things.
You improve your communication skills (both sending and receiving).
You can make your own decisions.
You write in the open.
Perhaps you help others with your work.
You could be part of a community.
It is your code.
How to start?
Reach out to other communities.
Read and improve documentation.
Find good first issues.
Be proactive.
Don’t be afraid to ask questions (and don’t let negative comments discourage you).
When working on open source, make sure you take security seriously. Attackers nowadays like to use supply chain attacks via open source software. So use 2FA and look at your deployment procedure.
Learning Python with Karel - EiEi Tun H¶
What is Karel? A teaching tool/robot for learning programming. You need to steer a robot in an area and have it pick up or dump objects. And… in the meantime you learn how to use functions and loops.
Karel only has a turn_left() function. So if you want to have it turn right, it is
handy to add a function for it:
def turn_right():
    turn_left()
    turn_left()
    turn_left()
Simple, but you have to learn it sometime!
In her experience, AI can help a lot when learning to code: it explains stuff to you like you’re a five-year-old, and that’s perfect.
If you want to play with Karel: https://compedu.stanford.edu/karel-reader/docs/python/en/ide.html
JSON freedom or chaos; how to trust your data - Bart Dorlandt¶
For this talk, I’m pointing at the PyGrunn summary I made three weeks ago. I liked the talk!
Practical software architecture for Python developers - Henk-Jan van Hasselaar¶
There are several levels of architecture: organization, system, application and code.
Cohesion: “the degree to which the elements inside a module belong together”. What does it mean? Working towards the same goal or function. Together means something like distance. When two functions are in separate libraries, they’re not together. It is also important for cognitive load.
Coupling: loose coupling versus tight coupling. You want loose coupling, so that changes in one module don’t affect another module.
You don’t really have to worry about coupling and cohesion in existing systems that don’t need to be changed. But when you start changing or build something new: take coupling/cohesion into account.
Software architecture is a tradeoff. Separation of concerns is fine, but it creates layers and thus distance, for instance.
Python is one of the most difficult languages when it comes to clean coding and clean architecture. You’re allowed to do so many dirty things! Typing isn’t even mandatory…
He showed a simple REST API as an example. Database model + view. But when you change the database model, like a field name, that field name automatically changes in the API response. So your internal database structure is coupled to the customer’s code that consumes the API.
What you actually need to do is to have a better “contract”. A domain model. In his example code, it was a Pydantic model with a fixed set of fields. A converter modifies the internal database model to the domain model.
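A minimal sketch of that idea, using plain dataclasses as a stand-in for the Pydantic model from the talk. All names (`CustomerRow`, `Customer`, `to_domain`, `api_response`) are hypothetical and not taken from his example code:

```python
from dataclasses import asdict, dataclass


@dataclass
class CustomerRow:
    """Internal database model: free to change when the schema changes."""
    id: int
    full_name: str       # hypothetical column name
    internal_notes: str  # never meant to leave the application


@dataclass(frozen=True)
class Customer:
    """Domain model: the fixed contract that API consumers rely on."""
    id: int
    name: str


def to_domain(row: CustomerRow) -> Customer:
    """Converter: the only place that knows both shapes."""
    return Customer(id=row.id, name=row.full_name)


def api_response(row: CustomerRow) -> dict:
    # The view serializes the domain model, never the database row.
    return asdict(to_domain(row))


row = CustomerRow(id=1, full_name="Ada", internal_notes="VIP")
print(api_response(row))
```

Renaming `full_name` in the database model now only requires a change in `to_domain()`; the API response keeps its `name` field.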
You can also have services, generic pieces of code that work on domain models. And adapters to and from domain models, like converting domain models to csv.
Finding the balance is the software architect’s job.
What is the least you should do as a software developer? At least create a domain layer, including a validator.
There was a question about how to do this with Django: it is hard. Django’s models are everywhere. And you really need a clean domain layer…
PyGrunn: Python at Spotify: twenty years - Gijs Molenaar¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
His parents owned a record store in some Dutch town. First records, then CDs. A social shop where you would gather to listen to CDs to determine whether to buy them. His father’s brother actually started the oldest record store in Amsterdam, Concerto. It still exists.
Then the world changed. Napster, CD-burners. Illegal downloading. (He himself was one of the downloaders.) His parents stopped selling music in 2008. He himself got into engineering. He ended up in South Africa, doing workflow orchestration for radio telescopes. There he introduced Docker and containers. He gave a talk at PyGrunn about it in 2016.
While he was in the South African desert, in Sweden someone started the Spotify company. He actually had used a library (“luigi”) made by Spotify in his telescope work.
He tried to get a job at Spotify and succeeded. So the kid who grew up in a record store now works at the company that reinvented how people listen to music.
It all started for Spotify with Java (jboss 5). They hated it. It was replaced with Python: the reason was that nobody hated it. 80% of the code became Python. A lot was async: they used “twisted” in the beginning, later gevent and greenlets.
But the Python GIL (global interpreter lock) made multi-core impossible. So you needed to use multiple processes, each with their own overhead. They also didn’t like the lack of type safety: they have 100+ services. Some of those problems are partially solved now, but at the time they switched back to Java. Partially it was cultural: they could hire quite a few Oracle employees that knew Java.
Python was still used a lot, just not for the core services. Nowadays, Python is used a lot for machine learning. They have 950 Python services, 470 libraries. 180000 Python files in 7500 repositories. 322x FastAPI, 272x Streamlit repositories. And still lots of luigi. Luigi is the framework that inspired airflow: it has lots of stars on github, the most of all their open source repositories.
They now also started pedalboard, a nice Pythonic way of modifying audio (it is a wrapper around a C++ library). Also nice: https://backstage.spotify.com/ , a backend/portal for collecting all the developer-related data. Workflow statuses and so on. (The backend is open source, the dashboard not).
At Spotify, the programmers are really encouraged to use agentic programming. He hasn’t touched his editor in the last six months! It really changed his life. Initially he was a bit depressed: can someone who’s less talented but with the same amount of tokens really do the same as me? But it is really a next level and he gets amazing productivity out of it. Having unlimited tokens helps.
It changes open source. Forking used to be a declaration of war. Nowadays it is a sign of popularity. You can fork something and have AI keep it up to date with minimal engineer effort. When the cost of maintaining your own fork approaches zero, what does that do with the economics of open source? Is cooperation still a thing? What is the goal/effect of open sourcing? Or is it only a way for AIs to find security bugs in your software?
His parents ran a record store for 42 years. Then technology disrupted the music industry. They had to reinvent themselves. It was scary and sad, but they adapted. Now the same force is disrupting our industry. Where will it go?
Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.
PyGrunn: list-man, pragmatic system integration - Doeke Zanstra¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
When automating in a big company with many systems, you often end up with spaghetti: many systems connecting to a lot of the others… A common solution is to have a “bus architecture”. Generic existing “enterprise service bus” solutions were clearly overkill, so he proposed an alternative solution.
He made a couple of assumptions/choices. All data is tabular data. He wanted to store a copy of data in a database. SQL views to access the data. So: multiple sources that he wanted to import in a central database (which would function as a sort of “read-only enterprise service bus”). And a generic sql/view-based way of accessing the data.
He initially focused on read-only data. And he started real simple. Just a bash script that ran regularly that scraped data from other systems and injected it in the database.
In the second version of the system, for every system he wrote a target/command in a
Makefile. Everything that needed to be scraped got its own table (called a “list”
in his system). Lists could be compared. The first killer app was a comparison between
a telephone list and the list of employees so that differences could be consolidated.
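Such a list comparison could look roughly like the sketch below. It is only an illustration with the standard library’s `sqlite3`; the table names, the view name and the sample rows are made up, and the talk did not show which database or SQL he used:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT PRIMARY KEY);
    CREATE TABLE phone_list (name TEXT PRIMARY KEY, extension TEXT);

    INSERT INTO employees VALUES ('Alice'), ('Bob'), ('Carol');
    INSERT INTO phone_list VALUES ('Alice', '101'), ('Dave', '104');

    -- Employees without a phone entry, and phone entries without an employee.
    CREATE VIEW list_differences AS
        SELECT name, 'missing in phone_list' AS problem
        FROM employees
        WHERE name NOT IN (SELECT name FROM phone_list)
        UNION ALL
        SELECT name, 'missing in employees' AS problem
        FROM phone_list
        WHERE name NOT IN (SELECT name FROM employees);
""")

differences = conn.execute(
    "SELECT name, problem FROM list_differences ORDER BY name"
).fetchall()
for name, problem in differences:
    print(f"{name}: {problem}")
```

Because the comparison lives in a view, every export or report can simply select from `list_differences` without knowing how the two lists were imported.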
For the third version, he started using more and more python. CSV file imports. Downloaders from REST APIs. All configurable so that he could use the same python script for many different sources.
He now had a simple system for which he could write views and exports.
Publishing data on the intranet via the “jekyll” static site generator. For instance a “mug book” of all employees.
And regularly exporting a list of names and email addresses in a format suitable for the multifunctional printer: to make it easy to select your email address when scanning on the printer.
An export to a google spreadsheet that combined the holiday spreadsheet with the data on part-time days.
Security was handled with a role-based system.
PyGrunn: JSON freedom or chaos, how to trust your data - Bart Dorlandt¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
Subtitle: a real-world journey from chaos to confidence using Pydantic and Pytest.
Ideally, you’d have perfect JSON files with a fixed format and rigorous validation, preferably generated. But in a customer project, the other programmers weren’t too happy about it. They had massive JSON files, partially manually crafted. Some were just one single line and others were vertically aligned. And perhaps someone depended on the specific format for some “sed” or “awk” hacking… So whatever happens: it works, don’t touch it.
The freedom trap. No schema means no contract. No contract means no trust. Fields accumulate, nobody removes them: “someone might be using it”. Multi-team challenges: not everyone has the same skillset.
He wanted a different future: a trusted future. Validated and tested and formatted.
Pydantic is a Python library for data validation using Python type annotations. You can define a data model with type hints. It will automatically validate and parse data according to those models:
from ipaddress import IPv4Address
from pydantic import BaseModel

class Server(BaseModel):
    hostname: str
    ip: IPv4Address
    ...
Make sure to look at pydantic-extra-types, they have lots of handy types like
“two-character country code”.
There’s AfterValidator, you can use it to add a second validator to a field. So
first the str type to validate it is a string, then afterwards some ip address
validator or so.
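A small sketch of how `AfterValidator` can be combined with `typing.Annotated`, assuming Pydantic v2. The `must_be_ip` helper and the field names are invented for illustration and are not from the talk:

```python
from ipaddress import ip_address
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def must_be_ip(value: str) -> str:
    """Runs after Pydantic has already checked that the value is a str."""
    ip_address(value)  # raises ValueError for anything that is not an IP address
    return value


class Server(BaseModel):
    hostname: str
    ip: Annotated[str, AfterValidator(must_be_ip)]


print(Server(hostname="web1", ip="10.0.0.1"))

try:
    Server(hostname="web2", ip="not-an-ip")
except ValidationError as error:
    print(error)
```

The field stays a plain string in the model, which can be handy when the rest of the code is not ready for `IPv4Address` objects yet.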
Understanding the data is important. Split it up in smaller pieces and try to understand/model/validate those. Especially in a corporate setting, splitting up the problem is handy: you have some small success you can mention at the standup :-)
Do it iteratively. One piece at a time. If you find a problem, create a ticket for it. It might not get fixed, but at least you end up with a list you can slowly tackle with the rest of the organisation.
A good tip: if you discover an error in the data, provide a good, clear error message that your colleague can understand.
When you export the data, use model_dump(exclude_none=True) to exclude all the
optional fields that are not set, instead of having them as my_field: None.
Bonus: you can call YourModel.model_json_schema() to generate a JSON schema for
the Pydantic model. You can then use the JSON schema in vscode when you manually edit
your JSON.
Pydantic is great at validating individual fields and structures. But not at validating things that span the entire document, like making sure that all hostnames are unique. He used Pytest for it: he wrote such validation checks as pytest functions! You can even use Pytest test parametrization to run the same test on multiple directories.
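A document-wide check like the unique-hostname one could be sketched as follows. The data and the function names are hypothetical; in a real project the list would be loaded from the JSON files and the file would be run with pytest:

```python
from collections import Counter


def duplicate_hostnames(servers: list[dict]) -> list[str]:
    """Return every hostname that occurs more than once, sorted."""
    counts = Counter(server["hostname"] for server in servers)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical data; in practice this would be loaded from the JSON files.
SERVERS = [
    {"hostname": "web1", "ip": "10.0.0.1"},
    {"hostname": "web2", "ip": "10.0.0.2"},
    {"hostname": "db1", "ip": "10.0.0.3"},
]


def test_hostnames_are_unique():
    # Pytest collects functions named test_*; the message names the culprits.
    duplicates = duplicate_hostnames(SERVERS)
    assert not duplicates, f"Duplicate hostnames found: {duplicates}"
```

Putting the duplicate names in the assertion message ties in with the earlier tip about clear error messages for colleagues.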
PyGrunn: introducing httpxyz: forking a top-100 Python package - Michiel Beijen¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
Years ago he listened to the “corecursive” podcast (recommended by Michiel), the one
where Yann Collet got interviewed. Collet is the author of the LZ4
and zstandard (zstd) compression algorithms. In 2016 zstandard was released. In 2017 it was used
in the Linux kernel. Since 2020 it is one of the official formats in zipfiles. And in
2025 it got added to the Python standard library in version 3.14.
requests is one of the most popular Python libraries. httpx has a similar API,
but it is better. It is a top-100 PyPI package. Main advantages: HTTP/2 support and async
support.
He liked httpx a lot. And zstandard, too. But zstandard wasn’t supported by httpx. All browsers support it, but not httpx. So he made a pull request in early 2024. It got merged! But there was no new release yet. The maintainer asked if he wanted to create a PR for the release. He did it and there was a new release. Hurray!
Months later, a bug surfaced. He created a bugfix, but that wasn’t merged and wasn’t merged and wasn’t merged. And there was no new release. And then the httpx maintainer recently turned off all discussion on github. Earlier the maintainer had done the same to Django REST framework. And to mkdocs. All heavily-used packages! And all in the “encode” github organisation/company that uses donations to fund open source development. Weird…
There are also performance issues in httpx, which especially is a problem for several AI libraries.
So… he started httpxyz, it bills itself as the maintained fork of httpx. More info about the reasons for the fork at https://tildeweb.nl/~michiel/httpxyz.html .
It contains most of the bugfixes that have been pending for a while. More maintainers.
Performance is much better (they needed to fork httpcore into httpcorexyz, it is 4x faster).
API compatible. You just have to change the import. They used a PIL/pillow trick to make
sure that if you import httpxyz, later httpx imports use httpxyz instead.
There turned out to be quite a lot of small performance errors in the old code.
An important performance tip: use a client (or if you use requests, use requests.Session()):
import httpxyz
c = httpxyz.Client()
c.get(...)
c.get(...)
instead of just:
import httpxyz
httpxyz.get(...)
httpxyz.get(...)
Using a client means httpxyz (or requests) can use http features to speed up your
requests a lot. Automatic connection keepalive. No more TCP handshake for every
individual request. And no TLS/https handshake. And if your server supports http/2, the
improvement is even bigger. You do need to install httpxyz[http2] and specify
httpxyz.Client(http2=True).
Nice: httpxyz also has a command line interface.
Something he only mentioned briefly: there is OAuth2 client_credentials support. You have to define a way to grab an OAuth2 token, but the rest of the client work just uses the regular methods. Handy.
They’re on https://codeberg.org/httpxyz/httpxyz instead of on github.
PyGrunn: how to sort and route your (physical) mail - Bart Dorlandt¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
Full title: how to store and route your (physical) mail like a pro - personal edition.
How do you deal with your mail? Your physical mail? How do you store it? If the tax people want to have some information, can you find it, for instance?
Bart’s motto is there must be a better way. So what is the pragmatic approach to better physical mail handling? A mail handling system that is flexible, automated, searchable and easy to use.
He discovered paperless-ngx, an open source document management system that allows you to store, organize and search your documents. Web interface, API, it can also read emails (via the “gotenberg” plugin). It can watch folders for new docs to process. It has features for structuring, self-improving (without AI). Tags. And you can have workflows.
Nice. Documents can go to Paperless. But he still has his bookkeeping system (he has his own company). And the bookkeeper wants emails with documents that are in Paperless. Can he improve this? For instance for receipts. He didn’t want to scan all of them to PDF. And regular phone cameras don’t produce PDFs.
He started using “dropbox camera”. It works great for scanning receipts and documents. It recognizes corners and pages and enhances the contrast. It produces PDFs and uploads them to dropbox. (You must accept the fact that it ends up in the cloud: he built all this pre-Trump…)
He has a Synology NAS at home. That has a CloudSync app that you can use to sync the dropbox folder to the NAS.
He wanted to make some Python glue code. Ability to send to multiple destinations. Process folders for new files. Moving files to a “done” folder. Python looks at the various folders: he configured a specific custom “processor” per folder. So a move-to-paperless processor, for instance. And a processor that emails the scanned receipts directly to the bookkeeper.
Lots of it is automated. Just drop a PDF in a folder and the system takes care of it. Once in a while he checks Paperless and categorizes/stores what’s left in the inbox.
It was a personal project, so he used it to experiment with Dataclass and Protocol. Don’t forget to learn when you create/automate something for yourself.
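The combination of a `Protocol` for processors and dataclasses for configuration might look something like this. It is a guess at the general shape, not his code: `Processor`, `CopyTo` and `WatchedFolder` are invented names, and a real processor would talk to Paperless or send an email instead of copying a file:

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class Processor(Protocol):
    """Anything with a matching process() method counts as a processor."""

    def process(self, document: Path) -> None: ...


@dataclass
class CopyTo:
    """Hypothetical processor: copy the document to a destination folder."""
    destination: Path

    def process(self, document: Path) -> None:
        self.destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(document, self.destination / document.name)


@dataclass
class WatchedFolder:
    """One inbox folder plus the processors configured for it."""
    inbox: Path
    done: Path
    processors: list[Processor]

    def run(self) -> list[str]:
        """Process every PDF in the inbox, then move it to the done folder."""
        self.done.mkdir(parents=True, exist_ok=True)
        handled = []
        for document in sorted(self.inbox.glob("*.pdf")):
            for processor in self.processors:
                processor.process(document)
            shutil.move(str(document), str(self.done / document.name))
            handled.append(document.name)
        return handled
```

Because `Processor` is a protocol, a new destination only needs a class with a `process()` method; it does not have to inherit from anything.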
He finds it awesome that something this easy saves him hours! What can you automate in your life?
PyGrunn: layered architecture - Mike Huls¶
2026-05-08
(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).
Full title: layered architecture for readable, robust, and extensible apps.
Note: there’s a related article on his own website :-)
Layered architecture resonates with people that make okay applications: their applications do what they need to do. But once people start asking for changes, they get nervous. There might be huge functions. Or there might be no tests, “as it takes too much time to spin up the database”. Brittle applications. Small changes are disproportionately expensive.
The goal of this talk: create apps that are readable, robust and extensible. By using the principle of separating everything in layers with a specific responsibility. It is not a one-size-fits-all solution: you have to adapt it to your situation.
The layers that he proposes:
Interface: how the outside world calls your application. An API or UI.
Infrastructure and Repository: your contact with the outside world (like a database).
Infrastructure is tools. A http client. A mail sender.
Repository: persistence. SQL queries, caches. The aim is to decouple the rest of the system from db/cache/etc.
Application: heart of your system, orchestrating the business logic. The Interface talks to the Application layer, the Application layer talks to the infra/repo. And uses the Domain layer.
Domain: constraints and definitions. He often uses Pydantic models here. It reflects the business meaning. It should be strict. Fail early. The “language” used should be a shared language between the engineers and the business people.
There are some rules, like the Interface only talks to the Application, not directly to
the Infrastructure. And your code should be structured the same way. So a repo/ dir,
an infra/ dir etc.
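The layers and the “who talks to whom” rule can be sketched in a single file like this. Everything here is illustrative: the order example, the class names and the in-memory repository are assumptions, and in a real project each layer would live in its own directory:

```python
from dataclasses import dataclass


# --- Domain: strict definitions that fail early ---
@dataclass(frozen=True)
class Order:
    order_id: int
    quantity: int

    def __post_init__(self):
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


# --- Repository: persistence, hidden behind a small API ---
class InMemoryOrderRepository:
    """Stand-in for a database-backed repository."""

    def __init__(self):
        self._orders = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get(self, order_id: int) -> Order:
        return self._orders[order_id]


# --- Application: orchestrates the business logic ---
class OrderService:
    def __init__(self, repository: InMemoryOrderRepository):
        self._repository = repository

    def place_order(self, order_id: int, quantity: int) -> Order:
        order = Order(order_id=order_id, quantity=quantity)  # validates
        self._repository.save(order)
        return order


# --- Interface: how the outside world calls the application ---
def handle_request(service: OrderService, payload: dict) -> dict:
    try:
        order = service.place_order(payload["order_id"], payload["quantity"])
    except ValueError as error:
        return {"status": 400, "error": str(error)}
    return {"status": 201, "order_id": order.order_id}


service = OrderService(InMemoryOrderRepository())
print(handle_request(service, {"order_id": 1, "quantity": 2}))
print(handle_request(service, {"order_id": 2, "quantity": 0}))
```

Note that `handle_request()` never touches the repository: tests can hand the service an in-memory repository, so there is no database to spin up.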
What are the benefits?
It is more readable, you know where stuff is. This also helps with onboarding.
It is more understandable, also to business people.
Your app will be much more maintainable.
Structure is clearer.
Because you have more separation between concerns, validation is easier, so you tend to do more of it.
Evolvable. You can build upon your existing code instead of modifying it.
How to get started?
Start with separate directories. If you wonder where a function should go, it probably has too many responsibilities :-)
Add tests.
Start small.
Focus on validation. Fail early.
Isolate the business logic.
Concentrate on the borders and separations.
Something to watch out for is making your models too big. You might have to split them into separate systems with their own responsibility. A payment system, separate from the inventory system, for instance. You might want to create a small, focused shared domain system.
Djangocon EU: Django templates on the frontend? - Christophe Henry¶
2026-04-17
(One of my summaries of the 2026 Djangocon EU in Athens).
It all started with formsets: you generate a new form based on other forms. You can use it to create pretty fancy forms. But your designer can get quite creative. And you might have variable forms that have to react to user input.
A common solution is to use htmx, but that means server requests all the time. And some users have really bad connections. Regular requests aren’t handy in that scenario.
He looked at django-rusty-templates: Django’s template engine implemented in Rust. It had a template parser that he could re-use. With OXC (javascript oxidation compiler) he converted that to javascript.
That way, he could offload much of the django form creation handling to the frontend, including reacting to user input and showing alerts.
The work-in-progress project is called django-template-transpiler: https://github.com/christophehenry/django-template-transpiler . Don’t use it for production.
Unrelated photo explanation: a cat I encountered in Athens on an evening stroll in the neighbourhood behind the hotel.
Djangocon EU: auto-prefetching with model field fetch modes in Django 6.1 - Jacob Walls¶
2026-04-17
(One of my summaries of the 2026 Djangocon EU in Athens).
There’s an example to experiment with here: https://dryorm.xterm.info/fetch-modes-simple
Timeline: it will be included in Django 6.1 in August.
The reason is the 1+n problem:
books = Book.objects.all()
for book in books:
    print(book.author.name)
    # This does a fresh query for author every time.
You can solve it with select_related(relation_names) or
prefetch_related(relation_names). The first does a SQL join. The second does a
separate query per relation.
But: you might miss a relation. You might specify too many relations, getting data you don’t need. Or you might not know about the relation as the code is in a totally different part of the code.
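To make the 1+n problem concrete without a Django project, here is a plain `sqlite3` sketch that counts queries. It is not Django code: the tables, the `run()` helper and the sample rows are made up, and the second half only mimics what a prefetch-style lookup does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO book VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', 1);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# 1+n: one query for the books, then one author query per book.
books = run("SELECT id, author_id FROM book")
for _book_id, author_id in books:
    run("SELECT name FROM author WHERE id = ?", (author_id,))
one_plus_n = query_count

# Prefetch-style: one query for the books, one for all authors they refer to.
query_count = 0
books = run("SELECT id, author_id FROM book")
author_ids = sorted({author_id for _book_id, author_id in books})
placeholders = ", ".join("?" for _ in author_ids)
authors = dict(
    run(f"SELECT id, name FROM author WHERE id IN ({placeholders})", author_ids)
)
batched = query_count

print(one_plus_n, batched)
```

With three books the first approach needs four queries and the second only two; the gap grows with every extra book.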
Fetch mode is intended to solve it. You can append .fetch_mode(models.FETCH_xyz)
to your query:
models.FETCH_ONE: the current behaviour, which will be the default.
models.FETCH_PEERS: Fetch a deferred field for all instances that came from the same queryset. More or less prefetch_related in an automatic, lazy manner.
models.FETCH_RAISE: useful for development, it will raise FieldFetchBlocked. And it will thus tell you that you’ll have a performance problem and that you might need FETCH_PEERS.
This is what happens:
books = Book.objects.all().fetch_mode(models.FETCH_PEERS)
for book in books:
    # We're iterating over the query, so the query executes and grabs all books.
    print(book.author.name)
    # We accessed a relation, so at this point the prefetch_related-like
    # mechanism is fired off and all authors linked to by the books are
    # grabbed in one single query.
You can write your own fetch modes, for instance if you only want a warning instead of raising an error.
Djangocon EU: zero-migration encryption - Vjeran Grozdanic¶
2026-04-17
(One of my summaries of the 2026 Djangocon EU in Athens).
Full title: zero-migration encryption: building drop-in encrypted field in Django.
He works at Sentry. Huge site with a Django backend and thousands of requests per second.
He had to add a new table to store 3rd party API credentials. Oh: should this be encrypted? Yes. But: each team has its own way to encrypt data. And there were at least 10 encryption keys here and there (as environment variables). And tens of places where encryption/decryption happens.
So: better to build a generic solution. Or use an existing generic solution. And yes,
there are multiple libraries. EncryptedCharField looked nice. But the problem was
all the existing data in the various places. Sentry is not a site that you can shut down
for a while, so you have to do it with zero downtime. This means you can never change an
existing column type.
A solution could be to add a new encrypted field next to the existing one. Then fill it and backfill it and make sure no new data is written to the old field and then you can remove the old field. But that’s quite a job with all the different locations that had to be changed.
A Field class in Django has get_prep_value() and from_db_value(). Those are
called before storing data in the database and after grabbing it from the database. You
could create a new CharField-like field and start to encrypt values in
get_prep_value and decrypt the other way.
You’d have to be able to recognise the old un-encrypted values. A solution: prefix
encrypted values with enc:. Also key rotation can be handled this way, by including
that in the prefix (enc:key2:).
But there’s also a jsonb field. They solved that by encrypting the JSON and writing a JSON object to the database with the encrypted JSON in a field and also the encryption key info.
The code is in the Sentry repo.