Reinout van Rees’ weblog

Python Leiden (NL) meetup summaries

2026-07-02

Tags: python, pun

Two summaries of the July 2 2026 Python meetup in Leiden. I’ve omitted one, “Python with Karel” by EiEi Tun, as I’ve made a summary of that talk in Utrecht a month ago, already :-)

Building modern internal team CLIs with incremental automation - Farid Nouri Neshat

Obligatory xkcd cartoons: https://xkcd.com/974 and https://xkcd.com/1319 and https://xkcd.com/1205

Toil: manual, repetitive, automatable, distracting you from your real work, no enduring value. Yes, he likes to automate things :-) Some examples of repetitive manual tasks:

  • Creating dev containers.

  • Gathering data for troubleshooting.

  • Something that needs to be set manually in a database.

  • Setting up a new AWS account.

  • Creating a new dev environment on the new colleague’s laptop.

How to automate? Do it iteratively! Your boss might not like you to spend a day automating the task. But if you do it one small step at a time…

  • Do it manually the very first time.

  • Then start with documenting the steps.

  • Then turn it into a do-nothing scaffold script:

    def step1():
        print("Open the AWS page manually")
        input("Press enter to continue")
    
  • Every time you do the task, automate a small bit and flesh out the script over time.

  • After many iterations, you’ll have automated it fully!
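The steps above could look like this after a few iterations. This is my own minimal sketch, not code from the talk: the step names and the generated account name are hypothetical. One step is still manual, the other has been automated.

```python
"""Do-nothing script that gets automated one step at a time (hypothetical steps)."""
import datetime


def manual(instruction, ask=input):
    """A step that is still done by hand: show the instruction and wait."""
    print(instruction)
    ask("Press enter to continue")


def step1_open_console(ask=input):
    # Still manual: nobody has automated this bit yet.
    manual("Open the AWS page manually", ask)


def step2_pick_account_name(ask=input):
    # This step used to be manual; it was automated in a later iteration.
    name = f"team-sandbox-{datetime.date.today():%Y%m%d}"
    print(f"Suggested account name: {name}")
    return name


STEPS = [step1_open_console, step2_pick_account_name]


def run(ask=input):
    """Run all steps in order and return what each step produced."""
    return [step(ask) for step in STEPS]
```

Calling run() walks you through the task. Passing a different ask function makes it possible to test the script without anyone pressing enter.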

“I don’t have time to automate it”, you might say? Well, why don’t you have time? Is it perhaps because you haven’t automated things?

A good motivator: if you hate the task… Hate driven development :-)

After a while, you’ll have lots of random scripts. Stuff them in a repository. Slowly document them. Try to get them to use the same conventions. Perhaps you can re-use functionality in a library.

Something you need quickly is a CLI, a command line interface. He likes typer to make his CLIs: much nicer than Python’s own “argparse”:

import typer

app = typer.Typer()


@app.command()
def hello(name: str):
    print(f"Hello {name}")


if __name__ == "__main__":
    app()

AI comment: AI agents can use your CLI. Use the docstring and help functions to help orient the AI to your custom CLI. You can, for instance, use a CLI to give the agent access to your database’s content without giving it direct access to the database.

AI agents can be dangerous. A solution might be to use “feature flags”. You can disable production access until you enable some setting or flag that AI doesn’t know about.
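A feature flag like that could be as simple as an environment variable. A minimal sketch under my own assumptions: the flag name TEAMCLI_ALLOW_PRODUCTION and the delete_customer command are hypothetical, and nothing is actually deleted.

```python
import os

# Hypothetical flag name; the point is that a human sets it, outside the agent's view.
PRODUCTION_FLAG = "TEAMCLI_ALLOW_PRODUCTION"


def production_enabled(environ=None):
    """Return True only when the flag is explicitly set to "1"."""
    environ = os.environ if environ is None else environ
    return environ.get(PRODUCTION_FLAG) == "1"


def delete_customer(customer_id, target="staging", environ=None):
    """Pretend to delete a customer; refuse on production unless the flag is on."""
    if target == "production" and not production_enabled(environ):
        raise PermissionError("Production access is disabled.")
    return f"deleted customer {customer_id} on {target}"
```

The default is the safe one: without the flag, only non-production targets work.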

He also mentioned the rich library for formatting and colorizing your textual output.

What I’ve learned maintaining the MCP Python SDK - Marcelo Trylesinski

He’s one of the three maintainers of the MCP Python SDK. SDK = software development kit. MCP: model context protocol, so a way for AI agents to connect to some other piece of software.

MCP is basically “OpenAPI for your agents”. It exposes three things from the server side:

  • tools

  • resources

  • prompts (though tools are mostly the only thing that is used)

The client provides:

  • sampling

  • elicitation (= “producing a reaction”, so mostly it means that the AI server asks you questions)

  • roots

  • logging

The MCP spec kept growing. But clients never caught up, so it was mostly only the “tools” part that got used.

A big problem is that servers cannot scale. The AI server might have lots of machines with a loadbalancer in front of it, but as a user you need to stay connected to the one machine that has your context.

There’s a new version of the spec (final version this month) that actually removed stuff, instead of growing. The “client provides” list mentioned above? Sampling, roots and logging are gone as they were hardly used.

MCP is now a small core, with optional extensions. Examples: tasks, MCP apps, enterprise auth.

The MCP Python SDK supports the new version, too. He demonstrated a small Python script that had a function that said you could have three bananas. He connected it via MCP to Claude and could ask Claude for the number of available bananas. It got back, via the Python tool, with the correct answer.

Utrecht (NL) Python meetup summaries

2026-05-21

Tags: python, pun, django

I made summaries at the 4th PyUtrecht meetup (in Nieuwegein, at Qstars this time).

Qstars IT and open source - Derk Weijers

Qstars IT hosted the meeting. It is an infra/programming/consultancy/training company that uses lots of Python.

They also love open source and try to sponsor where possible.

One of the things they are going to open source (next week) is a “cable thermal model”, a calculation method to determine the temperature of underground electricity cables. The Netherlands has a lot of net congestion… So if you can have a better grid usage by calculating the real temperature of cables instead of using an estimated temperature, you might be able to increase the load on the cable without hitting the max temperature. Coupled with “measurement tiles” that actually monitor the temperature.

They build it for one of the three big electricity companies in the Netherlands and got permission to open source it so that the other companies can also use it. They hope it will have real impact.

He explained an open source project he started personally: “the space devs”. Integrating rocket launch data and providing an API. Now it has five core developers (and got an invitation to the biggest space conference, two years ago!)

Some benefits from writing open source:

  • You build your own portfolio.

  • You can try new technologies. Always nice to have the skill to learn new things.

  • You improve your communication skills (both sending and receiving).

  • You can make your own decisions.

  • You write in the open.

  • Perhaps you help others with your work.

  • You could be part of a community.

  • It is your code.

How to start?

  • Reach out to other communities.

  • Read and improve documentation.

  • Find good first issues.

  • Be proactive.

  • Don’t be afraid to ask questions (and don’t let negative comments discourage you).

When working on open source, make sure you take security seriously. People nowadays like to use supply chain attacks via open source software. So use 2FA and look at your deployment procedure.

Learning Python with Karel - EiEi Tun H

What is Karel? A teaching tool/robot for learning programming. You need to steer a robot in an area and have it pick up or dump objects. And… in the meantime you learn how to use functions and loops.

Karel only has a turn_left() function. So if you want to have it turn right, it is handy to add a function for it:

def turn_right():
    turn_left()
    turn_left()
    turn_left()

Simple, but you have to learn it sometime!

In her experience, AI can help a lot when learning to code: it explains stuff to you like you’re a five-year-old, and that’s perfect.

If you want to play with Karel: https://compedu.stanford.edu/karel-reader/docs/python/en/ide.html

JSON freedom or chaos; how to trust your data - Bart Dorlandt

For this talk, I’m pointing at the PyGrunn summary I made three weeks ago. I liked the talk!

Practical software architecture for Python developers - Henk-Jan van Hasselaar

There are several levels of architecture. Organization level. System level. Application, Code.

Cohesion: “the degree to which the elements inside a module belong together”. What does it mean? Working towards the same goal or function. Together means something like distance. When two functions are in separate libraries, they’re not together. It is also important for cognitive load.

Coupling: loose coupling versus high coupling. You want loose coupling, so that changes in one module don’t affect another module.

You don’t really have to worry about coupling and cohesion in existing systems that don’t need to be changed. But when you start changing or build something new: take coupling/cohesion into account.

Software architecture is a tradeoff. Separation of concerns is fine, but it creates layers and thus distance, for instance.

Python is one of the most difficult languages when it comes to clean coding and clean architecture. You’re allowed to do so many dirty things! Typing isn’t even mandatory…

He showed a simple REST API as an example. Database model + view. But when you change the database model, like a field name, that field name automatically changes in the API response. So your internal database structure is coupled to the function at the customer that consumes the API.

What you actually need to do is to have a better “contract”. A domain model. In his example code, it was a Pydantic model with a fixed set of fields. A converter modifies the internal database model to the domain model.

You can also have services, generic pieces of code that work on domain models. And adapters to and from domain models, like converting domain models to csv.
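To make the database model, domain model, converter and adapter a bit more concrete, here is a small sketch of my own. The talk used a Pydantic model; I swapped in standard-library dataclasses to keep it self-contained. All names (CustomerRow, Customer, to_domain, to_csv) are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class CustomerRow:
    """Internal database model (hypothetical); free to change."""
    id: int
    full_nm: str      # awkward internal column name
    email_addr: str


@dataclass(frozen=True)
class Customer:
    """Domain model: the fixed contract that API consumers see."""
    id: int
    name: str
    email: str

    def __post_init__(self):
        # Minimal validation, failing early on bad data.
        if "@" not in self.email:
            raise ValueError(f"invalid email: {self.email!r}")


def to_domain(row: CustomerRow) -> Customer:
    """Converter: the only place that knows the database field names."""
    return Customer(id=row.id, name=row.full_nm, email=row.email_addr)


def to_csv(customers) -> str:
    """Adapter from domain models to CSV."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "name", "email"], lineterminator="\n"
    )
    writer.writeheader()
    for customer in customers:
        writer.writerow(asdict(customer))
    return buffer.getvalue()
```

Renaming full_nm in the database now only touches to_domain(); the API response and the CSV export keep their field names.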

Finding the balance is the software architect’s job.

What is the least you should do as a software developer? At least create a domain layer, including a validator.

There was a question about how to do this with Django: it is hard. Django’s models are everywhere. And you really need a clean domain layer…

PyGrunn: Python at Spotify: twenty years - Gijs Molenaar

2026-05-08

Tags: python, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

His parents owned a record store in some Dutch town. First records, then CDs. A social shop where you would gather to listen to CDs to determine whether to buy them. His father’s brother actually started the oldest record store in Amsterdam, Concerto. It still exists.

Then the world changed. Napster, CD-burners. Illegal downloading. (He himself was one of the downloaders). His parents stopped selling music in 2008. He himself got into engineering. He ended up in South Africa, doing workflow orchestration for radio telescopes. There he introduced Docker and containers. He gave a talk at PyGrunn about it in 2016.

While he was in the South African desert, in Sweden someone started the Spotify company. He actually had used a library (“luigi”) made by Spotify in his telescope work.

He tried to get a job at Spotify and succeeded. So the kid who grew up in a record store now works at the company that reinvented how people listen to music.

It all started for Spotify with Java (jboss 5). They hated it. It was replaced with Python: the reason was that nobody hated it. 80% of the code became python. A lot was async: they used “twisted” in the beginning, later gevent and greenlets.

But the Python GIL (global interpreter lock) made multi-core impossible. So you needed to use multiple processes, each with their own overhead. They also didn’t like the lack of type safety: they have 100+ services. Some of those problems are partially solved now, but at the time they switched back to Java. Partially it was cultural: they could hire quite a few Oracle employees that knew Java.

Python was still used a lot, just not for the core services. Nowadays, Python is used a lot for machine learning. They have 950 Python services, 470 libraries. 180000 Python files in 7500 repositories. 322x FastAPI, 272x Streamlit repositories. And still lots of luigi. Luigi is the framework that inspired airflow: it has lots of stars on github, the most of all their open source repositories.

They now also started pedalboard, a nice Pythonic way of modifying audio (it is a wrapper around a C++ library). Also nice: https://backstage.spotify.com/ , a backend/portal for collecting all the developer-related data. Workflow statuses and so on. (The backend is open source, the dashboard not).

At Spotify, the programmers are really encouraged to use agentic programming. He hasn’t touched his editor in the last six months! It really changed his life. Initially he was a bit depressed: can someone who’s less talented but with the same amount of tokens really do the same as me? But it is really a next level and he gets amazing productivity out of it. Having unlimited tokens helps.

It changes open source. Forking used to be a declaration of war. Nowadays it is a sign of popularity. You can fork something and have AI keep it up to date with minimal engineer effort. When the cost of maintaining your own fork approaches zero, what does that do with the economics of open source? Is cooperation still a thing? What is the goal/effect of open sourcing? Or is it only a way for AIs to find security bugs in your software?

His parents ran a record store for 42 years. Then technology disrupted the music industry. They had to reinvent themselves. It was scary and sad, but they adapted. Now the same force is disrupting our industry. Where will it go?

https://reinout.vanrees.org/images/2026/lac-de-kruth7.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

PyGrunn: list-man, pragmatic system integration - Doeke Zanstra

2026-05-08

Tags: python, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

When automating in a big company with many systems, you often end up with spaghetti: many systems connecting to a lot of the others… A common solution is to have a “bus architecture”. Generic existing “enterprise service bus” solutions were clearly overkill, so he proposed an alternative solution.

He made a couple of assumptions/choices. All data is tabular data. He wanted to store a copy of data in a database. SQL views to access the data. So: multiple sources that he wanted to import in a central database (which would function as a sort of “read-only enterprise service bus”). And a generic sql/view-based way of accessing the data.

He initially focused on read-only data. And he started real simple. Just a bash script that ran regularly that scraped data from other systems and injected it in the database.

In the second version of the system, for every system he wrote a target/command in a Makefile. Everything that needed to be scraped got its own table (called a “list” in his system). Lists could be compared. The first killer app was a comparison between a telephone list and the list of employees so that differences could be consolidated.
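Comparing two lists is the kind of thing a SQL view does well. A minimal sketch with sqlite3, under my own assumptions: the table names, columns and sample rows are hypothetical, and the talk did not show which database or schema he used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT PRIMARY KEY);
    CREATE TABLE phone_list (name TEXT PRIMARY KEY, phone TEXT);

    INSERT INTO employees VALUES ('Alice'), ('Bob');
    INSERT INTO phone_list VALUES ('Bob', '101'), ('Carol', '102');

    -- One view that reports every name that occurs in only one of the lists.
    CREATE VIEW list_differences AS
        SELECT name, 'missing from phone list' AS problem
        FROM employees WHERE name NOT IN (SELECT name FROM phone_list)
        UNION ALL
        SELECT name, 'not an employee' AS problem
        FROM phone_list WHERE name NOT IN (SELECT name FROM employees);
""")

differences = conn.execute(
    "SELECT name, problem FROM list_differences ORDER BY name"
).fetchall()
```

Here Alice is missing from the phone list and Carol is on it without being an employee; Bob is in both lists and does not show up.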

For the third version, he started using more and more python. CSV file imports. Downloaders from REST APIs. All configurable so that he could use the same python script for many different sources.

He now had a simple system for which he could write views and exports.

  • Publishing data on the intranet via the “jekyll” static site generator. For instance a “mug book” of all employees.

  • And regularly exporting a list of names+emailaddresses in a format suitable for the multifunctional printer: to make it easy to select your email address when scanning on the printer.

  • An export to a google spreadsheet that combined the holiday spreadsheet with the data on part-time days.

Security was handled with a role-based system.

https://reinout.vanrees.org/images/2026/lac-de-kruth6.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

PyGrunn: JSON freedom or chaos, how to trust your data - Bart Dorlandt

2026-05-08

Tags: python, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

Subtitle: a real-world journey from chaos to confidence using Pydantic and Pytest.

Ideally, you’d have perfect JSON files with a fixed format and rigorous validation, preferably generated. But in a customer project, the other programmers weren’t too happy about it. They had massive JSON files, partially manually crafted. Some were just one single line and others were vertically aligned. And perhaps someone depended on the specific format for some “sed” or “awk” hacking… So whatever happens: it works, don’t touch it.

The freedom trap. No schema means no contract. No contract means no trust. Fields accumulate, nobody removes them: “someone might be using it”. Multi-team challenges: not everyone has the same skillset.

He wanted a different future: a trusted future. Validated and tested and formatted.

Pydantic is a Python library for data validation using Python type annotations. You can define a data model with type hints. It will automatically validate and parse data according to those models:

from ipaddress import IPv4Address
from pydantic import BaseModel

class Server(BaseModel):
    hostname: str
    ip: IPv4Address
    ...

Make sure to look at pydantic-extra-types, they have lots of handy types like “two-character country code”.

There’s AfterValidator, you can use it to add a second validator to a field. So first the str type to validate it is a string, then afterwards some ip address validator or so.
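A sketch of how that could look, assuming Pydantic v2 (where AfterValidator is combined with typing.Annotated). The must_be_ipv4 and validation_errors helpers are my own hypothetical names, not from the talk.

```python
from ipaddress import IPv4Address
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def must_be_ipv4(value: str) -> str:
    """Second validation step: runs after Pydantic has checked the value is a str."""
    IPv4Address(value)  # raises a ValueError subclass for anything that is not IPv4
    return value


class Server(BaseModel):
    hostname: str
    # First the str check, afterwards the IP address check.
    ip: Annotated[str, AfterValidator(must_be_ipv4)]


def validation_errors(data: dict) -> list:
    """Return the error messages for one record (empty list when valid)."""
    try:
        Server(**data)
    except ValidationError as error:
        return [item["msg"] for item in error.errors()]
    return []
```

A helper like validation_errors() is also a handy place to turn Pydantic's messages into something a colleague can act on.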

Understanding the data is important. Split it up in smaller pieces and try to understand/model/validate those. Especially in a corporate setting, splitting up the problem is handy: you have some small success you can mention at the standup :-)

Do it iteratively. One piece at a time. If you find a problem, create a ticket for it. It might not get fixed, but at least you end up with a list you can slowly tackle with the rest of the organisation.

A good tip: if you discover an error in the data, provide a good, clear error message that your colleague can understand.

When you export the data, use model_dump(exclude_none=True) to leave out all the optional fields that are empty instead of having them as my_field: None.

Bonus: you can call YourModel.model_json_schema() to generate a JSON schema for the Pydantic model. You can then use the JSON schema in vscode when you manually edit your JSON.
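Both export tips in one small sketch, assuming Pydantic v2. I use exclude_none here, which is the model_dump() option I know for dropping None values. The Server model and its description field are hypothetical.

```python
from typing import Optional

from pydantic import BaseModel


class Server(BaseModel):
    hostname: str
    description: Optional[str] = None  # optional field, often left empty


server = Server(hostname="web1")

full_export = server.model_dump()                      # keeps description=None
compact_export = server.model_dump(exclude_none=True)  # drops fields that are None

schema = Server.model_json_schema()  # plain dict, ready for json.dumps()
```

The schema dict can be written to a file with json.dump() and then pointed at from your editor.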

Pydantic is great at validating individual fields and structures. But not at validating things that span the entire document, like making sure that all hostnames are unique. He used Pytest for it: he wrote such validation checks as pytest functions! You can even use Pytest test parametrization to run the same test on multiple directories.
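A document-wide check like the unique hostnames one could be written as a plain function that pytest picks up. This is my own sketch: the SERVERS data is hypothetical and would normally be loaded from the JSON files.

```python
from collections import Counter

# Hypothetical data; in practice this would be loaded from the JSON files.
SERVERS = [
    {"hostname": "web1", "ip": "10.0.0.1"},
    {"hostname": "web2", "ip": "10.0.0.2"},
]


def duplicate_hostnames(servers):
    """Return the hostnames that occur more than once, sorted."""
    counts = Counter(server["hostname"] for server in servers)
    return sorted(name for name, count in counts.items() if count > 1)


def test_hostnames_are_unique():
    # pytest collects functions named test_*; a plain assert is enough.
    duplicates = duplicate_hostnames(SERVERS)
    assert not duplicates, f"Duplicate hostnames: {duplicates}"
```

The assert message doubles as the clear error message for whoever broke the data.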

https://reinout.vanrees.org/images/2026/lac-de-kruth5.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

PyGrunn: introducing httpxyz: forking a top-100 Python package - Michiel Beijen

2026-05-08

Tags: python, django, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

Years ago he listened to the “corecursive” podcast (recommended by Michiel), the one where Yann Collet got interviewed. He’s the author of the LZ4 and zstandard (zstd) compression algorithms. In 2016 zstandard was released. In 2017 it was used in the linux kernel. Since 2020 it is one of the official formats in zipfiles. And in 2025 it got added to the Python standard library in version 3.14.

requests is one of the most popular Python libraries. httpx has a similar API, but it is better. It is a top-100 PyPI package. Main advantages: HTTP/2 support and async support.

He liked httpx a lot. And zstandard, too. But zstandard wasn’t supported by httpx. All browsers support it, but not httpx. So he made a pull request in early 2024. It got merged! But there was no new release yet. The maintainer asked if he wanted to create a PR for the release. He did it and there was a new release. Hurray!

Months later, a bug surfaced. He created a bugfix, but that wasn’t merged and wasn’t merged and wasn’t merged. And there was no new release. And then the httpx maintainer recently turned off all discussion on github. Earlier the maintainer had done the same to django restframework. And to mkdocs. All heavily-used packages! And in the “encode” github organisation/company that uses donations to fund open source development. Weird…

There are also performance issues in httpx, which especially is a problem for several AI libraries.

So… he started httpxyz, it bills itself as the maintained fork of httpx. More info about the reasons for the fork at https://tildeweb.nl/~michiel/httpxyz.html .

It contains most of the bugfixes that have been pending for a while. More maintainers. Performance is much better (they needed to fork httpcore into httpcorexyz, it is 4x faster). API compatible. You just have to change the import. They used a PIL/pillow trick to make sure that if you import httpxyz, later httpx imports use httpxyz instead.
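I don't know how httpxyz implements the import trick. One way such aliasing can work in Python is through sys.modules, sketched below with stand-in module names (httpxyz_demo and httpx_demo are hypothetical, so nothing real gets overwritten).

```python
import importlib
import sys
import types

# Stand-in for the fork; a real package would simply be imported here.
fork = types.ModuleType("httpxyz_demo")
fork.get = lambda url: f"fetched {url} with the fork"

# Register the fork under both names. Python checks sys.modules first, so a
# later "import httpx_demo" gets the fork without looking for another package.
sys.modules["httpxyz_demo"] = fork
sys.modules["httpx_demo"] = fork

aliased = importlib.import_module("httpx_demo")
```

After this, code that imports the old name ends up with the very same module object as code that imports the new one.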

There turned out to be quite a lot of small performance errors in the old code.

An important performance tip: use a client (or if you use requests, use requests.Session()):

import httpxyz

c = httpxyz.Client()
c.get(...)
c.get(...)

instead of just:

import httpxyz

httpxyz.get(...)
httpxyz.get(...)

Using a client means httpxyz (or requests) can use HTTP features to speed up your requests a lot. Automatic connection keepalive. No more TCP handshake for every individual request. And no TLS/https handshake. And if your server supports http/2, the improvement is even bigger. You do need to install httpxyz[http2] and specify httpxyz.Client(http2=True).

Nice: httpxyz also has a command line interface.

Something he only mentioned briefly: there is oauth2 client_credentials support. You have to define a way to grab an oauth2 token, but the rest of the client work just uses the regular methods. Handy.

They’re on https://codeberg.org/httpxyz/httpxyz instead of on github.

https://reinout.vanrees.org/images/2026/lac-de-kruth4.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

PyGrunn: how to sort and route your (physical) mail - Bart Dorlandt

2026-05-08

Tags: python, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

Full title: how to store and route your (physical) mail like a pro - personal edition.

How do you deal with your mail? Your physical mail? How do you store it? If the tax people want to have some information, can you find it, for instance?

Bart’s motto is there must be a better way. So what is the pragmatic approach to better physical mail handling? A mail handling system that is flexible, automated, searchable and easy to use.

He discovered paperless-ngx, an open source document management system that allows you to store, organize and search your documents. Web interface, API, it can also read emails (via the “gotenberg” plugin). It can watch folders for new docs to process. It has features for structuring, self-improving (without AI). Tags. And you can have workflows.

Nice. Documents can go to Paperless. But he still has his bookkeeping system (he has his own company). And the bookkeeper wants emails with documents that are in Paperless. Can he improve this? For instance for receipts. He didn’t want to scan all of them to PDF. And regular phone cameras don’t produce PDFs.

He started using “dropbox camera”. It works great for scanning receipts and documents. It recognizes corners and pages and enhances the contrast. It produces PDFs and uploads them to dropbox. (You must accept the fact that it ends up in the cloud: he built all this pre-Trump…)

He has a Synology NAS at home. That has a CloudSync app that you can use to sync the dropbox folder to the NAS.

He wanted to make some Python glue code. Ability to send to multiple destinations. Process folders for new files. Moving files to a “done” folder. Python looks at the various folders: he configured a specific custom “processor” per folder. So a move-to-paperless processor, for instance. And a processor that emails the scanned receipts directly to the bookkeeper.

Lots of it is automated. Just drop a PDF in a folder and the system takes care of it. Once in a while he checks Paperless and categorizes/stores what’s left in the inbox.

It was a personal project, so he used it to experiment with Dataclass and Protocol. Don’t forget to learn when you create/automate something for yourself.
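A per-folder processor setup with Protocol and dataclasses might look roughly like this. It is my own sketch, not his code: the CopyTo and Remember processors are hypothetical stand-ins for "send to Paperless" and "email the bookkeeper".

```python
import shutil
from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol


class Processor(Protocol):
    """Anything with a process() method can handle a new file."""

    def process(self, path: Path) -> None: ...


@dataclass
class CopyTo:
    """Copy the file to a destination folder (stand-in for 'send to Paperless')."""
    destination: Path

    def process(self, path: Path) -> None:
        self.destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, self.destination / path.name)


@dataclass
class Remember:
    """Record the file name (stand-in for 'email it to the bookkeeper')."""
    seen: list = field(default_factory=list)

    def process(self, path: Path) -> None:
        self.seen.append(path.name)


def handle_folder(inbox: Path, processors: list, done: Path) -> int:
    """Run every processor on each PDF in inbox, then move it to done."""
    done.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(inbox.glob("*.pdf")):
        for processor in processors:
            processor.process(path)
        shutil.move(str(path), str(done / path.name))
        count += 1
    return count
```

Because Processor is a Protocol, a new destination only needs a class with a process() method; no base class or registration is required.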

He finds it awesome that something this easy saves him hours! What can you automate in your life?

https://reinout.vanrees.org/images/2026/lac-de-kruth3.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

PyGrunn: layered architecture - Mike Huls

2026-05-08

Tags: python, pygrunn

(One of my summaries of the 2026 one-day PyGrunn conference in Groningen, NL).

Full title: layered architecture for readable, robust, and extensible apps.

Note: there’s a related article on his own website :-)

Layered architecture resonates with people that make okay applications: their applications do what they need to do. But once people start asking for changes, they get nervous. There might be huge functions. Or there might be no tests, “as it takes too much time to spin up the database”. Brittle applications. Small changes are disproportionately expensive.

The goal of this talk: create apps that are readable, robust and extensible. By using the principle of separating everything in layers with a specific responsibility. It is not a one-size-fits-all solution: you have to adapt it to your situation.

The layers that he proposes:

  • Interface: how the outside world calls your application. An API or UI.

  • Infrastructure and Repository: your contact with the outside world (like a database).

    • Infrastructure is tools. A http client. A mail sender.

    • Repository: persistence. SQL queries, caches. The aim is to decouple the rest of the system from db/cache/etc.

  • Application: heart of your system, orchestrating the business logic. The Interface talks to the Application layer, the Application layer talks to the infra/repo. And uses the Domain layer.

  • Domain: constraints and definitions. He often uses Pydantic models here. It reflects the business meaning. It should be strict. Fail early. The “language” used should be a shared language between the engineers and the business people.

There are some rules, like the Interface only talks to the Application, not directly to the Infrastructure. And your code should be structured the same way. So a repo/ dir, an infra/ dir etc.
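The four layers, squeezed into one file for illustration. This is my own minimal sketch: in a real project each part would live in its own directory, the talk mentioned Pydantic for the domain models (I use a standard-library dataclass here), and all names are hypothetical.

```python
from dataclasses import dataclass


# Domain: definitions and constraints, strict and failing early.
@dataclass(frozen=True)
class Order:
    order_id: int
    quantity: int

    def __post_init__(self):
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


# Repository: persistence, here just a dict instead of a real database.
class InMemoryOrderRepository:
    def __init__(self):
        self._orders = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get(self, order_id: int) -> Order:
        return self._orders[order_id]


# Application: orchestrates the business logic using domain and repository.
class OrderService:
    def __init__(self, repository):
        self._repository = repository

    def place_order(self, order_id: int, quantity: int) -> Order:
        order = Order(order_id=order_id, quantity=quantity)
        self._repository.save(order)
        return order


# Interface: how the outside world calls in; it only talks to the application.
def handle_place_order(service: OrderService, payload: dict) -> dict:
    try:
        order = service.place_order(payload["order_id"], payload["quantity"])
    except ValueError as error:
        return {"status": 400, "error": str(error)}
    return {"status": 201, "order_id": order.order_id}
```

Swapping the in-memory repository for a real database only touches the repository layer, which is also what makes the application layer testable without spinning up a database.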

What are the benefits?

  • It is more readable, you know where stuff is. This also helps with onboarding.

  • It is more understandable, also to business people.

  • Your app will be much more maintainable.

  • Structure is clearer.

  • Because you have more separation between concerns, validation is easier, so you tend to do more of it.

  • Evolvable. You can build upon your existing code instead of modifying it.

How to get started?

  • Start with separate directories. If you wonder where a function should go, it probably has too many responsibilities :-)

  • Add tests.

  • Start small.

  • Focus on validation. Fail early.

  • Isolate the business logic.

  • Concentrate on the borders and separations.

Something to watch out for is making your models too big. You might have to split them into separate systems with their own responsibility. A payment system, separate from the inventory system, for instance. You might want to create a small, focused shared domain system.

https://reinout.vanrees.org/images/2026/lac-de-kruth1.jpg

Unrelated photo: the “lac de Kruth-Wildenstein” reservoir during a family holiday in France in 2006.

Djangocon EU: Django templates on the frontend? - Christophe Henry

2026-04-17

Tags: django, djangocon

(One of my summaries of the 2026 Djangocon EU in Athens).

It all started with formsets: you generate a new form based on other forms. You can use it to create pretty fancy forms. But your designer can get quite creative. And you might have variable forms that have to react to user input.

A common solution is to use htmx, but that means server requests all the time. And some users have really bad connections. Regular requests aren’t handy in that scenario.

He looked at django-rusty-templates: Django’s template engine implemented in Rust. It had a template parser that he could re-use. With OXC (javascript oxidation compiler) he converted that to javascript.

That way, he could offload much of the django form creation handling to the frontend, including reacting to user input and showing alerts.

The work-in-progress project is called django-template-transpiler: https://github.com/christophehenry/django-template-transpiler . Don’t use it for production.

https://reinout.vanrees.org/images/2026/kat7.jpeg

Unrelated photo explanation: a cat I encountered in Athens on an evening stroll in the neighbourhood behind the hotel.

Djangocon EU: zero-migration encryption - Vjeran Grozdanic

2026-04-17

Tags: django, djangocon

(One of my summaries of the 2026 Djangocon EU in Athens).

Full title: zero-migration encryption: building drop-in encrypted field in Django.

He works at Sentry. Huge site with a Django backend and thousands of requests per second.

He had to add a new table to store 3rd party API credentials. Oh: should this be encrypted? Yes. But: each team has its own way to encrypt data. And there were at least 10 encryption keys here and there (as environment variables). And tens of places where encryption/decryption happens.

So: better to build a generic solution. Or use an existing generic solution. And yes, there are multiple libraries. EncryptedCharField looked nice. But the problem was all the existing data in the various places. Sentry is not a site that you can shut down for a while, so you have to do it with zero downtime. This means you can never change an existing column type.

A solution could be to add a new encrypted field next to the existing one. Then fill it and backfill it and make sure no new data is written to the old field and then you can remove the old field. But that’s quite a job with all the different locations that had to be changed.

A Field class in Django has get_prep_value() and from_db_value(). Those are called before storing data in the database and after grabbing it from the database. You could create a new CharField-like field and start to encrypt values in get_prep_value and decrypt the other way.

You’d have to be able to recognise the old un-encrypted values. A solution: prefix encrypted values with enc:. Also key rotation can be handled this way, by including that in the prefix (enc:key2:).
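The prefix idea can be sketched without Django. This is my own illustration, not Sentry's code: the class only mimics the two hook names (Django's real from_db_value also receives expression and connection arguments), and base64 stands in for a real cipher, which it is not.

```python
import base64

PREFIX = "enc"


def _scramble(text: str) -> str:
    # Placeholder only: base64 is NOT encryption. A real field would call a
    # proper cipher here, selected by key id.
    return base64.urlsafe_b64encode(text.encode()).decode()


def _unscramble(text: str) -> str:
    return base64.urlsafe_b64decode(text.encode()).decode()


class EncryptedTextSketch:
    """Mimics the two Django Field hooks without importing Django."""

    def __init__(self, key_id: str = "key2"):
        self.key_id = key_id

    def get_prep_value(self, value):
        """Called before writing: store 'enc:<key id>:<ciphertext>'."""
        if value is None:
            return None
        return f"{PREFIX}:{self.key_id}:{_scramble(value)}"

    def from_db_value(self, value):
        """Called after reading: decode prefixed values, pass old ones through."""
        if value is None or not value.startswith(f"{PREFIX}:"):
            return value  # legacy plaintext row, written before the change
        _, key_id, payload = value.split(":", 2)
        # key_id says which key to use; the placeholder ignores it.
        return _unscramble(payload)
```

Old rows keep working because unprefixed values pass through untouched, and they get the prefix the next time they are saved. One thing to watch: a legacy plaintext value that happens to start with "enc:" would be misread.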

But there’s also a jsonb field. They solved that by encrypting the JSON and writing a JSON object to the database that holds the encrypted JSON in one field, plus the encryption key info.

The code is in the sentry repo.

https://reinout.vanrees.org/images/2026/kat2.jpeg

Unrelated photo explanation: a cat I encountered in Athens on an evening stroll in the neighbourhood behind the hotel.
