(Summaries of the talks at the February 2022 Rotterdam Python meetup.)
The meetup was partly live, partly online. The technical setup worked surprisingly well:
They had a Microsoft Teams channel for the online folks.
A laptop was connected to that same channel and showed it on a big screen via a beamer in the room for the “live” folks.
The speakers had to connect to the Teams channel to show their slides both online and automatically also in the room.
A big microphone gave pretty good sound, even though it was some four meters away from the speaker.
Worked fine! Strange having a meeting without having to wrestle with HDMI adapters :-)
He showed https://github.com/VanOord/pandas-xlsx-tables
When using a jupyter notebook and pandas, you can easily load csv files and do stuff with them. Make nice graphs, for instance.
But… colleagues want xls sheets…. So you can use a pandas xls exporter. But the output is a raw xls sheet. It works much better if you format the data in excel as a “table”: “format as table”. It sounds like it only formats it visually, but it actually figures out the headings and field types. You get proper sorting and so on.
So he wrote a new exporter that exports the data as a nicely formatted table in excel. Much more useful.
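To give an idea of what “exporting as a table” means in code, here is a minimal sketch. It does not use the pandas-xlsx-tables API (I didn't write that down), but plain pandas with the XlsxWriter engine, which both need to be installed. The function name `export_as_table` and the sheet name are my own assumptions for illustration.

```python
import pandas as pd


def export_as_table(df: pd.DataFrame, path: str, sheet_name: str = "data") -> None:
    """Write df to an .xlsx file and register the cell range as an Excel table.

    Hypothetical helper: a sketch of the general technique, not the
    pandas-xlsx-tables implementation.
    """
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        # Write only the data rows, starting at row 1. The table definition
        # below writes the header row itself.
        df.to_excel(writer, sheet_name=sheet_name, startrow=1, header=False, index=False)
        worksheet = writer.sheets[sheet_name]
        n_rows, n_cols = df.shape
        columns = [{"header": str(name)} for name in df.columns]
        # Rows and columns are zero-indexed; row 0 holds the headers.
        worksheet.add_table(0, 0, n_rows, n_cols - 1, {"columns": columns})
```

Calling `export_as_table(df, "report.xlsx")` should give a sheet where excel already treats the data as a table, so sorting and filtering work without any manual “format as table” step.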
Sometimes people change the structure of python packages you depend on. They themselves as a company also have this problem: you want to evolve your internal libraries to improve them, but you also want to keep using them all the time.
There is something called “codemods” for automated code refactoring. Fewer manual changes to your code in response to changed library code. There are two basic ways of doing this:
Dynamic checking: basically manually. Running tests, for instance, and looking at the results.
Static analysis: parse code and analyse the structure. You don’t run the code, but analyse it as well as possible. Python type hints help a lot here. You can get a warning “use a DateTime instead of a three-item tuple” out of the static analysis if a function got refactored to use a datetime instead of a year/month/day tuple.
Some static analysis examples: mypy for static type checking, pylint (code linting), bandit (security testing), black (enforces coding standards).
These static analysers often work with the “ast”, python’s built-in abstract syntax tree. There’s also a “cst”, the concrete syntax tree, which you can find in “lib2to3” and “libcst”. Libcst has good documentation on what it is.
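A minimal sketch of such a static check, using only the built-in `ast` module. The function name `schedule` and the checker `find_tuple_dates` are made up for the example; a real checker would also use type hints and handle `module.schedule(...)` style calls.

```python
import ast

# Example code to analyse. It is only parsed, never executed.
SOURCE = "\n".join([
    "schedule((2022, 2, 17))",
    "schedule(start_date)",
    "schedule((2022, 2))",
])


def find_tuple_dates(source: str, func_name: str = "schedule") -> list:
    """Return line numbers where func_name is called with a three-item tuple literal."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Only plain names are handled: schedule(...), not module.schedule(...).
        if not (isinstance(node.func, ast.Name) and node.func.id == func_name):
            continue
        for arg in node.args:
            if isinstance(arg, ast.Tuple) and len(arg.elts) == 3:
                hits.append(node.lineno)
    return hits


for lineno in find_tuple_dates(SOURCE):
    print(f"line {lineno}: use a datetime instead of a three-item tuple")
```

Only the first call gets flagged: the second passes a variable (whose type you can't know without type hints) and the third passes a two-item tuple.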
At his company, they ship “codemods” together with the changed libraries. It doesn’t work for all corner cases, but it works for a surprising number of cases, actually. They wrote a command line tool that you could tell to run a certain update.
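To make the idea of a codemod concrete, here is a small sketch of my own (not the tool from the talk). It rewrites calls of a hypothetical function `schedule` so that a year/month/day tuple becomes a `datetime.datetime(...)` call. It uses the built-in `ast` module and `ast.unparse()` (python 3.9+). Note that the ast throws away comments and formatting, which is exactly why real codemods tend to use a concrete syntax tree like libcst.

```python
import ast


class TupleToDatetime(ast.NodeTransformer):
    """Rewrite schedule((y, m, d)) into schedule(datetime.datetime(y, m, d)).

    'schedule' is a made-up function name for this example.
    """

    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # Handle nested calls first.
        if isinstance(node.func, ast.Name) and node.func.id == "schedule":
            node.args = [self._convert(arg) for arg in node.args]
        return node

    @staticmethod
    def _convert(arg: ast.expr) -> ast.expr:
        if isinstance(arg, ast.Tuple) and len(arg.elts) == 3:
            constructor = ast.Attribute(
                value=ast.Name(id="datetime", ctx=ast.Load()),
                attr="datetime",
                ctx=ast.Load(),
            )
            return ast.Call(func=constructor, args=arg.elts, keywords=[])
        return arg


def run_codemod(source: str) -> str:
    """Return the rewritten source. Assumes the code already has 'import datetime'."""
    tree = TupleToDatetime().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(run_codemod("schedule((2022, 2, 17))"))
```

Calls to other functions and calls with non-tuple arguments are left alone, so you can run such a codemod over a whole code base and only review the lines that changed.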
Rob works in civil engineering.
We have some 18,000 km of levees in the Netherlands. And we really need them. And… we need to assess them regularly! Key ingredients for calculating levee safety are soil information, the geometry of the levee and some extra parameters like the expected water level.
Soil info is gathered by taking soil measurements. The standard “GEF” ascii files that are the output are of course easily read with python.
Levee geometry you can extract from height measurements. There’s really good data in the Netherlands and there are loads of python libraries to work with the raster data.
The parameters like river levels can be found in xls files and postgres databases. Again, there are python libraries for it.
Luckily, the standard program that is used for calculating the stability of levees has an api. Again: you can use python.
So… python can help you with a lot of things and help glue everything together.
But… look out for issues like data quality (BS in, BS out). And automatic calculations??? Engineers like to feel in control and don’t always want automation. Also a problem: management at companies that aren’t always very innovation-minded.
Some extra comments:
Don’t forget your tests.
Don’t forget documentation. Sphinx is great.
Python is great for super fast development.
Focus is hard. Python is nice, but there’s rust… unreal…. golang… flutter… Focus! Focus!
I also gave a talk which I’ll try to summarize :-)
At my company (Nelen & Schuurmans) we made a website for the Dutch government (Rijkswaterstaat), some 10 years ago. I helped build it. They used the website when visiting all the municipalities along the major rivers in the Netherlands. Why? Well, the water levels keep increasing.
The website showed, for various scenarios, the height of the water in excess of the current levee/dike height. So a graph of excess height plotted against the length of the river.
Either the levee needs to be strengthened and increased in height (which isn’t always desirable, especially near towns)…
Or the river needs more room. A bigger floodplain by removing obstacles like disused brick factories near the river. Or moving a levee a bit back. Or re-using an old river arm as extra flood channel.
All those measures are pretty major civil engineering works, so you need buy-in from the municipalities and the people living there.
So the website showed the effect of the various measures. You could select them in the website and watch the graph with the excess height lower itself a little or a lot. That way, you could make clear which measures help a lot and which not.
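The core calculation behind such a graph can be sketched in a few lines. This is not the website's actual code: the numbers are made up, and I'm assuming here that the effects of the selected measures simply add up per river location and that the excess is never shown as negative.

```python
# Made-up example values in meters, one value per location along the river.
WATER = [10.5, 10.2, 9.8]     # water level in the chosen scenario
LEVEE = [10.0, 10.0, 10.0]    # current levee height
MEASURE = [0.2, 0.3, 0.1]     # water level reduction of one hypothetical measure


def excess_heights(water_levels, levee_heights, reductions=()):
    """Return the water height above the levee per location (0.0 if below it).

    'reductions' is a sequence of per-location water level reductions, one
    list per selected measure. Assumption: the reductions add up linearly.
    """
    result = []
    for i, (water, levee) in enumerate(zip(water_levels, levee_heights)):
        lowered = water - sum(reduction[i] for reduction in reductions)
        result.append(round(max(0.0, lowered - levee), 2))
    return result


print(excess_heights(WATER, LEVEE))             # no measures selected
print(excess_heights(WATER, LEVEE, [MEASURE]))  # one measure selected
```

Selecting a measure means passing its reductions along: the list that comes back is what would get plotted against the length of the river.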
Lots of measures were taken along the river Meuse (Maas) over the years. And… they were effective. In July 2021 lots of rainfall increased the water level to high levels, but… there were no major problems near the Meuse! I was happy to have contributed a bit.
But… on to the topic of the talk. The website was created some ten years ago with the intention of running it for three or four years. “Can we extend it for a year?”, “can we extend it for another year?”, “can we extend it one last time?”, “can we extend it for one really really last time?”. And last year again :-) There were quite some nods from the audience at this point: it sure happens a lot.
So you have an old django website running on an old python version on an old ubuntu server… How to update it? Often the ubuntu version on the server is older than the one on your laptop. You can try to get everything running with a newer ubuntu + newer django + newer python version, but that will lead to quite some frustration.
What’s better: an incremental approach. You can use docker to good effect for this.
First phase: pick a docker image matching the old ubuntu (or other linux variant) version on the server.
Add all the “apt-get” dependencies you’ve installed for the website.
Get the code running there, trying to pin as much as possible of the python dependencies to what’s actually on the server.
Do one update: update your django revision to the latest for your old version. If you have a 1.11.20, pick the latest 1.11.29.
Then enable deprecation warnings by running python with -Wa or by setting the PYTHONWARNINGS=always environment variable. Normally, you don’t want those warnings, but in this case they give you handy instructions on what to fix before you do the python or django update. The alternative is to “just” upgrade and to have a non-starting site due to import errors and so on: then you have to figure it out yourself. Why not be lazy and use the info that python/django wants to give you?
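A small sketch of what enabling those warnings does. Python hides most deprecation warnings by default; switching the filter to "always" makes them visible. The function `old_helper` is made up and stands in for a deprecated call somewhere in a library.

```python
import warnings


def old_helper():
    """Hypothetical function standing in for a deprecated library call."""
    warnings.warn(
        "old_helper() is deprecated, use new_helper()",
        DeprecationWarning,
        stacklevel=2,
    )
    return 42


def collect_deprecations(func):
    """Call func with all warnings enabled and return the deprecation messages."""
    with warnings.catch_warnings(record=True) as caught:
        # Show every warning, every time it occurs.
        warnings.simplefilter("always")
        func()
    return [
        str(warning.message)
        for warning in caught
        if issubclass(warning.category, DeprecationWarning)
    ]


print(collect_deprecations(old_helper))
```

In a real project you don't need such a helper: running your tests or your development server with the warnings enabled is enough to get the list of things to fix.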
Now you’ve got a good, clean local representation of what’s on the server, ready for further upgrades.
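To check how closely your local setup matches the server, you can compare the output of `pip freeze` from both. A minimal sketch, with made-up sample data and function names of my own. It only looks at plain `name==version` lines and ignores everything else.

```python
def parse_freeze(text: str) -> dict:
    """Parse 'name==version' lines as printed by 'pip freeze'."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # Skip comments, blank lines and editable/VCS installs.
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def pin_differences(server: str, local: str) -> dict:
    """Return {package: (server_version, local_version)} for every mismatch."""
    server_pins = parse_freeze(server)
    local_pins = parse_freeze(local)
    names = sorted(set(server_pins) | set(local_pins))
    return {
        name: (server_pins.get(name), local_pins.get(name))
        for name in names
        if server_pins.get(name) != local_pins.get(name)
    }


# Made-up sample output of 'pip freeze' on the server and in the docker image.
SERVER = "Django==1.11.20\npsycopg2==2.7.7\n"
LOCAL = "Django==1.11.29\npsycopg2==2.7.7\n"

print(pin_differences(SERVER, LOCAL))
```

In this example the only difference is the deliberate django bugfix update, which is what you'd hope to see at the end of the first phase.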
Second phase: upgrade your linux. Ubuntu “xenial” to “bionic”, for instance.
This automatically means a newer python version. Check that the site still runs/builds.
Probably you need to upgrade one or more libraries to make sure it works with the newer python version.
Again: fix the deprecation warnings.
Such an ubuntu/python upgrade normally doesn’t result in many changes. The next phase does!
Third phase: upgrade django. In the talk I said you could move in one go from one django LTS (long term support) to the next LTS, provided you previously fixed all deprecation warnings. But… when looking at my latest upgrade project, I moved from 2.2 => 3.0 => 3.1 => 3.2. So… disregard what I said during the meeting and just do what’s in the django documentation :-)
You probably need to unpin and upgrade all your dependencies now. Dependencies normally don’t support many different django versions, so if your site is a bit older, these upgrades will be necessary.
Fix deprecation warnings again to get your project in a neat state.
Check that everything works, of course. This includes running the tests, also of course.
If you do your upgrade project in these three phases, each individual phase will be quite doable. The first phase often is the hardest if the project is already quite old.
Quick personal note: one day after the meetup, a (Dutch) come-work-at-my-company video was ready. I really like to show it here :-)
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.