Reinout van Rees’ weblog

Write drunk, test automated: documentation quality assurance - Sven Strack

2018-12-03

Tags: writethedocs, plone, python

This is my summary of the write the docs meetup in Amsterdam at the Adyen office, november 2018.

Sven’s experience is mostly in open source projects (mainly Plone, a python CMS). He’s also involved in https://testthedocs.org and https://rakpart.testthedocs.org, a collection of tools for documentation tests. He has some disclaimers beforehand:

  • There is no perfect setup.
  • Automated checks can only help up to a certain level.
  • Getting a CI (continuous integration) setup working is often tricky.

Before you start testing your documentation, you’ll need some insight. Start with getting an overview of your documentation. Who is committing to it? Which parts are there? Which parts of the documentation are updated most often? Are the committers native speakers yes/no? Which part of the documentation has the most bug reports? So: gather statistics.

Also: try to figure out who reads your documentation. Where do they come from? What are the search terms they use to find your documentation in google? You can use these statistics to focus your development effort.

Important: planning. If your documentation is in English, decide beforehand whether you want it to be UK or US English. Define style guides. If you have automatic checks, define standards beforehand: do you want a check to fail on line length yes/no? On spelling errors? Etc. How long is the test run allowed to take?

A tip: start your checks small, build them up step by step. If possible, start from the first commit. And try to be both strict and friendly:

  • Your checks should be strict.
  • Your error messages should be clear and friendly.

In companies it might be different, but in open source projects, you have to make sure developers are your friends. Adjust the documentation to their workflow. Use a Makefile, for instance. And provide good templates (with cookiecutter, for instance) and good examples. And especially for programmers, it is necessary to have short and informative error messages. Don’t make your output too chatty.

Standards for line lengths, paragraphs that are not too long: checks like that help a lot to keep your documentation readable and editable. Also a good idea: checks to weed out common English shortcuts (“we’re” instead of “we are”) that make the text more difficult to read for non-native speakers.
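
A minimal sketch of such a check in python (the contraction list and the line length limit are just examples, not Sven’s actual tooling):

import re
import sys

CONTRACTIONS = re.compile(r"\b(we're|don't|can't|isn't|you're)\b", re.IGNORECASE)
MAX_LENGTH = 79

def check_file(path):
    errors = []
    with open(path) as f:
        for number, line in enumerate(f, start=1):
            if len(line.rstrip()) > MAX_LENGTH:
                errors.append(f"{path}:{number}: line is longer than {MAX_LENGTH} characters")
            for match in CONTRACTIONS.finditer(line):
                errors.append(f"{path}:{number}: '{match.group()}' is hard on non-native speakers")
    return errors

if __name__ == '__main__':
    problems = [error for path in sys.argv[1:] for error in check_file(path)]
    print('\n'.join(problems))
    sys.exit(1 if problems else 0)

Note that the error messages point at the exact file and line: the short, informative output the talk asks for.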

Programmers are used to keeping all of the code’s tests passing all the time. Merging code with failing tests is a big no-no. The same should be true for the automatic documentation checks. They’re just as important.

Some further tips:

  • Protect your branches, for instance, to prevent broken builds from being merged.
  • Validate your test scripts, too.
  • Keep your test scripts simple and adjustable. Make it clear how they work.
  • Run your checks in a specific order, as some checks might depend on the correctness as checked by earlier tests.
  • You can also check the html output instead of “only” checking the source files. You can find broken external links this way, for instance.
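
As an illustration of that last point, a small external-link check over the rendered html could look like this (standard library only; a real setup would probably use an existing link checker):

import sys
import urllib.request
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect all external link targets from an html file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value and value.startswith('http'):
                    self.links.append(value)

def broken_links(html_path):
    collector = LinkCollector()
    with open(html_path) as f:
        collector.feed(f.read())
    for url in collector.links:
        try:
            urllib.request.urlopen(url, timeout=10)
        except Exception:
            yield url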

Something to consider: look at containerization (Docker and the like). The same container can run both on your local OS and on the continuous integration server. Everyone can use the same codebase and the same test setup: less duplication of effort and more consistent results. Most importantly, it is much easier to set up. Elaborate setups are now possible without scaring everybody away!

For screenshots: you could look at puppeteer for generating them automatically.

A comment from the audience: if you have automatic screenshots, there are tools to compare them and warn you if the images changed a lot. This could be useful for detecting unexpected errors in css or html.

Another comment from the audience: you can make a standard like US-or-UK-English more acceptable by presenting it as a choice. It is not that one of them is bad: we “just” had to pick one of them. “Your own preference is not wrong, it is just not the standard we randomly picked”. :-)

Amsterdam Python meetup, november 2018

2018-11-30

Tags: python, pun

My summary of the 28 november python meetup at the Byte office. I myself also gave a talk (about cookiecutter) but I obviously haven’t made a summary of that. I’ll try to summarize that one later :-)

Project Auger - Chris Laffra

One of Chris’ pet projects is auger, automated unittest generation. He wrote it when lying in bed with a broken ankle and thought about what he hated most: writing tests.

Auger? Automated Unittest GEneRator. It works by running a tracer.

The project’s idea is:

  • Write code as always
  • Don’t worry about tests
  • Run the auger tracer to record function parameter values and function results.
  • After recording, you can generate mocks and assertions.

“But this breaks test driven development”!!! Actually, not quite. It can be useful if you have to start working on an existing code base without any tests: you can generate a basic set of tests to start from.

So: it records what you did once and uses that as a starting point for your tests. It makes sure that what once worked keeps on working.

It works with a “context manager”. A context manager normally has __enter__() and __exit__(). But you can add more interesting things. If in the __enter__() you call sys.settrace(self.trace), you can add a def trace(self, frame, event, arg) method, which is then fired for everything that happens within the context manager. You can use it for coverage tracking or logging or visualization of what happens in your code. He used the latter for algorithm visualizations on http://pyalgoviz.appspot.com/
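
A minimal sketch of that trick (the printing is just illustration; auger itself records parameters and return values):

import sys

class Tracer:
    def __enter__(self):
        sys.settrace(self.trace)
        return self

    def __exit__(self, *exc_info):
        sys.settrace(None)  # switch tracing off again

    def trace(self, frame, event, arg):
        if event == 'call':
            print(f"calling {frame.f_code.co_name}()")
        return self.trace  # keep tracing inside the called function

def add(a, b):
    return a + b

with Tracer():
    add(1, 2)  # prints: calling add()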

So… this sys.settrace() magic is used to figure out which functions get called with which parameters.

Functions and classes in the modules you want to check are tested, classes from other modules are partially mocked.

Python LED animation system BiblioPixel - Tom Ritchford

Bibliopixel (https://github.com/ManiacalLabs/BiblioPixel) is his pet project. It is a python3 program that runs on basically everything (raspberry pi, linux, osx, windows). What does it do? It controls large numbers of lights in real-time without programming.

There are lots of output drivers, from led strips and philips hue to an opengl in-browser renderer. There are also lots of different ways to steer it. Here is the documentation.

He actually started with a lot of programs having to do with audio and lights. Starting with a PDP-11 (which only produced beeps). Then an Amiga, a Macintosh (something that actually worked and was used for real), java, javascript, python + C++. And now python.

The long-term goal is to programmatically control lights and other hardware in real time. And… he wants to define the project in text files. The actual light “program” should not be in code. Ideally, bits of projects ought to be reusable. And any input ought to be connectable to any output.

Bibliopixel started with the AllPixel LED controller, which had a successful kickstarter campaign (he got involved two years later).

An “animation” talks to a “layout” and the layout talks to one or more drivers (one could be a debug visualization on your laptop and the other the real physical installation). Animations can be nested.

Above it all is the “Project”: a YAML (or json) file that defines the project and configures everything.

Bibliopixel is quite forgiving about inputs. It accepts all sorts of colors (red, #ff0000, etc). Capitalization, missing spaces, extraneous spaces: all fine. The same goes for “controls”: a control receives a “raw” message and then tries to convert it into something it understands.

Bibliopixel is very reliable. Lots of automated tests. Hardware test boards to test the code with the eight most common types of hardware. Solid error handling and readable error messages help a lot.

There are some weak points. The biggest is lack of developers. Another problem is that it only supports three colors (RGB), so you can’t handle RGBW (RGB plus white) and other such newer combinations. He hopes to move the code over completely to numpy, which would help a lot. Numpy is already supported, but the legacy implementation currently still needs to keep working.

He showed some nice demos at the end.

Utrecht (NL) python meetup september 2018

2018-09-26

Tags: python

Data processing using parser combinators - Werner de Groot

He collaborated with data scientists from Wageningen University. The scientists did lots of cool programming stuff. But they did not use version control, so they introduced git :-) They also had lots and lots of data, so they introduced Apache Spark.

Their data sets were in ascii files, which are huge. So the ascii files need to be parsed. He showed an example of a file with DNA data. Ouch, it turns out to be pretty complex because there are quite some exceptions. Fields (in this example) are separated by semicolons. But some of the values also contain semicolons (quoted, though). So the generic python code they used to parse their DNA data was littered with “if” statements. Unmaintainable.

You probably heard that hard problems need to be split up into smaller problems. Yes, that’s true. But the smaller parts also need to be combined again. So: smaller parsers + a way to combine them again.

A parser takes an input and returns the part that matched and the part that remained.

He showed parsy as a library for “parser combinators”. Many languages have such libraries. He demonstrated how to combine those input/match/remainder parsers into a kind of pipeline/sequence. Such a sequence of parsers can be treated as a parser in its own right. This makes designing and nesting them easy.
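
A minimal sketch with parsy (the grammar here is made up for illustration, not his DNA example):

import parsy

# Small parsers for the individual pieces:
word = parsy.regex(r'[a-zA-Z_]+')
number = parsy.regex(r'\d+').map(int)
equals = parsy.string('=')

# Combining them yields a parser again, so sequences nest just fine:
assignment = parsy.seq(word, equals >> number)
assignment.parse('answer=42')  # => ['answer', 42]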

When combining parsers, you of course need to handle variants: an “or” operator handles that.
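
In parsy, that “or” is spelled with the | operator; a hedged example:

yes_or_no = parsy.string('yes').result(True) | parsy.string('no').result(False)
yes_or_no.parse('no')  # => False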

Someone asked about “yacc” parsers, which are able to robustly handle each and every corner case: “how does it compare to the simpler ‘parsy’?” The answer: parsy is a simple library; there are more elaborate python libraries. But parsy is already quite good. A json parser written in parsy takes only 50 lines!

He did a live demo, developing a set of combined parsers step by step. Fun! And very understandable. So “parsy” sounds like a nice library for this kind of work.

There were some comparison questions about regular expressions. Werner’s answer was that parsy’s kind of parsers are much more readable and debuggable. He was surprised at the number of attendees that like regular expressions :-)

The nice thing: every individual part of your parser (“just some numbers”, “an equals sign”) is a piece of python, so you can give it a name. This way, you can give those pieces of parser names from your domain (like dnaName, type, customer).

In the end, he live-coded the whole DNA ascii file parser. Quite boring. And that was his whole point: what would be hard or impossible to do in plain python becomes “just boring” with parsy. Exactly what we want!

A practical application of Python metaclasses - Jan-Hein Bührman

(See an earlier summary about metaclasses being used in django)

Apart from metaclasses, he showed some utilities that he likes: pipenv, pylint, mypy.

A nice touch to his presentation: he had his example code all in separate branches. Instead of live coding, he just switched branches all the time. Because he gave his branches clear names, it worked quite well!

The example he built up is impossible to summarize here. It included a register function that he had to call on certain classes. He didn’t like that. That’s where metaclasses come in.

Python objects are instances of classes. Classes themselves are instances of type. You can create classes programmatically by doing something like:

>>> B = type('B', (), {})
>>> b = B()
>>> type(b)
<class 'B'>

Normally, when python imports a module (= reads a python file), class statements are executed and the class is created. You can influence that process with a metaclass that overrides the __new__ method.

  • __init__() influences the creation of objects from the class (= instantiating the object from the class).
  • __new__() influences the creation of the class (= instantiating the class from type).

He used it to automatically register the classes that are created with the metaclass.
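
A minimal sketch of that mechanism (the names are mine, not his):

registry = []

class RegisteringMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        if bases:  # skip the (abstract) base class itself
            registry.append(cls)
        return cls

class Plugin(metaclass=RegisteringMeta):
    pass

class MyPlugin(Plugin):
    pass

registry  # => [<class 'MyPlugin'>]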

Note: in python 3.6, __init_subclass__() was added, which makes this much easier.
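
For comparison, the same registration with __init_subclass__():

class Plugin:
    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry.append(cls)

class MyPlugin(Plugin):
    pass

Plugin.registry  # => [<class 'MyPlugin'>]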

Plain text pypi description formatting: possible cause

2018-07-04

Tags: python

Sometimes projects have a plaintext description on pypi.org. You see the restructuredtext formatting, but it isn’t applied. See my own z3c.dependencychecker 2.4.2 for example.

I use an editor with restructuredtext syntax highlighting. I double-checked everything. I used docutils. I used pypi’s own readme_renderer to check it. Not a single error.

But after several tries, the formatting was still off. Then a fellow contributor adjusted one of the setup() keywords in our setup.py: something might have to be a string (as in another package, with a perfectly-rendering description) instead of a list (as we had). Sadly, no luck.

But it made me investigate the other keyword arguments. Next to long_description, I noticed the description keyword. This is normally a one-line description. In our case, it was a multi-line string:

long_description = u'\n\n'.join([ ... some files .... ])

description = """
Checks which imports are done and compares them to what's
in setup.py and warns when discovering missing or unneeded
dependencies.
"""

setup(
    name='z3c.dependencychecker',
    version=version,
    description=description,
    long_description=long_description,
    # ... more keyword arguments ...
)

I changed it to a single line and yes, the next release worked fine on pypi!
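
The fix boiled down to collapsing it into one line:

description = ("Checks which imports are done and compares them to what's "
               "in setup.py and warns when discovering missing or unneeded "
               "dependencies.")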

(Note that I also toyed a bit with encodings in that pull request, but I don’t think that had any influence).

So: take a look at the description field if your project also has rendering problems (assuming that your long_description is fine).

Ansible provision/deploy setup

2018-05-31

Tags: python, django

I never got around to writing down the ansible setup I figured out (together with others, of course, at my previous job) for deploying/provisioning django websites.

The whole setup (as a cookiecutter template) can be found in https://github.com/reinout/cookiecutter-djangosite-template . The relevant code is in the ansible/ directory. Note: this is a “cookiecutter template” from which you can generate a project so you’ll see some {{ }} and {% %}: when you create the actual project these items will be filled in.

My goal was to keep the setup simple and safe. With “safe”, I mean that you normally cannot accidentally do something wrong.

And it was intended for getting one website project onto the server(s). It is not a huge ansible setup to set up the whole infra in one fell swoop.

First the inventory files:

  • Yes, multiple: there are two of them. production_inventory and staging_inventory. The safe aspect is that we used to have a single inventory file with [production] and [staging] headings in it. If you wanted to update staging, you had to add --limit staging on the command line. If you forgot that…

    With two separate files, you never have this problem. Accidents are much less likely to occur.

  • The simple aspect is that the variables are right there in the inventory. There are only a few variables after all: server name, hostname, checkout (master or a tag).

    A “problem” with ansible is that you can place variables in lots of places (playbook, inventory, host variable file, group variable file and another two or three I don’t remember right away). So where to look? I figured that the inventory was the right place for this kind of simple setup:

    [web]
    p-web-1234.internal.network
    
    [web:vars]
    site_name=myproject.example.org
    checkout_name="1.2"
    

Second, the ansible playbooks:

  • There are two. A provision.yml and a deploy.yml. Provisioning is for one-time setup (well, you probably want to change stuff sometimes, but you’ll rarely run this one). Deploy is for the regular deploys of the actual code. The stuff you regularly do.

  • Provision should be run as a user that can do sudo su on the target machine. Provisioning installs packages and adds the nginx config file in /etc/. And it creates a user for running/deploying the django site (in my previous job, this user was historically called buildout).

  • The deploy playbook connects as the abovementioned deploy user, does a “git pull” (or whatever deploy mechanism you prefer), runs the database migrations and so on.

  • Now, how can the person who deploys connect as that deploy user? Well, the provision playbook creates the deploy user (“buildout”), disables the password and adds the public ssh keys of the deployers to the /home/buildout/.ssh/authorized_keys file:

    - name: Add user "buildout" and set an unusable password.
      user: name=buildout password='*' state=present shell="/bin/bash"

    - name: Add maintainers' ssh keys so they can log in as user buildout.
      authorized_key: user=buildout key=https://github.com/{{ item }}.keys
      with_items:
        - reinout
        - another_colleague
    

    It is simple because you only need a very limited number of people on any server with sudo rights. Or a very limited number of people with the password of a generic admin account. Re-provisioning is basically only needed if something changed in the nginx config file. In practice: hardly ever.

    It is simple because you don’t need to give the deployers each a separate user account (or access to a password): their public ssh key is enough.

    It is safe because it limits the (root-level) mistakes you can make during regular deploys. And the small number of users you need on your system is also an advantage.

  • The ansible playbooks are short. Just a couple of tasks. Even though I’m normally all in favour of generic libraries and so: for a 65 line playbook I personally don’t need the mental overhead of one or two generic ansible roles.

    The playbooks do basically the same thing I did years earlier with “Fabric” scripts. So: the basic structure has quite proven itself (for me).

There are all sorts of approaches you can take. Automatic deploys from your Jenkins or Gitlab, for instance. That’d be something I’d like to try some time. For not-automatically-deployed projects, a simple setup such as I showed here has much to recommend itself.

Djangocon: friday lightning talks

2018-05-25

Tags: djangocon, django

(One of my summaries of a talk at the 2018 European djangocon.)

The stenographers - Sheryll and Andrew

The stenographers are the ones that provide LIVE speech to text subtitles for the talks. Wow.

They use “shorthand machines” for steno. It is like a piano where you press multiple keys to form words. On wednesday the speakers talked 46000 words…

How fast do people talk? Anything between 180 and 250 words per minute, though 300 also occurs. The handiest are speakers that speak at a regular tempo. And not too fast.

They ask presenters for notes and texts beforehand. That helps the software pick the right words.

Pytest-picked - Ana Paula Gomes

Say you have a codebase with tests that take a long time. You’ve just changed a few files and don’t want to run the full test suite before the commit. You could run git status and figure out the files to test yourself.

But you can do it automatically with pytest-picked.

It is a small plugin, but she wants to improve it with better guessing and also with support for testing what changed on a branch.
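
A hedged usage note: as far as I know the plugin adds a single flag, so pytest --picked runs only the tests in the files that git reports as changed.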

How to build a treehouse - Harry Biddle

A talk about building actual treehouses!

One of the craziest is “Horace’s cathedral”, a huge one that was closed by the fire brigade for fire risks…

If you’re going to do it, be nice to your tree. Look at “garnier bolt” for securing your tree house.

Give the tree room to grow.

Recommended book: “the man who climbs trees”.

What about DRF’s renderers and parsers - Martin Angelov

They started with just plain django. Then they added DRF (django rest framework). Plus react.

Then it became a single page app with django, drf, react, redux, sagas, routers, etc. (Well, they’re not quite there yet.)

Python and javascript are different. Python uses snake_case, javascript uses camelCase. What comes out of django rest framework therefore is often snake_case, which looks weird in javascript.

They tried to fix it.

  • JS utils. On-the-fly translation. Hard when debugging.
  • React middleware.
  • Then he used a custom CamelCase renderer in django rest framework :-)
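
A hedged sketch of such a renderer (my own illustration of the idea; there is also a ready-made djangorestframework-camel-case package):

import re

from rest_framework.renderers import JSONRenderer

def camelize(key):
    return re.sub(r'_([a-z])', lambda match: match.group(1).upper(), key)

def camelize_data(data):
    # Recursively rewrite dictionary keys from snake_case to camelCase.
    if isinstance(data, dict):
        return {camelize(key): camelize_data(value) for key, value in data.items()}
    if isinstance(data, list):
        return [camelize_data(item) for item in data]
    return data

class CamelCaseJSONRenderer(JSONRenderer):
    def render(self, data, accepted_media_type=None, renderer_context=None):
        return super().render(camelize_data(data), accepted_media_type, renderer_context)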

Git Menorahs considered harmful - Shai Berger

A bit of Jewish tradition: a menorah is a 7-armed candle holder (a temple menorah) or a 9-armed one (a hanukkah menorah).

What do we use version control for?

  • To coordinate collaborative work.
  • To keep a record of history.

But why do we need the history? To fix mistakes. Figure out when something was broken. Reconsider decisions. Sometimes you need to revert changes.

What do we NOT need in the history? The actual work process. Commits for typo fixes, for instance.

A git menorah is a git repo with many branches and changes. Bad. Use git rebase to reshape the history of your work before pushing. Follow the example of the django git repository.

5 minutes for your mental health - Ducky

What is really important for your mental health?

  • Sleep!
  • Exercise.

As weird as it sounds, you can ‘feel’ in your body how you’re feeling. There are some exercises you can do. Close your eyes, put your mind in your fingertips. Slowly move inside your body. Etc. “Body scan”.

Meta programming system, never have syntax errors again - Ilja Bauer

MPS is a system for creating new custom languages. It has a “projectional editor”.

A normal editor shows the code, which is converted to an AST (abstract syntax tree) and it gets byte-compiled.

A projectional editor has the AST as the basis and “projects” it as .py/java/whatever files purely as output. The advantage is that renaming stuff is easier.

Save ponies: reduce the resource consumption of your django server - Raphaël Barrois

He works on a big django project. 300 models. Startup time is 12 seconds… Memory consumption is 900MB. The first query after startup takes 4 seconds.

How does it work? They use uwsgi. There is a simple optimization. By default, uwsgi has lazy = true. This loads apps in the workers instead of in the master. He disabled it. Startup became much slower, but the first query could then be handled much quicker.

Another improvement: they generate a fake (test) request and fire it at the server upon startup. So before the workers start handling real traffic, everything is already primed and active. This helps a lot.
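
A hedged sketch of that last trick in wsgi.py (the url and the exact place to call it are assumptions, not their actual setup):

from django.core.wsgi import get_wsgi_application

application = get_wsgi_application()

def warm_up():
    # A fake request pulls in all views, urls, caches and database
    # connections before real traffic arrives.
    from django.test import Client
    Client().get('/')  # '/' is an assumption: pick a representative url

warm_up()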

Djangocon: don’t look back in anger, the failure of 1858 - Lilly Ryan

2018-05-25

Tags: djangocon, django

(One of my summaries of a talk at the 2018 European djangocon.)

Full title of the talk: “don’t look back in anger: Wildman Whitehouse and the great failure of 1858”. Lilly is either hacking on something or working on something history-related.

“Life can only be understood backwards; but it must be lived forwards” (Søren Kierkegaard). So if you make mistakes, you must learn from them. And if something goes right, you can party.

The telegraph! Old technology. Often used with morse code. Much faster than via the post. Britain and the USA had nation-wide telegraph systems by 1858. So when the first trans-atlantic cable was completed, it was a good reason for a huge celebration. Cannons, fireworks. They celebrated for three weeks. Until the cable went completely dead…

Normally you have a “soft launch”: you try out for a while whether everything really works. But they didn’t do this in this case. They could only imagine success…

Failures aren’t necessarily bad. You can learn a lot from them. But at that time there were no agile retrospectives. An agile retrospective at that time was probably looking over your shoulder while you sprinted along the highway, chased by highway robbers…

Now on to Wildman Whitehouse. It looked like he had done a lot of inventions, but most of them were slight alterations of other people’s work. So basically “fork something on github, change a single line and hey, you have a new repo”.

But it all impressed Cyrus Field, a businessman who wanted to build a transatlantic cable, undeterred by the fact that electricity was quite new and that it seemed technically impossible to build the thing. He hired Whitehouse to build it.

Another person, a bit more junior than Whitehouse, was also hired: William Thomson. He disagreed with Whitehouse’s design (and electrical theory) on most points. But there was no time for any discussion, so the project was started to Whitehouse’s design.

The initial attempts all failed: the cable broke. On the fourth try, it finally worked and the party started.

Now Thomson and Whitehouse started fighting over how to operate the line. Whitehouse was convinced that you had to use very high voltages. He actually did this after three weeks and fried the cable.

Time for reflection! But not for Whitehouse: everyone except him was to blame. The public, though, was terribly angry. There was an official “retrospective” with two continents looking on. Thomson presented the misgivings he had shared beforehand. Engineers told the investigators that Whitehouse had personally (illegally) raised the voltage. So it soon was clear that there was one person to blame… Whitehouse.

Whitehouse was indignant and published a pamphlet… Nobody believed him and he stormed off to his room.

Some lessons for us to learn:

  • Open-mindedness is crucial. You must allow your ideas to be questioned.

  • Treat feedback sensitively. People get defensive if they are attacked in public.

    Before you do a retrospective, make sure it is clear that everyone did their best according to the knowledge in the room.

  • Remember the prime directive.

  • There is no room for heroes in the team. It makes for a terrible team environment.

How did it end? They let William Thomson take over the management. The design and principles were changed. Also good: Thomson was nice to work with. Still, some things went wrong, but they were experienced enough to be able to repair them. After a test period, the cable went into operation.

https://farm1.staticflickr.com/973/42307963511_3a6770981a_z_d.jpg

Photo explanation: station signs on the way from Utrecht (NL) to Heidelberg (DE).

Djangocon: banking with django, how not to lose your customers’ money - Anssi Kääriänen

2018-05-25

Tags: djangocon, django

(One of my summaries of a talk at the 2018 European djangocon.)

He works for a company (holvi.com) that offers business banking services to microentrepreneurs in a couple of countries. Payment accounts (online payments, prepaid business mastercard, etc). Business tools (invoices, online shop, bookkeeping, etc).

Technically it is nothing really special. Django, django rest framework, postgres, celery+redis, angular, native mobile apps. It runs on Amazon.

Django was a good choice. The ecosystem is big: for anything that you want to do, there seems to be a library. Important for them: django is very reliable.

Now: how not to lose your customers’ money.

  • Option 1: reliable payments.
  • Option 2: take the losses (and thus reimburse your customer).

If you’re just starting up, you might have a reliability of 99.9%. With, say, 100 payments per day and 2 messages per payment, that’s 1 error case per day. You can handle that by hand just fine.

If you grow to 10.000 messages/day and 99.99% reliability, you have 5 cases per day. You now need one or two persons just for handling the error cases. That’s not effective.

Their system is mostly built around messaging. How do you make messages reliable?

  • The originating system records the message in the local database inside a single transaction.

    In messaging, it is terribly hard to debug problems if you’re not sure whether a message was sent or what its contents were. Storing a copy locally helps a lot.

  • On commit, the message is sent.

  • If the initial send fails: retry.

  • The receiving side has to deduplicate, as it might receive messages twice.

You can also use an inbox/outbox model.

  • Abstract messages to Inbox and Outbox django models.
  • The outbox on the origin system stores the messages.
  • Send on_commit and with retry.
  • Receiving side stores the messages in the Inbox.
  • There’s a unique constraint that makes sure only unique messages are in the Inbox.
  • There’s a “reconciliation” task that regularly compares the Outbox and the Inbox, to see if they have the same contents.

For transport between inbox and outbox, they use kafka, which can send to multiple inboxes.
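
A minimal sketch of the outbox half of this idea (model fields and the send function are illustrative, not Holvi’s actual code):

import uuid

from django.db import models, transaction

class Outbox(models.Model):
    message_id = models.UUIDField(default=uuid.uuid4, unique=True)
    payload = models.JSONField()  # django 3.1+; a postgres JSONField back then
    sent = models.BooleanField(default=False)

class Inbox(models.Model):
    # The unique constraint is what deduplicates double deliveries:
    message_id = models.UUIDField(unique=True)
    payload = models.JSONField()

def record_and_send(payload):
    with transaction.atomic():
        message = Outbox.objects.create(payload=payload)
        # Only send once the local copy is safely committed:
        transaction.on_commit(lambda: send_with_retry(message))

def send_with_retry(message):
    ...  # hand the message to kafka (or similar), retrying on failure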

There are other reliability considerations:

  • Use testing and reviews.
  • If there’s a failure: react quickly. This is very important from the customer’s point of view.
  • Fix the original reason, the core reason. Ask and ask and ask. If you clicked on a wrong button, ask why you clicked on the wrong button. Is the UI illogical, for instance?
  • Constantly monitor and constantly run the reconciliation. This way, you get instant feedback if something is broken.

https://farm1.staticflickr.com/881/42260938022_ce85326961_z_d.jpg

Photo explanation: station signs on the way from Utrecht (NL) to Heidelberg (DE).

Djangocon: an intro to docker for djangonauts - Lacey Williams Henschel

2018-05-25

Tags: djangocon, django, python

(One of my summaries of a talk at the 2018 European djangocon.)

https://imgs.xkcd.com/comics/containers.png

Docker:

  • Nice: it separates dependencies.
  • It shares your OS (so less weight than a VM).
  • It puts all members on the same page. Everything is defined down to the last detail.

But: there is a pretty steep learning curve.

Docker is like the polyjuice potion from Harry Potter. You mix the potion, add a hair of the other person, and you suddenly look exactly like that other person.

  • The (docker) image is the person you want to turn into.
  • The (docker) container, that is you.
  • The Dockerfile, that is the hair. The DNA that tells exactly what you want it to look like. (She showed how the format looked).
  • docker build actually brews the potion. It builds the image according to the instructions in the Dockerfile.

Ok. Which images do I have? Image Revelio!: docker images. Same with continens revelio: docker container ls.

From that command, you can grab the ID of your running container. If you want to poke around in that running container, you can do docker exec -it THE_ID bash.

Stop it? Stupefy! docker stop THE_ID. But that’s just a pause. If it is Avada kedavra! you want: docker kill THE_ID.

Very handy: docker-compose. It comes with docker on the mac, for other systems it is an extra download. You have one config file with which you can start multiple containers. It is like Hermione’s magic bag. One small file and you can have it start lots of things.

It is especially handy when you want to talk to, for instance, a postgres database. With two lines in your docker-compose.yml, you have a running postgres in your project. Or an extra celery task server.

Starting up the whole project is easier than with just plain docker: docker-compose up! Running a command in one of the containers is also handier.

The examples are at https://github.com/williln/docker-hogwarts

https://farm1.staticflickr.com/882/41406687805_7a334e427d_z_d.jpg

Photo explanation: station signs on the way from Utrecht (NL) to Heidelberg (DE).

Djangocon keynote: the naïve programmer - Daniele Procida

2018-05-25

Tags: djangocon, django, python

(One of my summaries of a talk at the 2018 European djangocon.)

The naïve programmer is not “the bad programmer” or so. He is just not so sophisticated. Naïve programmers are everywhere. Almost all programmers wish they could be better.

Programming is a craft/art/skill. That’s our starting point. Art can be measured against human valuation. In the practical arts/crafts, you can be measured against the world (if your bridge collapses, for instance).

Is your craft something you do all the time, like landing a plane? Or are you, as programmer, more in the creative arts: you face the blank canvas all the time (an empty models.py).

In this talk, we won’t rate along the single axis “worse - better”. There are more axes: “technique - inept”, “creative - dull”, “judgment - uncritical” and “sophistication - naïve”. It is the last one that we deal with here.

What does it mean to be a sophisticated programmer? To be a real master of your craft? Such programmers are versatile and powerful. They draw connections. They work with concepts and ideas (sometimes coming from other fields) to think about and explain the problems they have to solve.

The naïve programmer will write, perhaps, badly structured programs.

But… the programs exist. They do get built. What should we make of this?

He showed an example of a small-town USA photographer (Mike Disfarmer). He worked on his own with old tools. He had no contact with other photographers. Just someone making photo portraits. Years after his death his photos were discovered: beautifully composed, beautifully lit (through a single skylight…).

Software development is a profession. So we pay attention to tools and practices. Rightfully so. But not everyone is a professional developer.

Not everyone has to be a professional programmer. It is OK if someone learns django for the first time and builds something useful. Even if there are no unit tests. Likewise a researcher that writes a horrid little program that automates something for him. Are we allowed to judge that?

He talked a bit about musicians. Most of them are sophisticated and very good musicians. But some of them also use naïvety deliberately. Swapping instruments, for instance. Then you make more mistakes and you play more simply. Perhaps you discover new things that way. Perhaps you finally manage to get out of a rut you’re in.

Some closing thoughts:

  • Would you rather be a naïve programmer with a vision or a sophisticated programmer without?
  • If you want to be a professional developer, you should try to become more sophisticated. That is part of the craft.
  • If you’re naïve and you produce working code: there’s nothing wrong with being proud of it.
  • As a sophisticated programmer: look at what the naïve programmer produces. Is there anything good in it? (He earlier showed bad work of a naïve French painter; his work was loved by Picasso.)

Suggestion: watch the keynote on youtube; this is the kind of talk you should see instead of read :-)

https://farm1.staticflickr.com/951/41586261404_3230589073_z_d.jpg

Photo explanation: station signs on the way from Utrecht (NL) to Heidelberg (DE).

 