Djangocon: automated spell checking in django projects - Jakob Schnell

Tags: djangocon, django, python

(One of my summaries of a talk at the 2018 European djangocon.)

We are humans and we make typos. So there are typos in our code.

The two common places for typos are documentation and the user interface.

Documentation is normally provided in a single language and consists of large text files, so spell checking is relatively easy. Django documentation is often built with Sphinx. For that, there is a Sphinx extension: sphinxcontrib-spelling. You can even integrate it in your CI as a post-build check.

There will be words that are correct for your project but that aren’t in the regular dictionary: for that there’s a local “wordlist” you can use.
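A minimal sketch of the Sphinx side, assuming the extension is installed from PyPI as sphinxcontrib-spelling. The language and the wordlist filename are example values, not something shown in the talk.

```python
# docs/conf.py (excerpt)

# Enable the spelling builder provided by sphinxcontrib-spelling.
extensions = ["sphinxcontrib.spelling"]

# Dictionary language to check against (example value).
spelling_lang = "en_US"

# Project-specific words, one per line, relative to the docs source directory.
spelling_word_list_filename = "spelling_wordlist.txt"
```

The check then runs as its own builder, for instance `sphinx-build -b spelling docs docs/_build/spelling`, which a CI job can call after the normal documentation build.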

For code (and our GUI), it gets more complicated. You would have to read Python code, CSS code, HTML code, JavaScript code… hard.

But there is a solution! Translations. Once your project gets bigger, you probably want to start translating it. Once you have set up your translation mechanism (gettext), you have all your strings gathered into one place: the .po files. Hurray, now we can do spell checking.

Gettext is the standard mechanism in Django to deal with translations. You’ll see lines like from ... import ugettext as _ in your code.

Jakob wrote a tool for it: potypo. polib + pyenchant = potypo. polib can read and write the “gettext” *.po files. pyenchant is an interface to libenchant.
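The combination can be sketched roughly like this. This is not potypo's actual code, just an illustration of how polib and pyenchant fit together. The function names `misspelled_words` and `check_po_file` are hypothetical, and the word splitting is deliberately naive (it would, for instance, also check the names inside format placeholders).

```python
import re

# One or more letters (any script), optionally followed by an apostrophe part.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def misspelled_words(text, is_known):
    """Return the words in text that is_known() rejects, in order."""
    return [word for word in WORD_RE.findall(text) if not is_known(word)]


def check_po_file(path, language):
    """Yield (msgid, bad_words) for entries whose translation has unknown words."""
    # Imported here so the helper above stays usable without the libraries.
    import enchant
    import polib

    dictionary = enchant.Dict(language)  # e.g. "de_DE"; needs that dictionary installed
    for entry in polib.pofile(path):
        bad = misspelled_words(entry.msgstr, dictionary.check)
        if bad:
            yield entry.msgid, bad
```

With both libraries and a German dictionary installed, something like `for msgid, bad in check_po_file("locale/de/LC_MESSAGES/django.po", "de_DE"): print(msgid, bad)` would list the suspicious words per message. The path is only an example.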

You’re probably familiar with ispell or aspell (an ispell successor that handles Unicode better). myspell is OpenOffice’s spellchecker, hunspell is a variant on this. For English, “aspell” is probably best, for other languages “hunspell”. Libenchant is a library that wraps them all. And pyenchant provides a Python API to libenchant.

When you start using it, you’ll have to install language packages like myspell-de-de and aspell-en. Then add a bit of configuration and potypo can start checking your spelling. If desired, it can fail your build in CI. You can also switch that off for specific languages (for instance if you’ve just started translating).
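The “fail the build, except for some languages” idea boils down to an exit code. A minimal sketch, with the hypothetical names `exit_code` and `no_fail` (these are not potypo's configuration options):

```python
def exit_code(typos_by_language, no_fail=()):
    """Return 1 if any language outside no_fail has typos, else 0."""
    failing = [
        lang
        for lang, typos in typos_by_language.items()
        if typos and lang not in no_fail
    ]
    return 1 if failing else 0


results = {"en": [], "de": ["Anmeldng"]}
print(exit_code(results))                  # German typo fails the build
print(exit_code(results, no_fail={"de"}))  # German is only reported
```

A CI script would pass the result to `sys.exit()`, since most CI systems treat a non-zero exit code as a failed step.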

Wordlists? You have multiple languages, so wordlists can be in a directory. wordlists/en.txt, wordlists/de.txt. You can also put just a wordlist.txt inside the translations’ locale/de/ directory.
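The two wordlist locations could be looked up along these lines. The function names and the search order (central directory first, then the locale directory) are assumptions made for the sketch, not documented potypo behaviour.

```python
from pathlib import Path


def wordlist_path(language, wordlist_dir="wordlists", locale_dir="locale"):
    """Return the first existing wordlist file for language, or None."""
    candidates = [
        Path(wordlist_dir) / f"{language}.txt",        # wordlists/de.txt
        Path(locale_dir) / language / "wordlist.txt",  # locale/de/wordlist.txt
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def load_wordlist(path):
    """Read one word per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}
```

The resulting set can be consulted before asking the dictionary. pyenchant also offers `enchant.DictWithPWL`, which takes a personal wordlist file directly.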

The “potypo” project is quite new, but it is already used in several projects. Ideas, features, pull requests: everything is welcome!

Photo explanation: constructing a viaduct module (which spans a 2m staircase) for my model railway in my attic.

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
