Reinout van Rees’ weblog

Fixing SSL certificate chains


Tags: python, django

This blog post applies when the following two cases are true:

  • Your browser does not complain about your https site. Everything seems fine.
  • Some other tool does complain: it cannot find your certificate or the intermediate certificates. What is the problem?

So: your browser doesn’t complain. Let’s see a screenshot:

Browser address bar with a nice green closed lock, so ssl is fine

Examples of the errors you can see

Some examples of complaining tools. First curl:

$ curl https://api.letsgxxxxxxx
curl: (60) SSL certificate problem: Invalid certificate chain
More details here:

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

curl has the right error message: Invalid certificate chain.

Let us look at wget:

$ wget https://api.letsgxxxxxx
--2015-11-23 10:54:28--  https://api.letsgxxxxx
Resolving api.letsgxxxxxx...
Connecting to api.letsgxxxxxx||:443... connected.
ERROR: cannot verify api.letsgxxxxxx's certificate, issued by 'CN=COMODO RSA
  Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB':
  Self-signed certificate encountered.
To connect to api.letsgxxxxxx insecurely, use `--no-check-certificate'.

wget is right that it cannot verify the certificate. But its conclusion Self-signed certificate encountered is less helpful. The certificate is not self-signed; wget just has to treat it that way because the certificate chain is incorrect.

If you talk to such an https URL with java, you can see an error like this:
PKIX path building failed:
unable to find valid certification path to requested target

This looks quite cryptic, but the cause is the same. SunCertPathBuilderException: CertPath sure sounds like a path to a certificate that it cannot find.

A final example is with the python requests library:

>>> import requests
>>> requests.get('https://api.letsgxxxxxx')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/requests/", line 69, in get
    return request('get', url, params=params, **kwargs)
  File ".../requests/", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File ".../requests/", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File ".../requests/", line 573, in send
    r = adapter.send(request, **kwargs)
  File ".../requests/", line 431, in send
    raise SSLError(e, request=request)
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

How to determine what’s wrong

So... you discover the problem yourself. Or a customer calls to say they're getting an error like this, even though everything seems fine when you test the https site in the browser.

Solution: run your site through an online SSL certificate checker.

If that site says everything is completely right, then you’re done. If it still complains about something, you’ve got work to do.

Most of the checkmarks are probably green:

Green checkmarks in front of many common SSL checks

In cases like this, the problem is in the certificate chain at the bottom of the page. Here’s an example of one of our own sites from a few months ago:

Broken chain icon indicating the exact problem spot

Note the “broken chain” icon halfway. Just follow the chain from top to bottom. Everything has to be perfect. We start with the * which is issued by GeoTrust SSL CA - G2.

The certificate GeoTrust SSL CA - G2 in turn is issued by GeoTrust Global CA.

The problem: the next certificate in the chain is not about GeoTrust Global CA, but about GeoTrust SSL CA, which is different. Here the chain breaks. It does not matter that the fourth certificate is about the GeoTrust Global CA we were looking for. The chain is broken. The order in which the certificates are placed must be perfect.

After fixing the order of the certificates in our certificate file, the problem was fixed:

Chain icons indicating that the chain is unbroken
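The fix itself was simply putting the certificates into the bundle file in the right order. A minimal sketch of that step (the filenames are hypothetical, use whatever your issuer sent you):

```python
def write_fullchain(cert_paths, target_path):
    # Concatenate certificate files into one bundle, in chain order:
    # your own certificate first, then each intermediate certificate,
    # each one the issuer of the one before it.
    with open(target_path, "w") as bundle:
        for path in cert_paths:
            with open(path) as cert_file:
                bundle.write(cert_file.read().strip() + "\n")

# Hypothetical usage; the filenames depend on what your issuer sent you:
# write_fullchain(["www.example.com.crt", "intermediate.crt"], "fullchain.crt")
```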

Why is a chain needed?

There are lots of certificates in the wild. All the browsers (and java, and your OS and...) often only store a handful (well, 20+) “root certificates”. All the other certificates have to trace their origin back to one of those root certificates.

That is where the intermediate certificates come in: they’re a cryptographically signed way to trace the validity of your certificate back to one of the known-good root certificates.

How to fix it

  • If you’re handling certificates yourself, you ought to know which files to edit. The main problem will be getting the right intermediate certificates from the issuing party. Often you only get “your” certificate, not the intermediate ones. Ask about it or google for it.

  • Often you won’t maintain those certificates yourself. So you have to get your hosting service to fix it.

    If you let someone else take care of the certificate, point them at the SSL checker and tell them to make sure that page is completely happy.

    In my experience (=three times in the last two years!) they’ll mail back with “everything works now”. But it still won’t work. Then you’ll have to mail them again and tell them to really check and probably provide screenshots.

Good luck!

Nginx proxying to nginx: getting gzip compression to work


Tags: python, django

At work we use gunicorn as our wsgi runner. Like many wsgi runners, gunicorn advises you to run the nginx webserver in front of it. So on every server we have one or more websites with gunicorn, and an nginx in front.

Nginx takes care, of course, of serving the static files like css and javascript. Some gzipping of the results is a very, very good idea:

server {
    listen 80;
    gzip on;
    gzip_proxied any;
    gzip_types text/css application/json text/javascript
               application/javascript application/x-javascript;


Two notes:

  • The default is to only gzip html output. We also want javascript and json. So you need to configure gzip_types.

    (I copy-pasted this from one of my config files, apparently I needed three different javascript mimetypes... Perhaps some further research could strip that number down.)

  • gzip_proxied any tells nginx that gzipping is fine even for proxied requests.

Proxied requests? Yes, because we have a lot of servers and all external traffic first hits our main nginx proxy. We have one central server with nginx that proxies requests to the actual servers. So: nginx behind nginx:

server {
    listen 443;
    ssl on;
    ssl_certificate ...;
    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://some-internal-server-name/;
    }
}

Pretty standard “I listen on 443/https and proxy it on port 80 to some internal server” setup.

Works like a charm. Only drawback: gzipping does not work.

The reason? nginx defaults, in this case.

  • The gzip module has a gzip_http_version configuration parameter with a default of 1.1.

    Which means that http 1.0 requests are not gzipped, only 1.1.

  • The proxy module has a proxy_http_version configuration parameter with a default of 1.0.

    Which means that proxied requests are sent from the main proxy to the actual webserver with http 1.0.

These two don’t match. There are two solutions:

  • Set gzip_http_version 1.0 in the nginx configs on your webservers. This switches on gzip for the http 1.0 connections coming from the proxy.
  • Set proxy_http_version 1.1 on the main proxy so that it sends http 1.1 connections to the webservers.
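The second fix is a one-line addition on the main proxy. A sketch, reusing the example proxy block from above:

```nginx
server {
    listen 443;
    location / {
        # Talk http 1.1 to the backend so its gzip module kicks in.
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://some-internal-server-name/;
    }
}
```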

My choice originally was the first one. But a bug report came in for another site and now I’ve set proxy_http_version 1.1 on the main proxy so that all the sites get the benefit.

Note: you might want to make different choices. Perhaps you have a caching proxy halfway? Perhaps you want the main nginx on the proxy to do the gzipping for you? Etcetera. Check whether the above tips apply to your situation :-)

Buildout 2.5.0 has much nicer version conflict reporting


Tags: python, django, buildout

We use buildout for all our django projects. Nothing wrong with pip, but buildout has extension possibilities built in (for creating directories, installing user crontabs, local development checkouts and many more) that are quite helpful. And it works better when you need to use system packages (gdal, mapnik, etc).

One area where buildout could use some improvement was the version conflict reporting. Let’s say you have pinned django to 1.6.6 (old project that I’ll upgrade to 1.8 this week) and you add the django debug toolbar. This is the error you get:

The constraint, 1.6.6, is not consistent with the requirement, 'Django>=1.7'.
  Updating django.
Error: Bad constraint 1.6.6 Django>=1.7
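For reference, a minimal sketch of a buildout config that can trigger this kind of conflict (the part name and recipe layout are illustrative, not taken from an actual project):

```ini
[buildout]
parts = django

[versions]
Django = 1.6.6

[django]
recipe = djangorecipe
project = myproject
eggs =
    django-debug-toolbar
```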

First things first. An easy one is to improve the wording of the message:

  Installing django.
Error: The requirement ('Django>=1.7') is not allowed by
your [versions] constraint (1.6.6)

So there is some package that requires at least django 1.7. But which one? Buildout did not tell you. That would mean grepping through all your requirements’ sub-requirements for the package that actually requires the offending “django>=1.7”...

I’ve now added some internal logging that stores which package required which dependency. After an error occurs, the list is searched for possible matches.

With this change you’ll get a much more helpful output right before the error:

Installing django.
version and requirements information containing django:
  [versions] constraint on django: 1.6.6
  Base installation request: 'sso', 'djangorecipe'
  Requirement of djangorecipe==1.10: Django
  Requirement of djangorecipe==1.10: zc.recipe.egg
  Requirement of djangorecipe==1.10: zc.buildout
  Requirement of sso: django-nose
  Requirement of sso: django-mama-cas
  Requirement of sso: django-debug-toolbar
  Requirement of sso: django-auth-ldap
  Requirement of sso: Django<1.7,>=1.4.2
  Requirement of lizard-auth-server: django-nose
  Requirement of lizard-auth-server: django-extensions
  Requirement of lizard-auth-server: Django<1.7,>=1.6
  Requirement of django-nose: Django>=1.2
  Requirement of django-nose: nose>=1.2.1
  Requirement of django-mama-cas: requests==1.1.0
  Requirement of django-debug-toolbar: sqlparse
  Requirement of django-debug-toolbar: Django>=1.7
  Requirement of django-auth-ldap: python-ldap>=2.0
  Requirement of django-auth-ldap: django>=1.1
  Requirement of translations: Django>=1.4
  Requirement of django-extensions: six>=1.2
  Installing django.
Error: The requirement ('Django>=1.7') is not allowed by
your [versions] constraint (1.6.6)

This makes it much easier to spot the cause (in this case django-debug-toolbar).

There are some unrelated packages in here because I’m doing a textual comparison. The advantage is that it is very robust. And extracting the right package name from requirements without messing things up is harder to get right and takes more code.

So... if you use buildout, give version 2.5.0 a try!

Django under the hood: documentation workshop - Mikey Ariel


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Documentation used to be an afterthought of software delivery. Now it is a key component of the success of a software project.

Content strategy

“Content strategy” is originally a marketing term. Which is fine, as documentation is an important part of your project’s marketing!

The core is asking the right questions (even if the answer is simple).

  • Who are my readers? Sounds like a simple question. But... are your readers advanced users? Or beginners? Do you need “persona-based” documentation, so documentation for specific groups (“admins”, “developers”, etc)?

  • What do my readers want to know? Often your readers need context before they can understand reference documentation. Do you need an end-to-end tutorial? Or just explanations?

    Does the textual content need to be enhanced with video or diagrams, for instance?

  • When do my readers need the content? Installation documentation right at the beginning to get started? A reference guide when you’re already working with it? Tutorials for learning it?

    “When” also relates to “when do I need/want to update the documentation?”

  • Where do my readers consume the content? Do you need a “man” page? Embedded help in your GUI app? Good, helpful error messages? Online documentation that can be found by google?

  • Why do my readers even need this content? Minimize double work. Can you point at another project’s documentation or do you need to describe some feature yourself?

    Similarly, if you need to add documentation to work around bugs or things that don’t work right yet: should you not actually fix the code instead?

DevOps for docs

“Content strategy” leverages marketing concepts to make your documentation better. Likewise, “devops for docs” leverages engineering for your documentation.

  • Look for a unified toolchain. If possible, use the same tools as the developers of the project you’re documenting (especially if you’re both the developer and the documenter). Git, for instance. Don’t use google docs if the project uses git. By using the same kind of system, everybody can help each other.

  • Use out of the box documentation tools like asciidoctor, gitbook, MkDocs, sphinx.

  • Use continuous integration! Automatic checker for broken links, perhaps an automatic spell checker, automatic builds (like read the docs does).

    There are automatic checkers like “Hemingway” that can be used as a kind of text unit test.

    You can add custom checks like making sure your project name is always spelled correctly.

  • Iterative documentation. Dividing the work into sprints for instance if it is documentation for a big project. Use your issue tracker or trello or something like that to manage it.

Keep in mind: we’re all in this together. Designers, developers, product managers, quality assurance, support engineers, technical writers, users.

Docs or it didn’t happen

Some ideas.

  • Treat docs as a development requirement. Write it down in your contribution guidelines. Write down what your definition of “documented” is.

  • Contribution guidelines are a good idea! They’re an important part of your documentation in itself. Do you want people to help you? Write those guidelines.

    With contrib guidelines you can also steer the direction and the atmosphere of your project. If you suggest that 20% of a participant’s time is spent mentoring new contributors, you send a strong message that you’re a welcoming and helpful community, for instance.

    Also look at non-code areas. Do you want contributions from designers? Would you explicitly like someone to work only on the documentation side of things?

  • Provide templates. “If you add a new feature, use this template as a basis.” “Add a ‘version added’ link to your description.” Those kinds of helpful suggestions.

  • Contributing to documentation is a great (and often reasonably easy) way to get into contributing to a project as a whole.

  • Collaboration and training. Sprints and hackfests are great to get people started. There are communities and conferences. “Open help”, “write the docs”. Also mini-conferences inside bigger ones.

My recumbent bike in front of a station

Image: my recumbent bike in front of Troisvierges station in the north of Luxemburg, our startpoint this summer for cycling over the former ‘Vennbahn’ railway


Django under the hood: expressions - Josh Smeaton


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Josh Smeaton is a django core developer after his work on expressions.

What already existed for a long time in django are F expressions. They are used to send a computation to the database: a self-contained parcel of SQL, like “take the price field and add the shipping costs to it”. Later aggregations were added. Those are a bit the same, as they are “just a bit of sql” that gets sent to the database.

Expressions in django are now much more refined. Multiple database backend support. Deep integration in the ORM to make writing expressions yourself in django easier. It almost makes .extra() and .raw() obsolete.

  • .raw() is for writing an entire query in SQL. For those corner cases where you need to do weird tricks that the ORM doesn’t support.
  • .extra() is for appending bits of SQL to the rest of your django query. It is evil and should go away.

Both are escape hatches that are hardly ever needed. One problem with them is that they are database backend specific.

Some examples of where you can use expressions:

.create(..., username=Lower(username))
.annotate(title=F('price') + shipping)
.order_by(Coalesce('last_name', 'first_name'))

Batteries included! There are a couple of built-in functions like Coalesce, Concat, Lower, Upper.

Expressions can hide complexity. F(), Case(), When(). F() can refer to fields added by an aggregate, for instance. That goes much deeper than you could do with some custom .extra() SQL. And Case()/When() can be used to select different values out of the database depending on other values.

There are building blocks: Aggregate(), Func(), Value(). You can use those to make your own expressions.

Expressions in django 1.8 now have a proper public API with documentation. The expressions are composable: Sum(F('talk') + F('hold')) + Sum('wrap'). And the internals of the whole ORM are greatly simplified.

There’s one thing you can do with .extra() but not with expressions: custom joins. There are also still a few small bugs as the functionality is still pretty new. They’ve all been easy to fix till now.

He then showed some examples of using expressions and writing your own. Looks nice!

Steam loco loading coal

Image: German type-86 loco taking on coal on my in-progress ‘Eifelburgenbahn’ layout.


Django under the hood: documentation systems - Eric Holscher


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Note by Reinout: I hope all the examples are rendered OK as I’m writing this blog with... sphinx and restructured text :-)

Eric Holscher started Read the Docs a few years ago during a “django dash” weekend. That site is widely used and even gave rise to “write the docs” conferences and a community around technical writing.

Both Read the Docs and the django documentation use sphinx.

A documentation system? It is not just a single documentation page on one item. It is documentation for a whole project. Sphinx extends RST (restructured text) with additions for documenting software. It also adds semantic meaning.

Lot of documentation is written in markdown nowadays. A link to pep8 in markdown would be a simple link to pep8 on the internet. In sphinx/rst it could be:

Check out :pep:`8`

He becomes sad if he sees documentation written in markdown.

“Read the docs” is built on “sphinx”, which is built on “docutils”, which is built on “restructured text”. Here are some of the internal concepts of RST:

  • A reader reads input.

  • A parser takes the input and actually turns it into a “Doctree”. RST is the only parser implemented in docutils. It handles directives, multi-line inputs, etc.

  • The “doctree” is like an AST for docutils. This is the basis for everything else. The tree consists of nodes.

  • Nodes can be structural elements like document, section, sidebar. Body elements like paragraph, image, note. Inline elements like emphasis, strong, subscript.

    The most common types of nodes are “Text nodes”.

  • RST “directives” are the most common extension mechanism. It allows block level extension of RST.

  • You can also use “RST interpreted text roles”. It allows paragraph-level extension of RST. :pep:`8` is an example.

RST is a really neat language. Some directives are tied to RST because of the way it is parsed, which makes it hard to re-use things. We need to think about how to port this to other parsers, so that we could use/extend markdown too; as it is, we might be too tied to RST.

Once the parser is ready, you can start doing things with it.

  • “Transformers” take the doctree and modify it in place. It allows for full knowledge of the tree, for instance for generating a TOC.
  • “Visitors” allow you to quickly add arbitrary node types. You implement a visitor with a visit_yournodename() and a depart_yournodename() function that outputs content.
  • “Translators” convert the doctree back to an output type. Html, pdf, etcetera.
  • “Writers” are used to actually write the translated items to disk.

So... docutils READS the document, PARSES it, TRANSFORMS it and TRANSLATES/WRITES it to output.
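The visitor mechanism is easiest to see in a toy version. This is not docutils code, just a minimal sketch of the visit_*/depart_* dispatch idea that docutils visitors use:

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = children

    def walk(self, visitor):
        # Dispatch to visit_<name>/depart_<name> on the visitor,
        # the same naming convention docutils visitors follow.
        getattr(visitor, "visit_" + self.name)(self)
        for child in self.children:
            child.walk(visitor)
        getattr(visitor, "depart_" + self.name)(self)


class HtmlishVisitor:
    # Toy "translator": emits html-ish output while walking the tree.
    def __init__(self):
        self.out = []

    def visit_paragraph(self, node):
        self.out.append("<p>")

    def depart_paragraph(self, node):
        self.out.append("</p>")

    def visit_text(self, node):
        self.out.append(node.name)  # toy: the node name stands in for text

    def depart_text(self, node):
        pass


tree = Node("paragraph", [Node("text")])
visitor = HtmlishVisitor()
tree.walk(visitor)
# visitor.out now holds the output pieces in document order
```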

On top of RST you have sphinx.

  • The sphinx “application” is the central part that steers the entire process.

  • The sphinx “environment” keeps state for all the files for a project. It is serialized to disk in-between runs.

    It is cached as pickles between runs. This makes re-building the documentation much faster if you’ve only changed one file.

  • “Builders” are wrappers around docutils’ writers. They generate all types of outputs. Most HTML output is generated through Jinja templates instead of using Translators.

Sphinx has lots of events like source-read, doctree-read, env-updated etcetera. If you want to extend sphinx, this is the place to start.

Some extension examples:

  • One of the extensions they made themselves for read the docs is markdown support. They used recommonmark for that. Recommonmark’s node class is mapped to a node class that is understood by sphinx. Markdown files can even co-exist with RST files inside the same set of documents.

    The drawback of markdown is that it lacks ways to extend the language. There is a proposal for markdown inline markup, though. That would make it possible to support more of the RST features in markdown.

  • Table of contents. They implemented a “pending” node that can be filled later in the sphinx rendering process with the actual table of contents.

  • References can refer to anything elsewhere in one of the other pages. A transformation later on in the process resolves the references.

    With the proper setup, you can even reference items in other sphinx documents, like pointing at Django’s documentation on some subject. Google for “intersphinx_mapping” if you need it. It works wonderfully.

Django uses sphinx in a slightly different way.

  • All documentation is written in RST, but the HTML is generated as JSON blobs (!!!). It is rendered through django templates on the website.

  • It has some django-specific additions, like roles for directly linking to settings or tickets (:setting:`...` and :ticket:`...` style references).

    He showed the implementation of this feature.

Some take-aways:

  • Make sure to use semantic markup when writing docs. You can write down more information about what is going on in your brain.

  • Generally your job is to get the nodes to exist in the way that you want. So when you write extensions, keep the node tree in mind.

    You can run make pseudoxml to get a pseudo-xml output to show you the sphinx node tree view of your document to help you with this.

  • Understand where you need to plug into the pipeline and do as little as possible to make it happen.

vegetation on my model railroad

Image: attempting to get the vegetation colors right on my model railroad. The example photo is from Ulmen, Germany, in november 2000.


Django under the hood: Django security - Florian Apolloner


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Florian Apolloner talks about django security. Django security, so it won’t be about attacks against the SSL protocol, as those are outside of django.

In case you think you’ve found a security bug in django: look at django’s security reporting instructions and only use the dedicated security address. Don’t report such a bug publicly, as that makes it much harder to make and distribute a proper fix.

Regarding security: look at the owasp top 10 list of the most commonly found vulnerabilities in websites.

SQL/SMTP/OS injections

Basic rule: don’t ever trust user input. Everything the user can input into your web interface is to be treated as dangerous. select * from auth_user where username=%s is easy to exploit if you interpolate a username string typed by the user directly.

If you use django, you don’t run much risk, as django does the right thing internally regarding escaping.

Defense in layers is best. If you limit the parameter to an integer in your url patterns and select users by id instead of a string, you already prevent many problems.

The same goes for OS interaction. Use Django components instead of rolling your own storage or email. Django’s components are secure. If there’s no django component to do it, like ldap authentication: watch out as you’re on your own.

Generally: string interpolations are bad.

To repeat: do not ever trust user input. This includes everything the user sends, including http headers and filenames/content types in uploads.

Authentication and session management

Basic rule: use what django provides. Django does a lot to keep you safe. He showed some examples.

Passwords are stored hashed. Multiple algorithms are available out of the box. Iteration counts are increased every release. Upgrading to new algorithms is always possible.

Since django 1.9 you can have password validators. Length checks, numeric characters, common words. And you can add your own. Please use them.

Django allows password reset links to be used. For this, nothing needs to be stored on the server. A link is mailed to the user. The link can be used once and only once to reset the password.

The link includes a user ID, a timestamp and an HMAC hash of your last login time and two other items. You can look at django.core.signing.* if you need to roll your own.
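As a sketch of the underlying idea only (this is NOT django’s actual implementation; the secret, fields and format are made up for illustration):

```python
import hashlib
import hmac

SECRET_KEY = b"not-a-real-secret"  # stands in for settings.SECRET_KEY


def make_reset_token(user_id, last_login):
    # Hash server-side state that changes once the user logs in again
    # after the reset, so the emailed link only works once.
    message = f"{user_id}:{last_login}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


token = make_reset_token(42, "2015-11-01 09:00")
# After a new login, last_login changes and the old token no longer matches:
assert token != make_reset_token(42, "2015-11-23 10:54")
```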

Cross site scripting (XSS)

A lot of XSS is prevented by django’s auto-escaping. But: it is HTML only. It replaces < > ' " with entities. Always use quotes around attributes.

Javascript requires different escaping! var mystr="{{ value|escapejs }}".

The canonical XSS attack results in something like data="</script><script>alert('xss')'//".

If you want to insert json inside a template:

var json = JSON.parse('{{ data|escapejs }}');

Or you can use django-argonauts:

var json = {{ data|json }};

Defense in depth: you can enable django’s XSS protection. It enables a http header that tells your browser to be extra picky about what it allows: no inline js, no event handlers.

The most important thing: check your libraries and your own code. Many people just do mark_safe(json.dumps(...)). Many many people.
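The fix for that pattern is to escape the characters that could break out of the surrounding script tag. A minimal sketch of the idea (not any particular library’s implementation):

```python
import json


def json_for_script(data):
    # Escape the characters that could close a </script> block.
    # The \u00XX escapes are still valid JSON, so JSON.parse (and
    # json.loads) read the original value back unchanged.
    return (json.dumps(data)
            .replace("&", "\\u0026")
            .replace("<", "\\u003C")
            .replace(">", "\\u003E"))


payload = json_for_script({"evil": "</script><script>alert('xss')"})
assert "</script>" not in payload
assert json.loads(payload) == {"evil": "</script><script>alert('xss')"}
```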

Cross site request forgery (CSRF)

Basically: an <img src="..."> on an attacker’s page, pointing at a state-changing URL on your site.

Django’s CSRF protection is enabled by default. It protects your site against an attacker sending one of your users to your site with a hurtful request. Django protects against it by generating a random value in the form and setting one in your cookie. If you go directly to the form and submit it, the values match. If you come from a different site, they don’t.

Unvalidated redirects and forwards

This means requests like /auth/login/?next=<some external site>. After logging in, you end up on the attacker’s site. So use django.utils.http.is_safe_url(). It contains more comments than code, which means that it is code that is hard to get right :-)
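The core idea can be sketched in a few lines. This toy version skips the edge cases (backslash tricks, odd schemes, and more) that django’s real is_safe_url handles, which is exactly why you should use django’s version:

```python
from urllib.parse import urlparse


def is_local_redirect(url, allowed_host):
    # Minimal sketch: only allow redirects that stay on our own host.
    parsed = urlparse(url)
    if parsed.scheme not in ("", "http", "https"):
        return False
    return parsed.netloc in ("", allowed_host)


assert is_local_redirect("/dashboard/", "example.com")
assert not is_local_redirect("http://evil.example/x", "example.com")
assert not is_local_redirect("//evil.example/x", "example.com")
```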

Security checklist

Run python manage.py check --deploy and try to get everything right. It checks for security settings that you might have missed.


What more could we do in Django?

  • Rate limiting for login attempts and the like.
  • Two factor authentication. (TOTP and U2F as reference implementations).
  • CSRF improvements (#16859).
  • JSON filter for templates. This will be added to django core.
  • Enhance SecurityMiddleware.
  • Implement Content-Security-Policy.
  • Limit POST/GET data length.

Repairing a locomotive

Image: repairing a locomotive


Django under the hood: files in Django - James Aylett


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

James Aylett talks about files in django. You’ve got Files in python. Django built its own abstraction on top of it: File, ImageFile. Separate ones for use in tests. UploadedFile (“behaves something like a file object”, the documentation mentions...). There are temporary file and memory variants. Custom upload handlers. forms.FileField. It might not be perfect, but it works.

Files in the ORM: what gets stored in the database is a path to a file, the file is stored on the filesystem. If you store an ImageField, you can query the width and the height of the image. You’re better off storing the width and height in the database, though, as otherwise the image has to be read from disk on every request.

“Files are stored on the filesystem”? They are stored in settings.MEDIA_ROOT by default. Storing is done by a storage backend. You can replace it to get different behaviour. You can use a different storage backend by configuring it in the settings file. Or you can override it on a field-by-field basis.

Different storage backends? You can store data in amazon S3, for instance.

If you have a reusable app that works with files, please test on both windows and linux. And test with something remote like working with S3.

Static/media files. Originally, you only had “media” files. Since django 1.3, you also have static files: non-user-uploaded files such as your apps’ javascript/css. Splitting “media files” and “static files” is a good thing.

There’s a complication: CachedStaticFilesStorage (or ManifestStaticFilesStorage in django 1.7). It adds hashes of the files to the filename to allow them to be cached forever. Great system. Best practice. But it depends on everything using the {% static %} template tag in a very neat way. Otherwise you have cached-forever files that you want to change anyway...
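The hashing idea itself is simple. A sketch (not django’s actual code, which also rewrites references inside css files):

```python
import hashlib


def hashed_name(path, content):
    # Embed a hash of the file's content in the filename. The file can
    # then be cached forever: changed content gets a brand-new name.
    digest = hashlib.md5(content).hexdigest()[:12]
    base, _, extension = path.rpartition(".")
    return f"{base}.{digest}.{extension}"


name = hashed_name("css/site.css", b"body { color: black; }")
# name is now something like 'css/site.<12 hex chars>.css'
```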

Asset pipelines. Not many people write their css and javascript in one single file. You split it over multiple files. Or you compile coffeescript to javascript. Or you use a program (webpack or browserify for instance) to combine various files into one big one. This is great for minification and caching. You do probably need “source maps” to help your browser debug tools refer back to the original files.

(Note by Reinout: search for an explanation of source maps, they’re worth understanding!)

Now... how do you get this into your django template? Either your combiner has to read your html code and write it back again. Or you write custom code to do things like {% asset 'my-js.js' %}.

For an example of an alternative you could look at Rails’ Sprockets. Sprockets manages the entire pipeline and can touch every file and manage everything.

In the node.js world, it is common for the web code not to touch the pipeline. They’re separate. Webpack is an interesting one. Also “gulp” which defines the pipeline in a program. This means it can be customized a lot.

For django, it is good to be compatible with what node.js is doing.

What we’d ideally need:

  • Use a pipeline external to django.
  • Hashes computed by staticfiles.
  • Sourcemap support.

If you want to use webpack, you could look at webpack-bundle-tracker and django-webpack-loader. The pipeline is run by webpack and it emits a mapping file. There is a template tag to resolve the bundle name to a URL relative to STATIC_ROOT.

Tip: many people know the “can i use” site for checking browser support of a feature. There are also tools that look at it the other way around: they look at your website and figure out which things you’re using that lead to problems in browsers you care about.


Image: cat cleaning itself on the valve of a water tower, picture of my in-progress ‘Eifelburgenbahn’ layout.


Django under the hood: twisted and django - Amber Brown


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Amber Brown is a Twisted core developer. This talk’s summary will be:

>>> Django == good
>>> Twisted == good

And everything will be better if we work together.

Synchronous and asynchronous. Synchronous code returns its result inline. Asynchronous code returns its result possibly at a later time. The extra complication is that IO is often blocking.

Twisted is asynchronous. Regular python code like socket.write() is blocking. Twisted has its own socket implementation that calls python’s behind the scenes. In user code you should only use twisted’s version: then your code is async and doesn’t block on IO.

At the core there’s always something that tries to read/write data. But we normally work at a higher level, so the protocols that we actually use are built upon the lower-level read/write connection.

A lot of the async code works with callbacks. You call a function and pass in a second function that gets called with the result when the first function is ready. Twisted uses it often like this:

>>> from twisted.internet.defer import Deferred
>>> deferred = Deferred()
>>> deferred.addCallback(lambda t: t + 1)
<Deferred at ...>
>>> deferred.addCallback(print)
<Deferred at ...>
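The chaining can be illustrated without Twisted. A toy stand-in (much simpler than Twisted’s real Deferred, which also handles errbacks, chained deferreds, and more):

```python
class TinyDeferred:
    """Toy version of a Deferred: each callback receives the
    previous callback's return value."""

    def __init__(self):
        self.callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, fn):
        if self.called:
            # The result already arrived: run the callback immediately.
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        # The result arrives (e.g. data came in over the network):
        # run the whole chain.
        self.called = True
        self.result = result
        for fn in self.callbacks:
            self.result = fn(self.result)

d = TinyDeferred()
d.addCallback(lambda t: t + 1)
d.addCallback(lambda t: t * 2)
d.callback(1)
print(d.result)  # → 4
```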

They’re trying to use more of the recent python 3 syntax goodness to make working with this easier. Generators, yield from, etcetera.

Now on to django.

Django does blocking IO. Making this asynchronous is hard/impossible. Everything has to cooperate or everything falls apart. It is hard/impossible to “bolt it on” afterwards.

People used to think “sync=easy, async=hard”. That’s not the case, though. Both have their own advantages and drawbacks:

  • Sync code is easy to follow. One thing happens after the other. A drawback is that you can only do one thing at once. Persistent connections are hard.
  • Async code is massively scalable. Handling persistent/evented connections is super easy. Python 3 adds syntactic sugar that makes it easier to write. A drawback is that you can get into “callback hell”. You also have to be a good citizen: blocking in the reactor loop is disastrous for performance.

A common way of running django is a threaded WSGI server. Each thread can be blocking on IO, but you have lots of them. You could look at hendrix, a WSGI server that can run django and that also includes websocket support.

There’s something new for django: django channels. Requests and websockets are now events that can be fed via channels to queues. Workers can grab work from the queue. When ready, the channel feeds it back to something on the other side. It supports websockets.

With django-channels you can use the @consumer('django.wsgi.request') decorator to subscribe to some queue.
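The message flow can be sketched with plain python queues (only an illustration of the idea, not django-channels’ actual API):

```python
import queue

channel = queue.Queue()        # requests go in as plain messages
reply_channel = queue.Queue()  # responses travel back the same way

def handle_request(message):
    # Ordinary synchronous handler code, run by a worker.
    return "Hello, " + message["path"]

# The interface layer turns a request into a message on the channel...
channel.put({"path": "/about/", "reply": reply_channel})

# ...and a worker (possibly a separate process) picks it up and
# sends the response back over the reply channel.
message = channel.get()
message["reply"].put(handle_request(message))

print(reply_channel.get())  # → Hello, /about/
```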

It doesn’t really make django code asynchronous. It is “only” a way to use synchronous code in an async way. But that might just be enough! It is a big improvement for django and it is better than the current approach. There are talks of integrating django-channels in django core when it is polished a bit more.

But: adopting an asynchronous framework (=twisted) is a long-term way forward. Otherwise we keep bolting patches on a request/response mechanism that isn’t very well suited to the modern web.

Then we got shown an example of django running with an async ORM and handling requests in an async way. It was a quick hack with lots of bits missing, but it did work. It would probably work very fast on pypy.

Python 3.3 added the “yield from” syntax, which lets a generator delegate to another generator and receive its return value. Python 3.5 has even more goodies like async and await.
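A quick illustration of “yield from” delegating to a sub-generator and picking up its return value:

```python
def fetch():
    # A coroutine-style generator: 'return' hands a value back to
    # whoever delegated to us with 'yield from'.
    yield "waiting"
    return 42

def caller():
    result = yield from fetch()  # runs fetch() to completion
    yield "got %s" % result

gen = caller()
print(next(gen))  # → waiting
print(next(gen))  # → got 42
```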

Twisted is trying to modernize itself and to get more people on board: a django-style deprecation policy, removing python 2.6 support, using new python 3.4+ features.

Django should run on twisted!

And what about greenlets? Bad... Just read


Image: television interview with not-quite-completely-painted scale figures on my in-progress ‘Eifelburgenbahn’ 1:87 railway layout.


Django under the hood: keynote - Russell Keith-Magee


Tags: django, djangocon

(One of my summaries of a talk at the 2015 django under the hood conference).

Russell Keith-Magee started by showing a lot of commit messages to show Django’s history. There are weird and humorous ones among them.

Many have to do with Malcolm Tredinnick. Like Malcolm’s insistence on auto-escaping in templates to make them safe. And the removal of whitespace at the end of lines.

Stories about bugs that only surfaced on the first day of the month if UTC had not yet rolled over. And only if the previous month had 31 days... Oh, and a set of commits done by a person who was sentenced to community service!

There was a fine collection of weird problems. “Fixed #16809 – Forced MySQL to behave like a database”.

Now we come to the present. There are some technical threats like real-time and async code. Technical challenges can be met. There is a bigger risk, though: the social aspect. The low-hanging fruit in django has all been picked. What is left are really hard, big problems. Often, only core committers do that kind of work. But those are the ones that already do a lot of work. We need new people stepping up. Those need to be mentored. By the same people that already do a lot of work. There are some initial efforts at paying people to work on django, but that’s a topic for an entirely different talk.

Bringing in new people means new people in the community. What is the community like? Can it cope with new people? How is the atmosphere? Do technical debates between old community colleagues flare up into wars? Or do they not? There has been a big debate on the code of conduct with all the expected arguments. In the end the code of conduct is now just accepted practice, but it took quite some work and flak.

Django is an incredible project. With a great community. And it has already existed for 10 years! Which is an achievement in itself.

Russell mentored some people; he gave an example earlier. Malcolm was called out as someone who especially mentored, welcomed, and helped people. Russell asked us to follow Malcolm’s example: be welcoming, help people, and share knowledge. Start at this conference!


Image: German ‘Schienenbus’ railcar leaving the Monreal tunnel on my in-progress ‘Eifelburgenbahn’ layout


About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.

Weblog feeds

Most of my website content is in my weblog. You can keep up to date by subscribing to the automatic feeds (for instance with Google reader):