Django and pypy - Alex Gaynor (djangocon.eu)

Tags: djangocon, django

Alex Gaynor tells us about pypy, the fastest python available. Alex is both a django and a pypy core developer.

Alex likes making code fast. And making EVERYONE’s code faster by writing a faster python is best. The “regular” python that most people see is actually cpython. There are other implementations too: jython and ironpython and …

Pypy is python 2.7.1 written in python. It is a successor to psyco, optimizing things that psyco itself couldn’t do as it relied on cpython. Pypy is fast: at least twice as fast on all but one benchmark, and sometimes 30 times as fast. See http://speed.pypy.org . And see the django benchmark.

Pypy works as a JIT (just in time) compiler. It finds small bits of code that are executed often and optimizes them into assembly.
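As a rough illustration of the kind of code a JIT can help with, here is a small, frequently executed loop. The function name sum_of_squares is made up for this sketch and is not from the talk. Run the same file under cpython and under pypy to compare; no timings are given here as they depend on your machine and interpreter.

```python
import timeit


def sum_of_squares(n):
    """Add up i*i for i in range(n) with a plain python loop."""
    total = 0
    for i in range(n):
        # This loop body runs n times: a typical "hot" bit of code.
        total += i * i
    return total


if __name__ == "__main__":
    # Time ten calls; compare the number between interpreters yourself.
    seconds = timeit.timeit(lambda: sum_of_squares(1000000), number=10)
    print("10 runs took %.3f seconds" % seconds)
```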

With a speed increase like this, why isn’t everyone using pypy? The main reason: c extensions. These are libraries that are tightly coupled to the python C api, which is full of cpython internals. Some solutions:

  • CPyExt. For instance PIL compiles under this. But it is slow and it only works with very well-behaved C extensions.

  • Pure python / ctypes. Many C extensions are there for speed reasons. Pypy removes most of this need: the pure python version often runs just as fast. Alternatively you can use ctypes (part of the standard library), which lets you call regular shared libraries directly. Ctypes can be slow under cpython, of all places, but for pypy it is no real problem.

  • Cython. It can also be slow. There’s a google summer of code project to make it faster for pypy.
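To give an idea of the ctypes approach mentioned above, here is a minimal sketch that calls strlen from the system C library without any C extension code. It assumes a POSIX system where the C library can be found; the helper name c_strlen is made up for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Tell ctypes about the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the length in bytes of text, as counted by C's strlen."""
    return libc.strlen(text.encode("utf-8"))
```

The same pattern (load the shared library, declare argtypes and restype, call the function) applies to other shared libraries, though real bindings need much more care with pointers and memory ownership.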

The second reason is that web apps, for instance, are perceived to be IO bound, so a faster python wouldn’t matter. In reality, the CPU time is still significant. For instance, people are looking at replacing django’s template language with jinja2 to get more speed! So speed is still important.

How to use pypy? Well, just use it. manage.py runserver will work. For server side work, take any of the pure python wsgi servers. So not mod_wsgi as it is a big C extension, but use gunicorn for instance.
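For reference, this is roughly what a pure python wsgi server needs from you: a callable that follows the wsgi interface. The sketch below is a stand-alone hello world, not a django project; the file name hello.py in the note afterwards is hypothetical.

```python
def application(environ, start_response):
    """Smallest useful wsgi app: always answer with a plain text body."""
    body = b"Hello from a pure python wsgi app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # wsgi expects an iterable of byte strings.
    return [body]
```

Saved as hello.py, this could be served with "gunicorn hello:application", assuming gunicorn is installed into the pypy environment. For a django site you would point gunicorn at the project's own wsgi callable instead.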

You can use any database you like, but the adapter must work with both django and pypy. Sqlite works fine, as the driver is in the standard library. Postgresql is harder, which is bad as postgres is the best available database. You currently have to compile your own pypy and use RPython’s psycopg2. Mysql is very hard. For oracle there’s an RPython cx_oracle, but Alex doesn’t know anything about it.
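A quick way to check that the sqlite driver is available in your interpreter is to use the standard library's sqlite3 module directly. The sketch below uses an in-memory database; the function name, table name and columns are made up for the example.

```python
import sqlite3


def sqlite_roundtrip():
    """Create an in-memory database, store one row and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE talk (title TEXT, speaker TEXT)")
        conn.execute(
            "INSERT INTO talk VALUES (?, ?)",
            ("Django and pypy", "Alex Gaynor"),
        )
        conn.commit()
        return conn.execute("SELECT title, speaker FROM talk").fetchall()
    finally:
        conn.close()
```

If this runs without an import error under pypy, django's sqlite backend has the driver it needs.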

Some libraries:

  • PIL works under CPyExt.

  • lxml doesn’t work. Big c-api extension. Could be a good GSOC project to get this to run.

  • Others? Well, there are a lot. Talk to Alex.

Regarding memory, pypy is a mixed bag. Some apps use more, some use less. So benchmark your own app.

If you want your app to run faster: talk to Alex Gaynor!

 

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
