2012-05-23
My django book project just got cancelled by the pragmatic programmers. For a very good and valid reason: it just took too much time and effort to get the chapters completed. Most of that time was spent editing chapters into a form that fit the pragmatic programmer style.
My main writing problem: getting side-tracked with details. I just try to cram too much into the book. When I can explain something that’s marginally related to the real subject of a sentence/paragraph/chapter, I will. And, boom, another paragraph gets too hard to understand.
The current result? I’ve got 140 pages out of an estimated 250. That’s a hefty pile of paper when I print it out. I’ve had colleagues look in awe :-) The text is not at the pragprog quality level, but I’d say that a lot is quite OK.
So... what now? I don’t know yet. I don’t want the effort to go to waste. I might copy stuff into my blog. I might continue writing by myself. Self-publish.
For that, one of the first things I’m doing is converting my text from pragprog’s format to either LaTeX or restructured text. And changing all those Ugly Headers With Title Case to regular sentence case :-)
But first I’m taking some time off. Time to read a book. Time to fix up the house after last summer’s move.
Wow, I certainly put in a massive amount of time... But also my editor and my co-author. Thanks a lot for their effort!
This might take some getting used to.

2012-05-22
I like cycling a lot. I cycle to my work, for instance.
I also like watching cycling on television. This year I watched most spring classics (Flanders, Amstel gold race, that sort of stuff) and I’m paying attention to the Giro now. And I managed to find a youtube video of the last hour of stage 7 of the tour of California, the stage up Mount Baldy. I watched the last 7 km or so of that stage, where Dutchman Robert Gesink won :-)
Last Thursday I did a short ride in Nieuwegein with my family to an ice cream shop. And got stopped at a through road because of... a passing cycling race. Olympia’s tour of the Netherlands. Always funny to see 100-something cyclists racing past. So much speed.


2012-05-21
Last week we had an internal sprint week at Nelen & Schuurmans to do lots of work on Lizard (http://lizard.org).
A django-technical characteristic of lizard is that it is layered:
The big change this sprint was in lizard-ui:
These two changes also meant we had to update many apps to actually use the class based view and to simplify their templates.
Examples of other changes we made this week:
We also experimented with javascript graphs instead of our regular matplotlib ones, but that wasn’t finished yet.
Not everything’s right yet. Some functionality is broken and needs fixing. Here’s part of our TODO list:
So... onwards to fix the issues and we ought to have a great new layout to preview on next month’s Dutch lizard day.

2012-05-15
In rare cases, you can import a file that imports the file you’re importing from. This might sound a bit recursive, and it is. In Python it is called a circular import error.
The best way to recognize a circular import that goes wrong is to look at your results. You will get an ImportError message from Django, like cannot import some_view from your_project.views. When you look in your_project/views.py you see that some_view really exists in that very same file. Huh? That wasn’t expected.
This huh??? is exactly what a circular import looks like. Something that exists (you checked it five times at least) doesn’t seem to exist. A computer doesn’t lie, but you start to wonder.
The problem is, Python cannot complete the import, so it complains about the point where it goes wrong, even though the actual error is the circular import loop as a whole.
The solution is to break the loop somewhere. Perhaps the thing you want to import is best placed somewhere else? Most often, a circular import error indicates an organization problem in your code.
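A minimal sketch of how such a loop goes wrong. The two module names, demo_views and demo_urls, are made up for this illustration and are written to a temporary directory so that the example is self-contained; in a real Django project the loop is usually spread over more files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules: each one imports a name from the other.
VIEWS_SOURCE = (
    "from demo_urls import urlpatterns\n"
    "\n"
    "def some_view():\n"
    "    return 'hello'\n"
)
URLS_SOURCE = (
    "from demo_views import some_view\n"
    "\n"
    "urlpatterns = [some_view]\n"
)


def trigger_circular_import():
    """Import demo_views and return the ImportError message, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_views.py").write_text(VIEWS_SOURCE)
        Path(tmp, "demo_urls.py").write_text(URLS_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # demo_views starts loading, asks for demo_urls, which in turn
            # asks for some_view before demo_views has defined it.
            importlib.import_module("demo_views")
        except ImportError as error:
            return str(error)
        finally:
            # Clean up so the demo leaves no traces behind.
            sys.path.remove(tmp)
            sys.modules.pop("demo_views", None)
            sys.modules.pop("demo_urls", None)
    return None


message = trigger_circular_import()
print(message)
```

The message complains about some_view, even though some_view is right there in demo_views.py. In this toy case, one way to break the loop would be to move some_view into a third module that both files import from.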
2012-05-11
His talk is titled “space invaders, relational modeling and domain models, a mashup” instead of just the plain sqlalchemy that I expected :-)
He started his computer career with a TRS-80 in 1980. The killer app at that time (5th grade school) was printing out a picture of Snoopy on the matrix printer.
Next up an Atari 800. There was a great ‘vertical scroller’ game and he wanted to build his own. But you needed to write assembly to pull it off which was way over his head. Machine level programming is depressing.
Somewhere in the 90s was, for him, the time of architecture astronauts and enterprise java beans. 1200 database tables for reading a big SGML document, that sort of nightmare. He was made to use the pure waterfall method. Spending a month only drawing Rational Rose UML diagrams. Also depressing.
But he did finally understand objects and liked them. A domain model is important.
He showed wordpress’ comment edit code. PHP code; hacked often. Hardcoded ad-hoc SQL code to delete comments, embedded right in the PHP code. Comments are only an integer ID, not real comment objects, of course. In the end, he showed the entire PHP page in 3px font size: HTML, SQL and PHP all mixed up in one file. Horrid.
You basically give an integer ID through a couple of calls all the way through the application. Only valid for small scripts, not really for a system the size of wordpress. Can we call it the water slide model? One straight flow down from an ID to some action.
A domain model is “a conceptual model of a domain of interest”, according to wikipedia. You model the problem, agnostic of implementation details. And we create explicit adaptions between the outside world and the model (adapters, serializers, views, etc: separation of concerns). And ideally we only use neat normalized data.
To him, dealing with such a domain model feels a bit like a horizontal pipeline. (Reinout: not sure what he means by this).
At the core of relational databases is the relational model. Information represented as collections of rows, consisting of columns. Mostly ACID.
Essential to relational databases are joins: intersecting or joining data. SQL, the language, is a declarative language (as opposed to imperative or functional, see stackoverflow).
How do we reconcile the SQL/relational model with the domain model? He thinks an ORM, object relational mapper, is a good idea. An ORM needs to map objects and collections to the tables, rows and columns of the database, including relations.
Keeping the “R is for relation” intact is a real challenge. There are several ORMs that don’t expose enough of the available relations. Sqlalchemy does expose them: you can keep thinking in terms of your database and its tables and relations that you know.
Another challenge is to limit the number of queries. For instance, when you iterate over some set of database objects (rows) in Python, there is a big risk that you do a follow-up query for every row. So you need to be able to give the ORM some instructions for loading that extra related data directly. Sqlalchemy helps you with this: this way you get the luxury of a nice ORM, but you still have well-optimized SQL.
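To make the follow-up-query risk concrete, here is a small sketch that uses only Python's built-in sqlite3 module instead of sqlalchemy. The author and book tables and the run() helper are made up for the illustration. The point is only the query count: one query plus one per row versus a single join, which is the kind of SQL that an ORM's eager loading instructions (sqlalchemy's joinedload option, for instance) aim to produce for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author (id)
    );
    INSERT INTO author (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book (id, title, author_id)
        VALUES (1, 'First', 1), (2, 'Second', 1), (3, 'Third', 2);
""")

query_log = []


def run(sql, params=()):
    """Execute a query and remember that we did, so we can count them."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_followup_queries():
    result = []
    for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
        # One extra query per row: this is the pattern to avoid.
        (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        result.append((title, name))
    return result


def titles_with_join():
    # The related data is loaded directly, in the same query.
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )


query_log.clear()
naive_result = titles_with_followup_queries()
naive_count = len(query_log)

query_log.clear()
joined_result = titles_with_join()
joined_count = len(query_log)

print(naive_count, joined_count)  # prints: 4 1
```

Both functions return the same rows; with three books the difference is small, but the first variant grows by one query for every extra row.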
Mashup? Yep, he wrote a space invaders clone that stores all the game data in an SQL database (sqlite, in-memory)! Idiotic example, but loads of fun.
The domain model includes missile, player, splat, enemy, army and saucer glyphs. Every glyph has a coordinate.
The code has a main loop with update_state() that moves the missiles, player, etcetera and draw() that draws everything. Both query the database! Are there coordinate overlaps between missiles and a saucer? Boom! You can finally get an explosion out of a database query. And drawing means querying for all coordinates and rendering what’s there. Wow.
Losing is also determined with an SQL query: an enemy coordinate matches the player coordinate. Or an enemy coordinate matches the bottom row coordinate :-)
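I don't have his actual code, so the following is only a guess at what such queries could look like, again with plain sqlite3 and an in-memory database. The glyph table, its columns and the two function names are assumptions of mine, not the speaker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE glyph (
        id INTEGER PRIMARY KEY,
        kind TEXT,
        x INTEGER,
        y INTEGER
    );
    INSERT INTO glyph (kind, x, y) VALUES
        ('player', 5, 9),
        ('missile', 3, 2),
        ('missile', 7, 4),
        ('saucer', 3, 2),
        ('enemy', 5, 6);
""")


def find_hits(conn):
    """Return (missile id, target id, target kind) for overlapping coordinates."""
    return conn.execute("""
        SELECT missile.id, target.id, target.kind
        FROM glyph AS missile
        JOIN glyph AS target
            ON target.x = missile.x AND target.y = missile.y
        WHERE missile.kind = 'missile'
          AND target.kind IN ('saucer', 'enemy')
        ORDER BY missile.id
    """).fetchall()


def player_lost(conn, bottom_row=9):
    """True if an enemy reached the bottom row or sits on the player."""
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM glyph AS enemy
        LEFT JOIN glyph AS player ON player.kind = 'player'
        WHERE enemy.kind = 'enemy'
          AND (enemy.y = ?
               OR (enemy.x = player.x AND enemy.y = player.y))
    """, (bottom_row,)).fetchone()
    return count > 0


print(find_hits(conn))    # the missile at (3, 2) hits the saucer
print(player_lost(conn))  # False: the enemy is still at row 6
```

The collision check is a self-join on matching coordinates, which fits the "explosion out of a database query" idea nicely.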
Then he started playing the game in one screen with a log file showing all the SQL queries live in another screen. Fun!

2012-05-11
I also gave a talk at http://pygrunn.nl on development laptop setup: explicit automation. I don’t have a summary of my own talk, but I have two other things for you:
Later on I’ll try to document my dotfiles, but you can already find a lot of ideas and tips online for bash settings, for instance.
2012-05-11
He had to show lots of markers on a google map, which often led to bad performance. And he wanted real-time updates. And... the client needs to be responsive!
A marker is a DOM element, but also a shadow, a click area, an icon image and so on. Which takes a lot of memory and rendering it takes lots of CPU.
How to optimize? One option is to reuse an existing marker image if they are the same. If they are different, using an image sprite helps a lot. You have to set offsets and sizes on the markers to get the sprite to work.
The lazy way to draw lots of markers is to just draw all of them, regardless of the viewport. Most of them will probably not even be visible... Looping over so many DOM elements is slow.
There are some weapons you can use to speed it up: the idle (zooming/panning is done) and zoom_changed google map events. For this, underscore.js helps with functions like defer and with handy collection types. Similarly backbone.js helps maintaining collections and binding them to events and to an API on the server.
Backbone trick. Backbone has no really handy way to remove old items from a collection: you can only add items or reset the collection completely. He did it by adding a version number to the data he fetched from the server. If, after an update, items have a lower version number, they’re old and you can delete them. This way he could refresh a backbone collection without needing to zap it first and rebuild it from scratch.
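His code was javascript, but the version number idea itself is language independent. Here is a rough Python sketch of it, with a plain dict standing in for the backbone collection. The refresh() function and the item layout are invented for the illustration.

```python
def refresh(collection, fetched_items, version):
    """Update the collection in place and return the ids that were dropped."""
    # Add new items and overwrite existing ones, stamping the current version.
    for item in fetched_items:
        collection[item["id"]] = dict(item, version=version)
    # Whatever still carries a lower version was not in this fetch: it is old.
    stale_ids = [
        item_id
        for item_id, item in collection.items()
        if item["version"] < version
    ]
    for item_id in stale_ids:
        del collection[item_id]
    return stale_ids


markers = {}
refresh(markers, [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], version=1)
dropped = refresh(markers, [{"id": 2, "name": "b"}, {"id": 3, "name": "c"}], version=2)
print(dropped, sorted(markers))  # prints: [1] [2, 3]
```

Marker 2 survives the second update untouched, which is the whole point: nothing gets zapped and rebuilt unless it really disappeared.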
Now, what about the webserver? It should be simple, fast and scalable for his use case. He went for tornado. Tornado is quick. And it simplifies asynchronous code.
The database should be JSON friendly, support geo queries and it should have an async Python binding. MongoDB has it. You can do queries like {'loc': {'$within': {'$box': box}, }}. For the Python binding he used pymongo.
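A small sketch of how the box in that query could be built from the visible map area. The viewport_query() function and its parameter names are made up, and I'm assuming that loc is stored as a [longitude, latitude] pair, so that the box is given as bottom-left corner followed by upper-right corner.

```python
def viewport_query(south, west, north, east):
    """Build the filter for all documents whose loc lies inside the viewport."""
    # Assumption: coordinates are stored as [longitude, latitude].
    box = [[west, south], [east, north]]
    return {"loc": {"$within": {"$box": box}}}


# Roughly the area around Groningen, as an example viewport.
query = viewport_query(south=53.1, west=6.4, north=53.3, east=6.7)
print(query)
```

With pymongo, a dict like this would be passed to a collection's find() method; the collection itself is left out here so that the sketch runs without a database.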
He used google’s fusion tables for zoomed out levels: they group markers together so that you only see an indication of the number of items instead of showing all the markers. (Reinout: but somehow he didn’t do this, at least I didn’t see it in his demo. I probably misunderstood something.)

2012-05-11
In the GIS world, everything used to be proprietary (ESRI, Oracle), but there has been a lot of commoditization in recent years. Lots of open source. One of those open source pieces is geodjango.
Geodjango is bundled with Django, but out of the box you miss a couple of pieces. You need to install a couple of extra libraries (gdal, geos, proj) and you need a geospatial database (postgis, oracle, mysql, spatialite).
Ivor guided us through a sample application. Things like setting a gis database instead of a standard one. Adding django.contrib.gis to the INSTALLED_APPS setting. And special geometry fields for points, lines, polygons. Using a specific OSMGeoAdmin for showing a map in the admin interface for those geo fields.
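A hedged sketch of the settings part of such a setup, written as a plain Python fragment. The engine path and the django.contrib.gis app name are the ones geodjango documents for postgis; the database name and the other installed apps are made up for the illustration and are not from Ivor's sample application.

```python
# settings.py fragment (sketch)

DATABASES = {
    "default": {
        # A gis-enabled backend instead of the standard postgresql one.
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        "NAME": "geodemo",  # hypothetical database name
    }
}

INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.gis",  # enables geodjango
]
```

The geometry fields themselves then come from django.contrib.gis.db.models rather than from the regular django.db.models.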
A limitation in geodjango is that it doesn’t give you regular form fields for the geo fields. They work in the admin, but not in regular forms. Luckily django-floppyforms does provide them, so he used floppyforms to get nice forms including a map in his regular web interface. He also created geojson from database content and showed that on the map.
(Note to self: look at proj4js).
Geodjango is well-integrated into Django, but you do need to use the geodjango variants of fields, databases, admins. “You need to prefix the stuff”. You get a lot out of the box, but there’s quite a learning curve. You also need to learn quite a bit of javascript for the user interface.

2012-05-11
According to Armin Ronacher most (Python) web frameworks use a request/response style of handling HTTP. At his company, they’re treating HTTP a little bit differently. (So the talk is first about some HTTP-usage-in-Python observations and second a look at the alternative way they’re treating HTTP).
Note: my brother has a clearer summary, btw.
The most low-level way is to write directly to the response. Write the response headers, write the actual response content. In Python, you often have some response object; often some sort of middleware gets the chance to do something to the response on the way out.
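As an illustration of that low-level style, here is a tiny WSGI sketch. WSGI is my choice for the example, not necessarily what Armin showed; the application, the add_header() middleware and the X-Demo header are all made up.

```python
def application(environ, start_response):
    """Write the response headers first, then the actual response content."""
    body = b"Hello from the low level\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def add_header(app, name, value):
    """Middleware sketch: touch the response headers on the way out."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Pass everything through, plus one extra header.
            return start_response(status, headers + [(name, value)], exc_info)
        return app(environ, patched_start_response)
    return wrapped


wrapped_application = add_header(application, "X-Demo", "pygrunn")
```

To try it in a browser, wrapped_application can be served with the standard library's wsgiref.simple_server.make_server().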
The nice things we like about HTTP:
A basic question you should ask yourself is why does my application look like HTTP? A common Django application gets a request, does something and returns a response. Works well. But why is it set up that way? Why is it so focused on HTTP? (It is logical that it focuses on this use case, but you can still ask the question).
HTTP can be a stream or buffered. Sending stuff from the server to your browser is a stream. But often an incoming request in a Python web framework is first buffered internally (memory or disk). In the same way a request is a bit of a strange mix:
On the client (like your web browser), the only thing you can do to an incoming request, once it has started, is to close the connection. You cannot interact anymore once you have received your first incoming byte.
A consequence of the buffering and the way HTTP is handled is that you can have problems accepting data. How big a file should you accept? How big an incoming form? Buffer it in memory? Or on disk? And how do you handle streaming? You might be streaming in one part of your code, but how do the other layers handle it?
Internally in his company, he’s trying to handle HTTP differently. There’s no direct HTTP contact in most of the code base. Everything that eventually ends up in the HTTP layer is implemented as some sort of “type object”. This allowed them to be really flexible in the HTTP layer. Support for different input/output formats. Easier to test. Documentation can be auto-generated. Lots of common errors can be caught early.
A basic rule is to be strict in what you send, but generous in what you receive. But web Python code is often generous by “just” accepting a lot without much checking. That might be a security risk. In Armin’s system, you know what type should be coming in, so you can do proper checking.
How does this deal with the big-upload problem? Incoming streaming data? Well, because of the type system, you actually know which types need a streaming API. This makes it easy to set up your API correctly. You can even selectively use a different protocol than HTTP.
2012-05-11
django-crispy-forms is a Django application, but Miguel thinks the talk will also help you with designing other systems and applications.
Django has three ways to render forms: as_ul, as_p, as_table. They do the same, but render themselves in a different way. Common questions from people new to Django are “what about divs?” and “how to reorder fields?”. For the last one, you need to switch the order of the form fields in your python form code. There are some other tricks like overriding the self.fields.keyOrder attribute. If you have many fields, regular list methods like remove(), pop() and insert() might help you.
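The list juggling could look something like this. A plain Python list stands in for the key order here, so the sketch runs without Django; the field names are invented.

```python
# Stand-in for the ordered list of field names of a form.
key_order = ["username", "email", "password", "newsletter"]

# Drop a field from the ordering altogether.
key_order.remove("newsletter")

# Move "email" to the front: pop it from its current spot, insert it at 0.
key_order.insert(0, key_order.pop(key_order.index("email")))

print(key_order)  # prints: ['email', 'username', 'password']
```

It works, but it also shows why these tricks feel a bit fragile: the order lives in code that is easy to forget when a field gets added later.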
But... ModelForms for the admin interface are different again: the abovementioned form tricks don’t work there. And those tricks sound a bit dirty anyway. So: how to customize the form output?
You can do a lot by customizing the Django form in the template, but most of it will be hardcoded and hand-tweaked that way. And if you customize a form, you’ll often forget form.errors and form.non_field_errors, for instance.
Django-crispy-forms was formerly known as django-uni-forms. It was created by pydanny in 2008; Miguel is now one of the main committers.
Crispy forms work on forms, modelforms and on formsets. A |crispy filter in the template renders your form as handy divs with better classes and IDs which helps a lot if you want to customize your form with css. Neat!
Crispy also has a {% crispy %} template tag. You can pass it a “form helper”. A form helper is a global helper: it is decoupled from forms. So it normally works with any form. There are some attributes on the form helper that you can set, like method (get/post), form_id, things like that.
You can customize such a helper specifically for one form and set the order of the fields, for instance. There are so many things you might want to customize; crispy supports/allows most/all of them with a Layout class and other layout classes like Div. You can get really deep into the machinery by letting crispy inject Django template code directly into the template...
For the ultimate in customizability, you can write your own layout class that renders itself in whatever way you want. Layouts can be nested, so there is a lot of flexibility here. You can also customize crispy’s own templates that it uses for fields and forms using the regular Django overwrite-a-template mechanism.
Handy: crispy forms has specific support for twitter bootstrap. This helps you get a nice looking form.

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.