Pixelise: XML document publishing (Zeth)

Tags: europython2009, europython, django

A talk by Zeth about publishing XML documents with Python, Django and the Berkeley XML database. The project is called Pixelise (named after a cat, just so you know).

Some people say “you can’t publish XML without breaking it up into an SQL database”. He says: just store it in a native XML database and be done with it. (Zeth gets the top prize for explaining XML in 8 slides by using a cat and a budgie. Zeth, put those slides online!)

Pixelise depends on Django (you just need to be able to get it running) and the Berkeley XML database (just install it). A Django management command, manage.py startxml demoproject, creates a Django application for you with some initial settings and an empty XML database.

With XML files, the way you set up your XML database indexes is vitally important for the speed of the application. He’s added a utility method that simply adds an index for every element and attribute, as that’s a sane starting point for most people.
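To give an idea of what "every element and attribute" amounts to, here is a minimal sketch that collects the names you would want indexed. It uses Python's standard xml.etree.ElementTree rather than Pixelise or the Berkeley XML database bindings, and both the index_targets helper and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document (not from the talk).
SAMPLE = """<pets>
  <cat name="Pixelise" colour="black"><toy>mouse</toy></cat>
  <budgie name="Joey"><toy>bell</toy></budgie>
</pets>"""


def index_targets(xml_text):
    """Return (element names, attribute names) found anywhere in the document.

    Hypothetical helper: it only lists index candidates, it does not
    create any index in a real database.
    """
    elements, attributes = set(), set()
    for node in ET.fromstring(xml_text).iter():
        elements.add(node.tag)
        attributes.update(node.attrib)  # adds the attribute names
    return sorted(elements), sorted(attributes)


elements, attributes = index_targets(SAMPLE)
print("elements:", elements)
print("attributes:", attributes)
```

A real setup would hand each of these names to the database's own indexing call. Indexing everything costs space and write time, so it is a starting point to trim later rather than an end state.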

A separate script adds an XML file to your XML database. And that’s the last thing you have to do with the XML file itself. It is now in your database and you can access it as a Python object, and query it with XPath expressions, for instance.
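As a rough illustration of what querying a document with XPath looks like from Python, the sketch below uses the limited XPath subset in the standard library's xml.etree.ElementTree. This is a stand-in, not Pixelise's own API: there the queries would run against the database. The sample data is invented.

```python
import xml.etree.ElementTree as ET

# Made-up sample document (not from the talk).
SAMPLE = """<pets>
  <cat name="Pixelise" colour="black"><toy>mouse</toy></cat>
  <budgie name="Joey"><toy>bell</toy></budgie>
</pets>"""

root = ET.fromstring(SAMPLE)

# Direct children of the root that are <cat> elements.
cat_names = [cat.get("name") for cat in root.findall("./cat")]

# Every <toy> element at any depth, in document order.
toys = [toy.text for toy in root.findall(".//toy")]

# Any element whose name attribute equals "Joey".
named_joey = [node.tag for node in root.findall(".//*[@name='Joey']")]

print(cat_names, toys, named_joey)
```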

When you want to, you can treat XML as a stream. Normally that means you almost have to mimic a SAX-style parser, which isn’t nice. He added something he calls “XML processors” that ought to make that easier.
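The notes don't show what Pixelise's processors look like, so the following is only a guess at the general idea: register one small function per element name and let a streaming parser call it, instead of writing a full SAX handler class. It uses the standard library's ElementTree.iterparse. The run_processors function and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample document (not from the talk).
SAMPLE = b"""<pets>
  <cat name="Pixelise"><toy>mouse</toy></cat>
  <budgie name="Joey"><toy>bell</toy></budgie>
</pets>"""


def run_processors(stream, processors):
    """Stream through the XML and call processors[tag](element) for each
    element that has a registered function, once the element is complete."""
    results = []
    for _event, element in ET.iterparse(stream, events=("end",)):
        handler = processors.get(element.tag)
        if handler is not None:
            results.append(handler(element))
            element.clear()  # free the handled element's content
    return results


# Hypothetical processors: one plain function per element name.
processors = {
    "cat": lambda el: "cat: " + el.get("name"),
    "budgie": lambda el: "budgie: " + el.get("name"),
}

results = run_processors(io.BytesIO(SAMPLE), processors)
print(results)
```

Each function sees a complete element with its children, so there is no start/end state to track by hand as there is with raw SAX callbacks.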

According to Zeth, this is a very handy way to deal with XML, and it lets you use the XML files conveniently in Django. See an example site.