Every three months, there’s a python usergroup meeting somewhere in the Netherlands. I regularly make a summary of it and this is one of those times. This was also one of the times where there were foreigners present (2 from France, 1 from Australia), so we stuck to English. So if you’re in the Netherlands and you don’t understand Dutch: do come over next time, we’ll probably accommodate you!
See also Maurits’ summary.
Phatch is a photo batch processor. Stani’s background is architecture. The core of architecture is to unite function, beauty and construction. For his software he tries to unite function, sexy and stable.
There are a lot of photo and image applications, but not many that handle batch jobs: doing specific tasks on many images at the same time. For this you tie “action lists” together. Actions can be adding a shadow, rounding corners, etcetera.
In the newest version there are even pre-defined common sets of actions (“web 2.0 icon”). The actions also work on the metadata (exif and iptc tags), so you can have an action that changes all tags “artists” into “artist”. There’s also a server side option for running it without a GUI.
The advantage is that you don’t have to know anything about image manipulation: you just select basic actions that are easy to understand.
Also an option: turn a set of actions into a “droplet”: an application on your desktop that you can dump images on. Handy for common tasks you do a lot like resizing images and putting your copyright notice on it before posting it to your website.
Stani showed a comparison matrix with imagemagick and gimp. Phatch beats both in the number of possible actions as it uses a lot of open source tools as backend. Gimp and imagemagick for instance do nothing with EXIF tags: phatch does. Phatch of course itself uses imagemagick for the tasks that imagemagick does best. There are also blender file plugins for rendering images onto shapes (computer monitor, can of beer, cd booklet). And you can execute external commands that do an action on the image, too, for even more possibilities. And certain actions can be written in python.
Phatch is built with cross-platform libraries and tools, so it runs on linux, osx and windows. For both osx and windows he needs people to package it up for him as he himself only uses linux. 3 to 6 people are currently working on phatch. One of the things they’re focusing on is to make phatch workable as a library, so a bit of an elaborate PIL library. One of the aims is django integration.
Something to look at: phraymd for extracting metadata from collections of photos and managing them based on that metadata (mentioned as someone asked about such functionality).
MonetDB is developed by CWI (“where python originated”). It is a platform for scientific research on databases. Open source, BSD licensed. It is definitely not mainstream.
The aim is Very Big Databases. One of the things they use it for is a 6 TB database (“skyserver”) on star data. Complex data models, so they go beyond SQL. For certain applications it outperforms other DBs by a factor of 100. Very specific applications.
In the core, MonetDB is a column based storage database (a so-called BAT, binary association table: everything is stored separately). The kernel is flexible and easy to extend. The kernel is small and basic. SQL is one of the available front ends. Xquery is one of the other front ends: one of the use cases is to store XML in MonetDB.
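To make the column-based idea a bit more concrete, here is a minimal pure-python sketch. It is not MonetDB's implementation: the sample data and the names (rows, to_columns) are made up by me for illustration. Every column is kept as its own list of (oid, value) pairs, loosely in the spirit of a BAT.

```python
# Toy illustration of row-wise versus column-wise storage.
# Hypothetical data and names; not MonetDB's actual data structures.

rows = [
    {"name": "Sirius", "magnitude": -1.46},
    {"name": "Vega", "magnitude": 0.03},
    {"name": "Polaris", "magnitude": 1.98},
]


def to_columns(rows):
    """Split row dicts into one list of (oid, value) pairs per column."""
    columns = {}
    for oid, row in enumerate(rows):
        for key, value in row.items():
            columns.setdefault(key, []).append((oid, value))
    return columns


columns = to_columns(rows)

# A query on one attribute only has to read that single column.
bright_oids = [oid for oid, mag in columns["magnitude"] if mag < 1.0]

# Other columns are consulted only for the rows that matched.
names = dict(columns["name"])
bright_names = [names[oid] for oid in bright_oids]
print(bright_names)
```

The point of the layout: a query that filters on one attribute never touches the data of the other attributes, which presumably helps a lot on very wide, very big tables.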
Gijs helped with (or rewrote completely) the python API to MonetDB. Nicely pythonic, of course. python.mapi talks MonetDB’s wire protocol. It is text, not binary, so it is kinda slow. The python binding has some basic functionality like setting up the connection, parsing the response, etcetera. python.sql is a mapping between python.mapi and the standard python database API.
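For reference, this is roughly what code against the standard python database API (DB API 2.0, PEP 249) looks like. I use the standard library's sqlite3 module as a stand-in here, because it follows the same API. The connect arguments of the MonetDB binding are not in my notes, so I won't guess them.

```python
import sqlite3

# sqlite3 is only a stand-in: any DB API 2.0 module offers the same
# connect() / cursor() / execute() / fetchall() pattern.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

cursor.execute("CREATE TABLE talks (speaker TEXT, subject TEXT)")
# The "?" placeholder style is what sqlite3 uses; other DB API modules
# may use a different paramstyle.
cursor.executemany(
    "INSERT INTO talks VALUES (?, ?)",
    [("Stani", "Phatch"), ("Gijs", "MonetDB"), ("Jasper", "PyCuda")],
)
connection.commit()

cursor.execute("SELECT subject FROM talks WHERE speaker = ?", ("Gijs",))
result = cursor.fetchall()
print(result)

connection.close()
```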
They looked at python 3.x compatibility. One of the big problems is that python’s DB API 2.0 isn’t compatible yet with python 3.x due to missing/changed exceptions. Another big python 3 change is the string/bytes/unicode change. In the end they went for two separate classes in the one remaining problem spot they couldn’t solve: one that handles python2 and one for python3. That was the cleanest way. There are two ugly “if sys.version_info[0] == 3” tests in their current code to allow different behaviour for python 2 and 3: so such a hack was only needed twice.
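A minimal sketch of that pattern, as I understood it. The class names (Py2Encoder, Py3Encoder) and the to_wire method are hypothetical: the real MonetDB code wasn't shown in enough detail for me to copy, so this only illustrates picking one of two classes with a single version test.

```python
import sys


class Py2Encoder(object):
    """Hypothetical helper for python 2, where str is already bytes."""

    def to_wire(self, text):
        if isinstance(text, str):
            return text
        return text.encode("utf-8")


class Py3Encoder(object):
    """Hypothetical helper for python 3, where str is unicode text."""

    def to_wire(self, text):
        if isinstance(text, bytes):
            return text
        return text.encode("utf-8")


# The one version test: the rest of the code just uses Encoder.
if sys.version_info[0] == 3:
    Encoder = Py3Encoder
else:
    Encoder = Py2Encoder

wire_data = Encoder().to_wire(u"SELECT 1;")
print(repr(wire_data))
```

The version check happens once, at import time, and the string/bytes differences stay locked up inside the two classes.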
Regarding the python2/python3 “print versus print()” difference: just use the logging module!
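A small sketch of what that looks like in practice. The logger name and the add function are made up for the example; the logging calls themselves are plain standard library.

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("usergroup.example")  # hypothetical logger name


def add(a, b):
    result = a + b
    # Same call syntax in python 2 and python 3, unlike print.
    logger.debug("add(%r, %r) -> %r", a, b, result)
    return result


total = add(2, 3)
```

Extra advantage: you can switch the output off again by raising the log level instead of hunting for stray print statements.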
Some promotion he wanted to make: http://pythonic.nl. Non-profit amateur web hosting. A focus on python/django. A small group, maximum flexibility. They need users.
Question: “how does it relate to, for instance, google’s couch db approach?” The big difference is that MonetDB is quite monolithic and it has a schema. CouchDB is distributed and, in the core, it does not have a schema.
Comment he made: if you don’t know what you want/need to do, it is hard to write tests. My counter comment: then how do you know what to program? Use doctests to fuel/steer your programming.
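What I mean by that, as a small sketch: you write down the behaviour you want as examples in the docstring first and then program until they pass. The slugify function is just an example I made up, it has nothing to do with the talk.

```python
import doctest


def slugify(title):
    """Turn a title into a lowercase, dash-separated slug.

    >>> slugify("Python Usergroup Meeting")
    'python-usergroup-meeting'
    >>> slugify("  Grok   1.0  ")
    'grok-1.0'
    """
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Runs the examples in the docstrings above as tests.
    doctest.testmod()
```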
PyCuda is a technique to do calculations on your GPU: your video card! The newest video card GPUs have a ridiculous amount of performance (in GFlops) for the price you have to pay for it. Something like 1000 GFlops for 300 euros (GPU) versus 120 for 500 euros (regular CPU). (If I copied his numbers correctly)
There are a couple of libraries, but each one (amongst them Cuda) only supports specific types of cards and not all cards are supported. It is very useful for parallel algorithms that are calculation-intensive but relatively simple.
Cuda uses a C++ dialect that’s horrible. There’s a python wrapper for it (PyCuda) that’s much nicer. Jasper hopes that OpenCL (an open standard) wins out over Cuda in the future.
See Jasper’s slides.
Subversion uses one central location to store all versions. Mercurial/git/bzr can use multiple repositories, so you can also work locally. And they’re generally much better at merging.
Git has github.com, mercurial has bitbucket.org: websites for social networking combined with code sharing with those systems. (He didn’t mention launchpad.net, which is something similar for bzr, weirdly).
Everybody got to mention useful libraries.
greenlets: cool microthreads
zc.buildout: very powerful tool for managing your project
pysvn: python svn bindings
virtualenv: isolated environment for your project
paste: the wsgi library, an easy way of baking your own web program
werkzeug: a nice alternative to paste
repoze.who and repoze.what: easy and lightweight authentication/authorization middleware for wsgi
fabric: for running things remotely on a server
grok: cool web framework
scrapy: web scrapers
lamson: scraping for email
pep8: pep8 code style checking
Last week there was a grok sprint in Cologne. Grok is a web framework, based on zope. Grok takes the pain out of using zope. If you used zope(3), you’ll have seen the layers of configuration files and glue code. Grok uses “martian” which does most of the registration automatically without needing all those horrible layers.
When you “grok” a concept, you really understand it. Grok-the-code understands your code that well that it is able to configure your code. No more writing 10 files to accomplish a single task.
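To give a feeling for what “registration without configuration files” means, here is a toy sketch in plain python. Loud warning: this is not martian's or grok's API. The View base class, the registry dict and the scan function are all hypothetical, and the real thing does a lot more. The idea is only that the code finds your classes by convention instead of you listing them in a separate file.

```python
# Toy auto-registration; hypothetical names, not grok/martian code.

registry = {}


class View(object):
    """Hypothetical base class: subclasses get registered automatically."""

    def render(self):
        raise NotImplementedError


def scan(base):
    """Register every direct subclass of base under its lowercased name."""
    for cls in base.__subclasses__():
        registry[cls.__name__.lower()] = cls


class Index(View):
    def render(self):
        return "index page"


class Contact(View):
    def render(self):
        return "contact page"


# One call replaces a hand-written list of registrations.
scan(View)
print(sorted(registry))
print(registry["index"]().render())
```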
Grok 1.0 will be out this week (or next week). Stable enterprise level stuff, several companies use it in production without a problem. There’s paste behind the scenes for the wsgi work.
Grok isn’t simple-as-such as it uses all the power of zope, but it does make it easy to use.
One comment by Sylvain: you don’t have to use zope-the-webserver. You can also use zope’s component architecture (through grok) for GUI applications.
The component architecture solves the basic problem of making your code base pluggable and malleable and configurable.
Tim, like everyone, makes mistakes. So he needs to do debugging. And he noticed that he did manage to find the bugs, but he always forgot about the bug-finding-approach afterwards. So he never improved. That’s why he’s giving this talk to get input.
One of the first approaches is to use print. But in a wsgi environment, output to stdout isn’t accepted. So it doesn’t work there.
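One way around that, as far as I know: the WSGI spec gives every request an error stream in environ["wsgi.errors"], and you can write your debug output there instead of to stdout. A minimal sketch, where the application itself and the /debug-me path are made up for the example:

```python
import io
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Tiny made-up WSGI app that writes debug output to wsgi.errors."""
    # Instead of print: the server decides where this stream ends up,
    # usually in its error log.
    environ["wsgi.errors"].write(
        "debug: path is %r\n" % environ.get("PATH_INFO")
    )
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


# Call the app by hand with a fake request, so no server is needed.
error_stream = io.StringIO()
environ = {"wsgi.errors": error_stream, "PATH_INFO": "/debug-me"}
setup_testing_defaults(environ)  # fills in the remaining required keys
body = application(environ, lambda status, headers: None)
print(error_stream.getvalue())
```

The logging module works too, of course, as long as its handler doesn't write to stdout.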
(i)pdb is great. But pretty often, a problem only occurs on the 99th time you go through a loop. Yuck.
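A common trick for that 99th-iteration problem is to make the breakpoint conditional. A small sketch with a made-up process function and a hypothetical break_at argument:

```python
import pdb


def process(items, break_at=None):
    """Sum the items; optionally drop into pdb at one specific iteration."""
    total = 0
    for index, item in enumerate(items):
        if break_at is not None and index == break_at:
            # Only stops on the iteration you care about.
            pdb.set_trace()
        total += item
    return total


# Normal run: no debugger. Pass break_at=98 to stop on the 99th pass.
result = process(range(100))
print(result)
```

Inside pdb itself, the break command also accepts a condition after the line number, which gives the same effect without editing the code.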
Input from the room:
For django, you have the debug toolbar. It shows a nice bar on top of your application that you can open for common debug info like the recent logging messages. Nice in some instances, but it fails for json output as it isn’t html…
Ian Bicking’s wsgi post mortem debugger. It fires only when there’s an exception. It drops you directly on the offending line of code at the exact moment it goes wrong.
Other things to investigate: twill, zc.testbrowser (simulates browser requests in your tests), selenium.
Werkzeug apparently also has such a post mortem debugger. Apparently something similar can also be enabled in django. Basically, the exception page is replaced by werkzeug’s page that allows variable inspection and in-browser python interaction.
A couple of people vehemently mentioned eclipse’s “pydev” plugin as being the ultimate solution for all debugging woes :-)
I’ll just point to my brother’s full version on his weblog to save me some typing :-)
Jasper Spaans mentioned lamson, which is supposed to solve all these issues in a better way than python’s built-in email module.
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.