On 2009-12-17 I gave a talk at the Dutch python user group (“PUN”) about managing dependencies. Here’s my own summary :-)
If you have to maintain a python project of a reasonable scale, it often pays off to split it up into separate packages. And even if you keep it all in one package, you’re bound to use external packages. The python standard library is BIG, but you want some of those tasty goodies from pypi, the PYthon Package Index.
Pypi is great. I can just tell people to do

  easy_install zest.releaser

and they’ll have a properly installed zest.releaser with a couple of scripts all ready to use. No extra steps needed. Pure luxury.
Who reads documentation? Who reads installation instructions? Well, I’m civilized and well-behaved, so I do. But I also expect my packages to be civilized and well-behaved. So if I install something and Something requires Something Else, I expect that to be installed automatically. Debian/Ubuntu have spoiled us too much with proper package management, right?
This means specifying dependencies: “I need those thingies, too”. Well, to make your package available on pypi and to make it easy_installable, you need to properly package it up.
Dependencies: fill in your setup.py’s install_requires section:
If you now install your package, you’ll automatically get setuptools (which you normally already have) and PasteScript in this example.
One thing to look out for: sometimes/often your tests need extra infrastructure. Require those as test dependencies instead of regular dependencies. Due to an unfortunate setuptools implementation detail, the tests_require option is not available to other tools, so for instance the zope world now uses a [test] extra for the same thing. Following some discussion I’ve recently switched to doing something like this in my setup.py files:
tests_require = ['zope.testing']
In case you heard of both setuptools and distribute: distribute fully replaces setuptools. Just use distribute. Setuptools is “maintained” (for various historically dubious values of “maintain”) by one person (whom we all should applaud for creating the darn thing in the first place, btw!).
Distribute is maintained by a lot of people, so bugs actually get fixed. And
“bugs” meaning “it doesn’t break with subversion 1.6 because patches that fix
it don’t get applied for half a year”. Be sure to use the latest versions of
distribute (and buildout, if applicable).
Automatically installing dependencies, where does that happen? Well, normally in your system python. But what about conflicts? Well, those aren’t solved. Everything is “just” getting installed. You’re right in not wanting that. You’re right that it can wreak havoc beyond your wildest dreams.
So you need to isolate it a bit. Have an isolated environment for every project. And you need version handling, so that it stays reliable which versions of the packages you install. You’ve got three options:
Virtualenv (for isolation) in combination with Pip (for “defining fixed sets of requirements and reliably reproducing a set of packages” as mentioned on the pip website). I’ve used virtualenv a lot, but not Pip. I trust it to work just fine (though they don’t guarantee it works on windows: in any case it won’t support windows binary eggs, so it isn’t a feasible option for some projects).
Buildout for both isolation and package version handling. Reproducible results are a key word. I’ve just moved four complex web sites with tens of dependencies from one server to another by just doing a checkout of a buildout, running it, and everything was ready. That’s the reliability you get with a proper buildout! I’ll concentrate on buildout for the rest of this article.
You can prepare, for instance, an apache directory with just the eggs and source tarballs that you want in just the right versions and point virtualenv/buildout and pip/easy_install at just that directory instead of at the entire PYPI website. Yep, that’s quite a neat way to get reliability, too. Great if it fits your deployment style. To be honest, I haven’t really tried it out as I suspected it was too much work and buildout worked fine for me…
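As a sketch of that approach (the server URL is made up): buildout can be pointed at such a prepared directory with its find-links and index options, so nothing gets fetched from pypi itself:

```ini
[buildout]
# Look for eggs/tarballs in our own prepared apache directory:
find-links = http://ourserver.example.org/packages/
# And use it as the package index instead of pypi:
index = http://ourserver.example.org/packages/
```

For plain easy_install, the -f option points at such a directory in a similar way.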
Buildout uses setuptools to recursively install all needed dependencies. Those dependencies are installed only within the controlled buildout environment (which means: the directory your buildout config file is placed in).
In your buildout, you can (and in my opinion must) have a [versions] part that lists some (or all) of your packages’ versions. A highly recommended helper here is the buildout.dumppickedversions extension. This prints a list of packages for which buildout automatically picked a version. Which means that you didn’t explicitly pin a version. Which means that it can be different next time, when you install your customer’s live server! Here’s an example snippet from a buildout config:
[buildout]
versions = versions
extensions = buildout.dumppickedversions

[versions]
# Unpin the one package you're developing in this buildout
yourpackage =
# Specific versions you want to have
lxml = 2.2.2
# Dumppickedversions reported these
z3c.recipe.compattest = 0.12
zc.buildout = 1.4.3
Such a buildout ought to keep you free from most surprises. Surprises are good on your birthday, but not on the day your customer’s website goes live.
If you’re doing a project that depends on a big framework like zope, grok or plone, your versions list can have 30-50 packages. That’s not fun. Luckily those projects have so-called KGS configs ready for you to use: a “Known Good Set”. That’s a ready-made [versions] list you can include (well, you do an extends = http://link/to/that/kgs.cfg in your buildout config). Google for it.
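In a buildout config that looks like this (the URL is a placeholder; grok, zope and plone each publish their own KGS config):

```ini
[buildout]
# Inherit the ready-made [versions] list from the KGS:
extends = http://link/to/that/kgs.cfg
```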
The bigger your project, the more dependencies. But watch out, keep a close eye on those dependencies! They may automatically pull in extra dependencies, which is ok. But you can implicitly start to depend on them (by importing functionality) without explicitly stating that they’re a dependency.
zope.component depends on zope.interface. So if you require zope.component, you’ll get zope.interface, too. So all your from zope.interface import Interface imports will work fine. But if the diligent zope folks manage to cut some dependencies, you might lose that implicit zope.interface that “always used to work just fine”.
So in the fine python tradition of “explicit is better than implicit”: what you import from must be an explicit dependency. So an import from zest.releaser means an install_requires=['zest.releaser'] in your setup.py. That’s an order!
While you’re at it: take a good look at the version numbers. If you know you
need at least version 1.3, be a good boy or girl and make that
install_requires=['zest.releaser >= 1.3']. That way you get an explicit
error if someone tries to make do with a 1.2.
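You can see such a version pin in action with pkg_resources (the library that ships with setuptools/distribute); this little check is purely an illustration:

```python
from pkg_resources import Requirement

# The same requirement string you'd put in install_requires:
requirement = Requirement.parse('zest.releaser >= 1.3')

# pkg_resources can tell you whether a version satisfies the requirement:
print('1.2' in requirement)  # False: too old
print('1.3' in requirement)  # True: exactly the minimum
```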
If you import it, a tool can discover you import it. And it can warn you if you don’t depend on it in your setup.py. Tadaah: z3c.dependencychecker, a handy script that tells you which dependencies you miss and which ones are unneeded. See my introductory blog entry about it.
And if your list of dependencies is growing a bit long, perhaps you should split out some functionality into separate, more well-delimited libraries. Or you should look at your dependencies again: perhaps you use too many details of your framework where a few higher-level functions will do just as well. Perhaps not, but at least look at it. Explicitness helps!
With your dependencies well in hand, you of course want to make sure they’re solid. So you want to test them. I’ve made an addition to z3c.recipe.compattest (see my introductory blog entry about my addition) to make that extra super-de-luxe easy-as-pie. Just include something like the following in your buildout config:
[compattest]
recipe = z3c.recipe.compattest
include-dependencies = my.package
This will create a
bin/compattest script that tests your package and all
its dependencies. Pure luxury, I tell you.
My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.