Managing dependencies

Tags: python, buildout, pun

On 2009-12-17 I gave a talk at the Dutch python user group (“PUN”) about managing dependencies. Here’s my own summary :-)

Setuptools, eggs, dependencies

If you have to maintain a python project of a reasonable scale, it often pays off to split it up into separate packages. And even if you keep it all in one package, you’re bound to use external packages. The python standard library is BIG, but you want some of those tasty goodies from pypi, the PYthon Package Index.

Pypi is great. I can just tell people to do easy_install zest.releaser and they’ll have a properly installed zest.releaser with a couple of scripts all ready to use. No extra steps needed. Pure luxury.

Who reads documentation? Who reads installation instructions? Well, I’m civilized and well-behaved, so I do. But I also expect my packages to be civilized and well-behaved. So if I install Something and Something requires Something Else, I expect Something Else to be installed automatically. Debian/Ubuntu have spoiled us too much with proper package management, right?

This means specifying dependencies: “I need those thingies, too”. Well, to make your package available on pypi and to make it easy_installable, you need to properly package it up.

  • Use pastescript (and for instance ZopeSkel) to create your package’s structure. You get the basic structure right that way, including a nice setup.py.

  • Dependencies: fill in that setup.py’s dependency section:

    ...
    install_requires=[
        'setuptools',
        'PasteScript',
    ],
    ...
    

If you now install your package, you’ll automatically get setuptools (which you normally already have) and PasteScript in this example.

One thing to look out for: your tests often need extra packages. Require those as test dependencies instead of regular dependencies. Due to an unfortunate setuptools implementation detail, the “normal” tests_require option is not available to other tools, so for instance the zope world now uses an extras_require [test] for the same thing. Following some discussion I’ve recently switched to doing something like this in my setup.py files:

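from setuptools import setup

# Packages that are only needed for running the tests.
tests_require = ['zope.testing',
                 'something.else']

setup(
    # ...
    install_requires=['another.package'],
    tests_require=tests_require,
    # The same list again as a "test" extra, so other tools can find it.
    extras_require={'test': tests_require},
    # ...
    )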

In case you heard of both setuptools and distribute: distribute fully replaces setuptools. Just use distribute. Setuptools is “maintained” (for various historically dubious values of “maintain”) by one person (whom all should applaud for creating the darn thing in the first place, btw!). Distribute is maintained by a lot of people, so bugs actually get fixed. And “bugs actually get fixed” means things like “it doesn’t stay broken with subversion 1.6 because patches that fix it don’t get applied for half a year”. Be sure to use the latest versions of distribute (and buildout, if applicable).

Managing your dependencies: isolation, version handling

Automatically installing dependencies, where does that happen? Well, normally in your system python. But what about conflicts? Well, those aren’t solved. Everything is “just” getting installed. You’re right in not wanting that. You’re right that it can wreak havoc beyond your wildest dreams.

So you need to isolate it a bit. Have an isolated environment for every project. And you need version handling, so that you reliably know which versions of the packages get installed. You’ve got three options:

  • Virtualenv (for isolation) in combination with Pip (for “defining fixed sets of requirements and reliably reproducing a set of packages” as mentioned on the pip website). I’ve used virtualenv a lot, but not Pip. I trust it to work just fine (though they don’t guarantee it works on windows: in any case it won’t support windows binary eggs, so it isn’t a feasible option for some projects).

  • Buildout for both isolation and package version handling. “Reproducible results” is the key phrase. I’ve just moved four complex web sites with tens of dependencies from one server to another by just doing a checkout of a buildout and running it. Done. That’s the reliability you get with a proper buildout! I’ll concentrate on buildout for the rest of this article.

  • You can prepare, for instance, an apache directory with just the eggs and source tarballs that you want in just the right versions and point virtualenv/buildout and pip/easy_install at just that directory instead of at the entire pypi website. Yep, that’s quite a neat way to get reliability, too. Great if it fits your deployment style. To be honest, I haven’t really tried it out as I suspected it was too much work and buildout worked fine for me…

Buildout uses setuptools to recursively install all needed dependencies. Those dependencies are installed only within the controlled buildout environment (which means: the directory your buildout config file is placed in).

In your buildout, you can (and in my opinion must) have a [versions] part that lists the versions of some (or all) of your packages. A highly recommended helper here is the buildout.dumppickedversions extension. This prints a list of packages for which buildout automatically picked a version. Which means that you didn’t explicitly pick a version. Which means that it can be different next time, when you install on your customer’s live server! Here’s an example snippet from a buildout config:

[buildout]
...
versions = versions
extensions = buildout.dumppickedversions
...

[versions]
# Unpin the one package you're developing in this buildout
your.development.product =
# Specific versions you want to have
lxml = 2.2.2
# Dumppickedversions reported these
z3c.recipe.compattest = 0.12
zc.buildout = 1.4.3
...

Such a buildout ought to keep you free from most surprises. Surprises are good on your birthday, but not on the day your customer’s website goes live.
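
To make the idea of “picked” versions a bit more concrete, here is a minimal Python sketch. It is not how buildout.dumppickedversions works internally. It only parses a [versions] part with the standard library’s configparser and reports which of the packages you need have no pin. The sample config and the unpinned() helper are made up for illustration.

```python
import configparser

# A made-up buildout config, modelled on the snippet above.
SAMPLE_CONFIG = """
[buildout]
versions = versions

[versions]
your.development.product =
lxml = 2.2.2
zc.buildout = 1.4.3
"""


def unpinned(config_text, needed_packages):
    """Return the needed packages that have no version pinned in [versions]."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep package names exactly as written
    parser.read_string(config_text)
    pins = dict(parser["versions"]) if parser.has_section("versions") else {}
    # An empty value (like the development package above) counts as unpinned.
    return sorted(
        name for name in needed_packages if not pins.get(name, "").strip()
    )


print(unpinned(SAMPLE_CONFIG, ["lxml", "zc.buildout", "zope.testing"]))
```

Anything such a check reports is a package whose version may differ the next time you run the buildout.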

If you’re doing a project that depends on a big framework like zope, grok or plone, your versions list can have 30-50 packages. That’s not fun. Luckily those projects have so-called KGS configs ready for you to use. “Known Good Set”. That’s a ready-made [versions] list you can include (well, you do an extends = http://link/to/that/kgs.cfg). Google for it.
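
Extending a KGS and adding your own pins boils down to a merge in which your own config has the last word. A small Python sketch of that idea, assuming buildout’s usual behaviour that the extending file overrides what it extends. The effective_versions() helper and all version numbers below are hypothetical.

```python
def effective_versions(known_good_set, local_pins):
    """Combine a KGS with local pins; local pins win on conflict."""
    merged = dict(known_good_set)
    merged.update(local_pins)
    return merged


# Hypothetical version numbers, for illustration only.
kgs = {"zope.interface": "3.5.2", "zope.component": "3.7.1"}
local = {"zope.component": "3.8.0", "lxml": "2.2.2"}

for name, version in sorted(effective_versions(kgs, local).items()):
    print(name, "=", version)
```

So you only need to list the packages the KGS doesn’t know about, plus the few where you deliberately deviate from it.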

Manage your dependencies: which ones?

The bigger your project, the more dependencies. But watch out, keep a close eye on those dependencies! They may automatically pull in extra dependencies, which is ok. But you can implicitly start to depend on them (by importing functionality) without explicitly stating that they’re a dependency.

zope.component depends on zope.interface. So if you require zope.component, you’ll get zope.interface, too. So all your from zope.interface import Interface imports will work fine. But if the diligent zope folks manage to cut some dependencies, you might lose that implicit zope.interface that “always used to work just fine”.

So in the fine python tradition of “explicit is better than implicit”: what you import from must be an explicit dependency. So an import zest.releaser means an install_requires=['zest.releaser'] in your setup.py. That’s an order!

While you’re at it: take a good look at the version numbers. If you know you need at least version 1.3, be a good boy or girl and make that install_requires=['zest.releaser >= 1.3']. That way you get an explicit error if someone tries to make do with a 1.2.
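
What a minimum version requirement checks can be sketched in a few lines of Python. This is a simplification and not the comparison setuptools really uses: it only understands plain dotted numbers like 1.3 or 1.10.2. The two helper functions are made up for this example.

```python
def parse_version(text):
    """Turn '1.3.2' into (1, 3, 2). Only handles plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that 1.3 and 1.3.0 compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(satisfies_minimum("1.2", "1.3"))
print(satisfies_minimum("1.10", "1.3"))
```

Comparing numbers instead of strings matters: as a string, “1.10” sorts before “1.3”, but as a version it is newer.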

If you import it, a tool can discover you import it. And it can warn you if you don’t depend on it in your setup.py. Tadaah: z3c.dependencychecker, a handy script that tells you which dependencies you miss and which ones are unneeded. See my introductory blog entry about it.
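
The basic trick is simple enough to sketch in Python. This is not z3c.dependencychecker’s actual implementation, just the idea behind it: collect what the code imports and compare it with what setup.py declares. It assumes, naively, that a package’s name on pypi equals its import name and that nothing comes from the standard library. The helper names are made up.

```python
import ast


def imported_modules(source):
    """Return the dotted names of all absolute imports in a piece of source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module)
    return names


def missing_requirements(source, install_requires):
    """List imported modules not covered by any declared requirement."""
    def covered(module):
        return any(
            module == req or module.startswith(req + ".")
            for req in install_requires
        )
    return sorted(m for m in imported_modules(source) if not covered(m))


SOURCE = """
from zope.interface import Interface
from zope.component import getUtility
import zest.releaser
"""

print(missing_requirements(SOURCE, ["zope.component", "zest.releaser"]))
```

Here zope.interface is imported but not declared, which is exactly the implicit dependency from the zope.component example above.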

And if your list of dependencies is growing a bit long, perhaps you should split out some functionality into separate, more well-delimited libraries. Or you should look at your dependencies again: perhaps you use too many details of your framework where a few higher-level functions would do just as well. Perhaps not, but at least look at it. Explicitness helps!

Test your dependencies

With your dependencies well in hand, you of course want to make sure they’re solid. So you want to test them. I’ve made an addition to z3c.recipe.compattest (see my introductory blog entry about my addition) to make that extra super-de-luxe easy-as-pie. Just include something like the following in your buildout config:

[compattest]
recipe = z3c.recipe.compattest
include-dependencies = my.package

This will create a bin/compattest script that tests your package and all its dependencies. Pure luxury, I tell you.

 

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
