Buildout/setuptools speed improvement

Tags: python, buildout, django

Sometimes buildout can seem to hang almost forever, even though you’ve set a timeout. It happens when installing the project itself.

  • Buildout effectively calls python setup.py develop on your project.
  • Setuptools then first calls os.walk() and builds a complete list of files.
  • Only then does it read your MANIFEST.in and some setup.py settings to determine which files to exclude and include.
  • If you have lots of files inside your directory, this might take a long time.
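
The order of those steps is what hurts. Below is a minimal sketch of the difference between "walk everything, filter afterwards" and "prune while walking". It is not setuptools' actual code: the function names, the demo tree and the simple top-level exclusion rule are all assumptions made up for illustration.

```python
import os
import tempfile


def walk_then_filter(root, excluded):
    """Collect every file first, filter afterwards (the slow order)."""
    visited = 0
    all_files = []
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        for name in filenames:
            full = os.path.join(dirpath, name)
            all_files.append(os.path.relpath(full, root))
    # Only now drop files that live under an excluded top-level directory.
    kept = [f for f in all_files if f.split(os.sep)[0] not in excluded]
    return sorted(kept), visited


def prune_while_walking(root, excluded):
    """Skip excluded top-level directories before descending into them."""
    visited = 0
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        if dirpath == root:
            # In-place edit: os.walk (top-down) will not enter removed dirs.
            dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            full = os.path.join(dirpath, name)
            kept.append(os.path.relpath(full, root))
    return sorted(kept), visited


def make_demo_tree(root):
    """Create a tiny, hypothetical project layout for the comparison."""
    for rel in ["setup.py", "src/pkg/__init__.py", "var/cache/a/1.bin",
                "var/cache/b/2.bin", "node_modules/lib/index.js"]:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_demo_tree(root)
        excluded = {"var", "node_modules"}
        print(walk_then_filter(root, excluded))
        print(prune_while_walking(root, excluded))
```

Both functions return the same file list, but the first one still visits every directory under var/ and node_modules/ before throwing the results away. With a few hundred thousand files or a slow network share in there, that is where the time goes.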

The key thing here is that it first builds a complete list. This includes that BUILDOUT_DIR/var/something directory with 245,934 files. And the BUILDOUT_DIR/node_modules/ with your full npm-downloaded stack of JavaScript with 10,000 files. And the BUILDOUT_DIR/var/project/ symlink to some slow Windows share.

Ouch. A buildout that normally takes half a minute can take two hours this way...

The setuptools bug reports:

Intermediate solution: place this monkeypatch_setuptools.py next to your setup.py:

import os

TO_OMIT = ['var', '.git', 'parts',
           'bower_components',
           'node_modules', 'eggs',
           'bin', 'develop-eggs']

orig_os_walk = os.walk

def patched_os_walk(path, *args, **kwargs):
    for (dirpath, dirnames, filenames) in orig_os_walk(path, *args, **kwargs):
        if '.git' in dirnames:
            # We're probably in our own root directory.
            print("MONKEY PATCH: omitting a few directories like var/...")
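            # Prune in place so os.walk does not descend into these
            # directories; the comprehension keeps the original order.
            dirnames[:] = [d for d in dirnames if d not in TO_OMIT]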
        yield (dirpath, dirnames, filenames)

os.walk = patched_os_walk
# ^^^ This only modifies os.walk for the duration of calling setup.py

And then import the monkeypatch right at the top of your setup.py:

import monkeypatch_setuptools  # Patches os.walk before setup() runs.
from setuptools import setup

version = '0.3.dev0'

setup(
    name='your-package',
    version=version,
    ...

It works quite well! Some notes:

  • Adjust the TO_OMIT to your needs and local conventions.
  • The monkey patch checks if '.git' in dirnames. I use that to detect whether os.walk is currently in a directory that contains .git, which normally means our own base directory. So it’ll only strip out directories there and not somewhere else. Just a small safety valve. You’ll have to adjust it if you use Mercurial or something else.
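
As a rough sketch of that last adjustment, the marker check can be made configurable so it also recognises a Mercurial checkout. The names ROOT_MARKERS and make_pruning_walk are hypothetical, and adding '.hg' to the omit list is my assumption; only the standard os.walk behaviour is relied upon. Note that pruning only works for the default top-down walk.

```python
import os
import tempfile

# Directories to skip; adjust to your own conventions.
TO_OMIT = frozenset(['var', '.git', '.hg', 'parts', 'bower_components',
                     'node_modules', 'eggs', 'bin', 'develop-eggs'])

# Any of these in a directory means "this is probably our project root".
ROOT_MARKERS = frozenset(['.git', '.hg'])


def make_pruning_walk(original_walk, root_markers=ROOT_MARKERS,
                      to_omit=TO_OMIT):
    """Return a wrapper around original_walk that prunes the project root."""

    def pruning_walk(top, *args, **kwargs):
        for dirpath, dirnames, filenames in original_walk(top, *args, **kwargs):
            if root_markers.intersection(dirnames):
                # Only prune where a version control marker is present.
                dirnames[:] = [d for d in dirnames if d not in to_omit]
            yield dirpath, dirnames, filenames

    return pruning_walk


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for rel in [".hg", "var/big", "src/var"]:
            os.makedirs(os.path.join(root, *rel.split("/")))
        walk = make_pruning_walk(os.walk)
        for dirpath, dirnames, filenames in walk(root):
            print(os.path.relpath(dirpath, root))
```

In monkeypatch_setuptools.py you would then write os.walk = make_pruning_walk(os.walk). A directory named var deeper in the tree, like src/var in the demo, is left alone because no marker sits next to it.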

About me

My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.
