Buildout/setuptools speed improvement
Sometimes buildout can seem to hang almost forever, even though you’ve set a timeout. It happens when installing the project itself.
Buildout effectively calls python setup.py develop on your project. Setuptools then first calls os.walk() and builds a complete list of files. Only then does it read your MANIFEST.in and some setup.py settings to determine which files to exclude and include. If you have lots of files inside your directory, this might take a long time.
The key thing here is that it first builds a complete list. This includes that
BUILDOUT_DIR/var/something directory with 245,934 files. And the
BUILDOUT_DIR/node_modules/ with your full npm-downloaded stack of
JavaScript with 10,000 files. And the BUILDOUT_DIR/var/project/ symlink to
some slow Windows share.
Ouch. A buildout that normally takes half a minute can take two hours this way…
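To find out which directories are responsible before deciding what to skip, a rough sketch like the following counts files per top-level directory. The function name count_files_per_top_level_dir is made up for this illustration; it is not part of buildout or setuptools.

```python
import os
from collections import Counter


def count_files_per_top_level_dir(root):
    """Return a Counter mapping each top-level entry of root to its file count.

    Hypothetical diagnostic helper. os.walk() does not descend into
    symlinked directories by default, so a symlinked share is not
    measured here.
    """
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Path of the current directory relative to root, e.g. "var/something".
        rel = os.path.relpath(dirpath, root)
        # Files directly in root are grouped under "."; everything else is
        # grouped under its first path component.
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts
```

Calling count_files_per_top_level_dir('.').most_common(10) from the buildout directory should list the ten heaviest top-level directories, which are the likely candidates for skipping.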
The setuptools bug reports:
https://bitbucket.org/pypa/setuptools/issues/249/have-a-way-to-ingore-specific-dirs-when
https://bitbucket.org/pypa/setuptools/issues/450/egg_info-command-is-very-slow-if-there-are
Intermediate solution: place this monkeypatch_setuptools.py next to your
setup.py:
import os

TO_OMIT = ['var', '.git', 'parts',
           'bower_components',
           'node_modules', 'eggs',
           'bin', 'develop-eggs']
orig_os_walk = os.walk


def patched_os_walk(path, *args, **kwargs):
    for (dirpath, dirnames, filenames) in orig_os_walk(path, *args, **kwargs):
        if '.git' in dirnames:
            # We're probably in our own root directory.
            print("MONKEY PATCH: omitting a few directories like var/...")
            dirnames[:] = list(set(dirnames) - set(TO_OMIT))
        yield (dirpath, dirnames, filenames)


os.walk = patched_os_walk
# ^^^ This only modifies os.walk for the duration of calling setup.py
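A side note on the dirnames[:] = ... line: the pruning relies on documented os.walk behaviour. When walking top-down (the default), os.walk recurses only into the names still present in the dirnames list it handed out, so that list has to be changed in place. The sketch below contrasts rebinding the name with slice assignment. The helper names and the tiny directory layout are invented for the demonstration, and TO_OMIT is shortened.

```python
import os
import tempfile

TO_OMIT = ['var', 'node_modules']  # shortened list, for illustration only


def walk_rebinding(root):
    """Broken variant: rebinding the name leaves os.walk's own list untouched."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(root):
        # This creates a new list; os.walk still recurses into everything.
        dirnames = [d for d in dirnames if d not in TO_OMIT]
        visited.append(os.path.relpath(dirpath, root))
    return sorted(visited)


def walk_pruning(root):
    """Working variant: slice assignment edits the list os.walk recurses into."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in TO_OMIT]
        visited.append(os.path.relpath(dirpath, root))
    return sorted(visited)


def make_demo_tree(root):
    """Create a tiny stand-in for a buildout directory (hypothetical layout)."""
    for parts in (('src', 'pkg'), ('var', 'log'), ('node_modules', 'lib')):
        os.makedirs(os.path.join(root, *parts))
```

Run against a tree created by make_demo_tree() in a tempfile.TemporaryDirectory(), the first variant still visits var/log and node_modules/lib, while the second visits only the root, src and src/pkg. The same reasoning means the patch has no effect if a caller passes topdown=False.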
And then import the monkeypatch right at the top of your setup.py:
from setuptools import setup
import monkeypatch_setuptools

version = '0.3.dev0'

setup(
    name='your-package',
    version=version,
    ...
It works quite well! Some notes:
- Adjust the TO_OMIT list to your needs and local conventions.
- There’s a check, if '.git' in dirnames, in the monkey patch. I use that to detect whether os.walk is currently in a directory that contains .git, which normally means our own base directory. So it’ll only strip out directories in there and not somewhere else. Just a small safety valve. You’ll have to adjust it if you use Mercurial or something else.
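For a Mercurial checkout, one possible adjustment is to check for several marker directories instead of only .git. This is a sketch, not a tested drop-in replacement: ROOT_MARKERS is a name introduced here, and it assumes the repository root contains a .git or .hg directory.

```python
import os

TO_OMIT = {'var', '.git', '.hg', 'parts', 'bower_components',
           'node_modules', 'eggs', 'bin', 'develop-eggs'}
# Assumption: one of these directories marks the project's base directory.
ROOT_MARKERS = {'.git', '.hg'}

orig_os_walk = os.walk


def patched_os_walk(path, *args, **kwargs):
    for dirpath, dirnames, filenames in orig_os_walk(path, *args, **kwargs):
        if ROOT_MARKERS.intersection(dirnames):
            # Probably the project root: prune in place so os.walk skips
            # the omitted directories. The comprehension keeps the
            # original ordering of the remaining names.
            dirnames[:] = [d for d in dirnames if d not in TO_OMIT]
        yield dirpath, dirnames, filenames


os.walk = patched_os_walk
```

As in the original patch, directories named var or bin further down the tree are left alone, because the pruning only happens where a marker directory is found.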