Should we only use Python tools together with Django, or can we use the best of both worlds? For instance, can we use Ruby scripts and Perl tools? Background: the speaker uses all of this for http://setjam.com, an online TV guide.
Release management. They have weekly iterations: a hard pace. They go from devel to staging (where it sits for a week) to production. And everything is in git branches. Every new feature gets its own branch during development. Production and staging also have their own branch. Releasing new code is as easy as “git pull”.
Capistrano is a Ruby tool for automating deployment. Capistrano puts each release in its own, independent directory. It is mature and very complete. And… you have transactions and rollbacks for your deployment!
Of course, there’s data you need to share between the various releases (database, uploaded images, etc). For that, capistrano uses a shared/ directory and a bunch of symlinks.
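To make the directory layout concrete, here is a minimal Python sketch of the idea (this is not Capistrano's actual code). The directory names `releases`, `shared` and `current` follow Capistrano's usual layout; the function names and the `media` item are hypothetical and only serve as illustration.

```python
import os
import tempfile
from pathlib import Path


def point_current_at(base, release):
    """Repoint the 'current' symlink at the given release directory."""
    # Create the new link under a temporary name, then rename it over
    # 'current' so the switch happens in a single step.
    tmp_link = base / "current.tmp"
    tmp_link.symlink_to(release, target_is_directory=True)
    os.replace(tmp_link, base / "current")


def create_release(base, release_name, shared_items):
    """Create releases/<release_name>, link shared data in, make it current."""
    shared = base / "shared"
    release = base / "releases" / release_name
    release.mkdir(parents=True)
    for item in shared_items:
        (shared / item).mkdir(parents=True, exist_ok=True)
        # Every release reaches the same shared data through a symlink.
        (release / item).symlink_to(shared / item, target_is_directory=True)
    point_current_at(base, release)
    return release


def demo():
    """Deploy two releases in a temporary directory, then roll back."""
    base = Path(tempfile.mkdtemp())
    first = create_release(base, "release-1", ["media"])
    second = create_release(base, "release-2", ["media"])
    # A file uploaded under the first release is visible in the second.
    (first / "media" / "upload.txt").write_text("shared")
    seen_in_second = (second / "media" / "upload.txt").read_text()
    # In this sketch a rollback is nothing more than repointing the link.
    point_current_at(base, first)
    return seen_in_second, os.readlink(base / "current")
```

Because the old release directories stay around untouched, going back to a previous release does not require rebuilding anything.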
Maciej made capistrano-offroad, a package of Capistrano utilities. One thing it does is reset Capistrano’s Rails-oriented defaults. It also contains Django and supervisord recipes. He wants and needs to improve it, though.
Capistrano allows you to specify dependencies, so you can say that you need PIL to be installed. There are certain dependencies that he doesn’t want to handle at the OS level, as several sites might need different versions. In the end, he went with regular Makefiles to compile upstream packages. Some of those come directly from version control systems and some have custom patches applied. The Makefile uses timestamp files for dependencies, and finished tarballs are stored on S3. A drawback of Makefiles is that there’s quite a bit of magic going on behind the scenes, so you need to understand what’s happening.
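The timestamp-file trick is easier to see outside of Make. The following Python sketch mimics it under stated assumptions: a build step is rerun only when its stamp file is missing or older than one of its inputs. The file names and function names are hypothetical; the talk's actual Makefiles were not shown in these notes.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(stamp, sources):
    """True if the stamp file is missing or older than any source file."""
    if not stamp.exists():
        return True
    stamp_mtime = stamp.stat().st_mtime
    return any(source.stat().st_mtime > stamp_mtime for source in sources)


def build_if_needed(stamp, sources, build):
    """Run build() only when needed, then refresh the stamp file."""
    if not needs_rebuild(stamp, sources):
        return False
    build()
    stamp.touch()
    return True


def demo():
    """Show a first build, a skipped build, and a rebuild after a change."""
    workdir = Path(tempfile.mkdtemp())
    patch = workdir / "custom.patch"  # hypothetical input file
    patch.write_text("hypothetical patch")
    stamp = workdir / ".built-stamp"
    builds = []

    def build():
        builds.append("built")

    results = [build_if_needed(stamp, [patch], build)]
    # Make the source older than the stamp: nothing to do.
    os.utime(patch, (1, 1))
    results.append(build_if_needed(stamp, [patch], build))
    # Make the source newer than the stamp: the step runs again.
    newer = stamp.stat().st_mtime + 10
    os.utime(patch, (newer, newer))
    results.append(build_if_needed(stamp, [patch], build))
    return results, len(builds)
```

Make does this comparison implicitly for every target, which is part of the "magic" mentioned above.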
Process management: supervisord. Supervisord (built in Python) is the best tool for handling processes. Extremely reliable. It is easy to use locally. It manages foreground processes, which is more robust than handling background processes. And… it is easy to configure with a .ini file.
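As an illustration of how small such a configuration can be, here is a Python sketch that renders one `[program:x]` section with the standard library's `configparser`. The option names (`command`, `directory`, `autostart`, `autorestart`, `redirect_stderr`, `stdout_logfile`) are regular supervisord program options; the helper name and whatever program name, command and paths you pass in are assumptions, not taken from the talk.

```python
import configparser
import io


def render_program_section(name, command, directory, logfile):
    """Return the text of a supervisord [program:<name>] section."""
    # interpolation=None: leave any %(...)s expansions to supervisord itself.
    config = configparser.ConfigParser(interpolation=None)
    config[f"program:{name}"] = {
        # The command must stay in the foreground; supervisord does the
        # daemonizing and restarting.
        "command": command,
        "directory": directory,
        "autostart": "true",
        "autorestart": "true",
        "redirect_stderr": "true",
        "stdout_logfile": logfile,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Generating the file from code like this can help when every server should run the same set of processes, but a hand-written .ini file works just as well.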
Scaling in the cloud. Keep the number of central servers (for instance a DNS server) down to a minimum. Symmetrical processes: configure the other servers to do mostly the same thing, as that makes it easier to add extra servers. And queue jobs. django-sqs (SQS is Amazon’s Simple Queue Service) makes it easy to queue jobs.
Logfiles are written to a central MySQL database and phplogcon is used to browse and filter them. A mail is also sent with a summary of the log messages, for instance a list of the top 20 tracebacks.
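A "top 20 tracebacks" summary could be computed along these lines. This is a sketch, assuming each log message is available as a plain string and that tracebacks are grouped by their final line (the exception type and message); how the real summary groups them was not mentioned. The function names are hypothetical.

```python
from collections import Counter

TRACEBACK_MARKER = "Traceback (most recent call last):"


def top_tracebacks(messages, limit=20):
    """Return the most frequent traceback endings as (error, count) pairs."""
    counts = Counter()
    for message in messages:
        if TRACEBACK_MARKER not in message:
            continue  # ordinary log line, not a traceback
        # The last line of a Python traceback names the exception.
        last_line = message.strip().splitlines()[-1]
        counts[last_line] += 1
    return counts.most_common(limit)


def format_summary(ranked):
    """Render the ranked pairs as plain text, for instance for a mail body."""
    return "\n".join(f"{count:>5}  {error}" for error, count in ranked)
```

Grouping on the last line is deliberately coarse: the same exception raised from two different places ends up in one bucket.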
Future work:
Monitoring and status. Nagios or so?
Rake instead of Make.
Polish and gemify capistrano-offroad.
More info and slides will be at http://www.setjam.com/blog
My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.