Note beforehand: Bruno Renie had several useful things to say at previous DjangoCons. I’d like to draw specific attention to his lightning talk last year about settings. He later wrote a more elaborate version on his own blog. It is something I’m gonna take a deeper look at in the coming weeks as I’ll have to fix something in many of our internal sites :-)
He works in a diverse team and in a quite complex infrastructure, so they need to make their infrastructure visible: errors, events, metrics.
Errors: easy: use Sentry.
Events: basically, a log call. Error reporting with Sentry, as mentioned above, is great for developers, but you often need info on the requests that don’t fail, too.
rsyslog/syslog-ng can work with logstash-forwarder (“lumberjack”) to forward everything to Logstash. In Python, you can use the logging module’s SysLogHandler to send everything to syslog.
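A minimal sketch of the Python side, assuming a syslog daemon that listens on UDP port 514 on the local machine. The function name, the logger name and the format string are made up for illustration; only `logging.handlers.SysLogHandler` itself comes from the standard library.

```python
import logging
import logging.handlers


def build_syslog_logger(name, address=("127.0.0.1", 514)):
    """Return a logger that forwards its records to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP by default when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical application name; the address is an assumption, too.
    logger = build_syslog_logger("myapp")
    logger.info("user signed up")
```

From there on, rsyslog or syslog-ng takes care of shipping the lines onwards, so the application itself doesn't need to know about Logstash.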
You have to use structured logging. A formatted string is nice for the logfile, but not if you want to do more with it. In Python you can look at structlog. With the JSON from structlog, sent via Logstash to Elasticsearch, you can then make nice dashboards in Kibana. He showed a demo.
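To show what "structured" means here, a small sketch using only the standard library. It is not structlog and not what was shown in the talk: structlog does this more completely. The `JsonFormatter` class, the `log_event` helper and the `fields` attribute are all names invented for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object instead of a string."""

    def format(self, record):
        payload = {
            "event": record.getMessage(),
            "level": record.levelname,
            "logger": record.name,
        }
        # Hypothetical convention: extra key/value pairs travel in record.fields.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def log_event(logger, event, **fields):
    """Log an event name plus arbitrary key/value context."""
    logger.info(event, extra={"fields": fields})


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("shop")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    log_event(logger, "order_paid", amount=12, currency="EUR")
```

Because every line is a JSON object with separate keys, something like Elasticsearch can index `amount` and `currency` as fields rather than having to parse them out of a sentence.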
With everything set up, you’ll soon get extra feature requests like “how much income did we generate in the last week” and “who signed up today”.
Metrics: time series data, collected continuously at regular intervals. The big player is the Graphite ecosystem, which consists of a couple of components. There is the “carbon line protocol”, a simple way to emit data to Graphite that even works from the command line. Graphite handles the eventual rendering.
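The line protocol is simple enough to sketch in a few lines: one metric per line, as "path value timestamp", sent over TCP. The example below assumes a carbon daemon on localhost with the default plaintext port 2003. The function names and the metric path are made up.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of the plaintext protocol: 'path value timestamp'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003):
    """Send a single data point to carbon over TCP (host/port are assumptions)."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name.
    send_metric("shop.signups", 3)
```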
There’s a “statsd” daemon that serves as an in-memory buffer. It sits between the emitters and Graphite. It aggregates and summarizes the incoming data and flushes the results to Graphite at regular intervals, again via the same carbon line protocol.
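Emitting to statsd is cheap because it is just a small UDP packet of the form "name:value|type", for instance "c" for a counter and "ms" for a timer. A sketch, assuming statsd listens on its usual UDP port 8125 on the local machine. The helper names and metric names are invented.

```python
import socket


def format_statsd(name, value, metric_type):
    """Build a statsd packet: 'name:value|type' (e.g. 'c' counter, 'ms' timer)."""
    return f"{name}:{value}|{metric_type}"


def send_statsd(packet, host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP means the app doesn't block if statsd is down."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(packet.encode("ascii"), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    send_statsd(format_statsd("shop.signups", 1, "c"))           # count one signup
    send_statsd(format_statsd("shop.response_time", 320, "ms"))  # 320 ms request
```

The application only says "this happened" or "this took so long"; adding things up per interval is left to statsd.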
Graphite has an easy-to-use API (JSON, PNG) and there are countless dashboard apps for it.
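As an illustration of that API: Graphite's render endpoint takes a `target`, a time range and a `format` as query parameters. The sketch below only builds the URL. The hostname and metric name are placeholders, and `render_url` is a helper made up for this example.

```python
from urllib.parse import urlencode


def render_url(base_url, target, since="-1h", fmt="json"):
    """Build a URL for Graphite's render endpoint (fmt can be 'json' or 'png')."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base_url.rstrip('/')}/render?{query}"


if __name__ == "__main__":
    # Hypothetical Graphite host and metric.
    print(render_url("http://graphite.example.com", "shop.signups"))
```

Fetching that URL with `format=json` gives the data points, with `format=png` a ready-made graph.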
What about alerts? For this they use Riemann, a “metrics hub/proxy”. You can set up thresholds in there, for instance to get warnings when the disk starts getting full or the response time is too slow.
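The threshold idea itself, sketched in plain Python. This is explicitly not Riemann's configuration syntax (Riemann has its own config language), just an illustration of mapping an incoming metric to a state. The event layout, the service names and the limits are all invented.

```python
# Hypothetical limits per service: warn first, go critical later.
THRESHOLDS = {
    "disk.usage_percent": {"warning": 80, "critical": 95},
    "http.response_ms": {"warning": 500, "critical": 2000},
}


def check_threshold(event, thresholds=THRESHOLDS):
    """Return 'ok', 'warning' or 'critical' for an event with service/metric keys."""
    limits = thresholds.get(event["service"])
    if limits is None:
        return "ok"  # no threshold configured for this service
    if event["metric"] >= limits["critical"]:
        return "critical"
    if event["metric"] >= limits["warning"]:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(check_threshold({"service": "disk.usage_percent", "metric": 87}))
```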
My name is Reinout van Rees and I work a lot with Python (programming language) and Django (website framework). I live in The Netherlands and I'm happily married to Annie van Rees-Kooiman.