Thread profiling in Python - Amjith Ramanujam

Tags: django, djangocon, python

Amjith Ramanujam recently wrote a thread profiler in Python and it was ridiculously simple. He works for New Relic, which is all about performance and especially performance measuring.

Python comes with batteries included, so it also includes a C profiler (cProfile) that works pretty well. But it doesn’t work nicely for Django because the output is so huge. A GUI like RunSnakeRun makes it more manageable.

An additional problem with cProfile: it has about 100% overhead, so you can’t run it in production (which they need).
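To get a feel for the built-in profiler, here is a minimal sketch that profiles one function with cProfile and uses pstats to cut the report down to the five most expensive entries. The function `work` is a made-up stand-in for real application code, not something from the talk.

```python
import cProfile
import io
import pstats


def work():
    """Hypothetical workload, standing in for real application code."""
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Collect the report in a string instead of printing straight to stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)

# Sort by cumulative time and keep only the first five entries,
# which keeps the output readable.
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Limiting the report like this helps with the size of the output, but it does nothing about the overhead: every function call is still traced.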

You can do more targeted profiling, for instance in Django. A web framework normally processes every request in the same way, and you can use that during profiling.

There are two important stages: interrupt and inquire.

Interrupt

A statistical profiler looks at how often a function is called and by whom. For this it needs to interrupt the regular process. You could set an OS-level signal to call your profiler every x milliseconds so that it can do something. The signal approach only works on Linux, btw. Another way is to create a Python background thread that wakes up every x milliseconds. The thread approach is cross-platform and mod_wsgi compatible, but it is less accurate for CPU tasks and it cannot interrupt C extensions.
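The background-thread variant could look roughly like the sketch below. The class name `Sampler`, the `callback` argument and the 10 ms default interval are my own assumptions for illustration; the talk did not show the actual New Relic code.

```python
import threading
import time


class Sampler(threading.Thread):
    """Hypothetical sampler thread: calls `callback` every `interval` seconds."""

    def __init__(self, callback, interval=0.01):
        # A daemon thread does not keep the process alive on shutdown.
        super().__init__(daemon=True)
        self.callback = callback
        self.interval = interval
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() works as an interruptible sleep: it returns False
        # on timeout (take a sample) and True once stop() has been called.
        while not self._stop_event.wait(self.interval):
            self.callback()

    def stop(self):
        self._stop_event.set()
        self.join()


if __name__ == "__main__":
    ticks = []
    sampler = Sampler(lambda: ticks.append(time.time()))
    sampler.start()
    time.sleep(0.1)
    sampler.stop()
    print(len(ticks), "wake-ups")
```

The sampler thread needs the interpreter lock to run, which is one reason this approach is less accurate for CPU-heavy code and cannot interrupt a C extension that holds on to that lock.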

Inquire

You can use sys._current_frames() to get the current frame (the top of the “stack trace”) of every running thread. From there you can extract the filename, line number and so on of the most interesting frames. He’s building a “call tree” structure, mapping how often a function is called, preserving the parent/child relation between functions. The tricky part was visualizing it :-) In the end they did it with d3.js.
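A minimal sketch of the inquire step, assuming a simple tree of counters. The names `CallNode`, `stack_of` and `record_sample` are hypothetical and not taken from the talk; only sys._current_frames() and the frame attributes are real Python.

```python
import sys


class CallNode:
    """One function in the call tree: a sample count plus child nodes."""

    def __init__(self, key):
        self.key = key          # (filename, first line number, function name)
        self.count = 0
        self.children = {}


def stack_of(frame):
    """Walk from the innermost frame outwards, return outermost-first keys."""
    stack = []
    while frame is not None:
        code = frame.f_code
        stack.append((code.co_filename, code.co_firstlineno, code.co_name))
        frame = frame.f_back
    stack.reverse()
    return stack


def record_sample(root, skip_thread_id=None):
    """Add one sample per running thread to the call tree under `root`."""
    # sys._current_frames() maps thread id -> topmost frame of that thread.
    for thread_id, frame in sys._current_frames().items():
        if thread_id == skip_thread_id:
            continue  # typically the profiler's own thread
        node = root
        for key in stack_of(frame):
            child = node.children.get(key)
            if child is None:
                child = node.children[key] = CallNode(key)
            child.count += 1
            node = child
```

Calling `record_sample(root, skip_thread_id=threading.get_ident())` from a background sampler thread every few milliseconds would fill the tree. Functions that show up in many samples get high counts, and the nesting keeps the parent/child relation intact.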

There’s some 3% overhead in the thread approach, so that’s really good. They can switch it on for a type of request and it’ll profile the next 100 requests of that type.

 

Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
