Pycon NL: Past, present and future of python parallelism - Pavel Filonov

Tags: pycon, python

(One of my summaries of the one-day Pycon NL conference in Utrecht, NL).

In his job, he’s waiting a lot. Waiting for his machine learning stuff to finish. Waiting for results to be calculated. In c++ you’d start enabling parallel processing to get your entire processor to work. How about python?

He showed a simple function to calculate prime numbers. A CPU bound problem. Everything he tests is run on an AWS t3.xlarge machine with 4 CPUs.
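The function from the talk isn't in my notes, so here is a minimal sketch of what such a CPU bound prime calculation could look like. The names `is_prime` and `count_primes` and the trial-division approach are my own assumptions, not the speaker's code.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Only odd divisors up to the integer square root need checking.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


def count_primes(start: int, stop: int) -> int:
    """Count the primes in the half-open range [start, stop)."""
    return sum(1 for n in range(start, stop) if is_prime(n))
```

There is no I/O and no waiting in here, only arithmetic in a python loop, which is what makes it a useful test case for parallelism.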

The present

Just simple single core. Pure non-parallel python. 1430 ms to calculate.
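How the timings were taken wasn't shown in detail. A simple way to get a comparable single-core baseline is something like the sketch below. The helper name `time_call` and the limit of 200,000 are assumptions for illustration, so don't expect it to reproduce the 1430 ms.

```python
import math
import time


def count_primes(limit: int) -> int:
    """Count primes below limit with plain, single-threaded trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


if __name__ == "__main__":
    primes, elapsed_ms = time_call(count_primes, 200_000)
    print(f"{primes} primes in {elapsed_ms:.0f} ms")
```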

You can use python extensions: you code it in c++ and use some library for splitting it up in multiple tasks. 25 ms, 60x faster.

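You can use python’s own multithreading, using ThreadPoolExecutor. 1670 ms: wow, it is a little bit slower than the single-core run. Which is to be expected as the problem is CPU bound: because of the GIL, only one thread can execute python code at a time, so the threads just add overhead.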
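A sketch of the threaded variant, assuming the work is split into one chunk of numbers per thread. The helpers `split_range` and `count_primes_threaded` are hypothetical names, the talk's actual script may have divided the work differently.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def count_primes(start: int, stop: int) -> int:
    """Count primes in [start, stop) using trial division."""
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(start: int, stop: int, parts: int) -> list[tuple[int, int]]:
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def count_primes_threaded(limit: int, workers: int = 4) -> int:
    """Count primes below limit, one chunk per thread."""
    chunks = split_range(0, limit, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each thread handles one chunk; with the GIL they still run one at a time.
        return sum(pool.map(lambda chunk: count_primes(*chunk), chunks))
```

The result is the same as the single-threaded version, only the scheduling differs. On a regular (with-GIL) python you shouldn't expect this to be faster.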

Similar approach, with multiprocessing. ProcessPoolExecutor with three worker processes. 777 ms, about twice as fast as single core. Many python data processing libraries support multiprocessing. It is a good approach.
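The multiprocessing version can look almost identical to the threaded one. Below is a sketch with three workers, as in the talk. The function names and the chunking are my assumptions. Note that the worker function has to live at module level so it can be pickled and sent to the worker processes.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def count_primes_in_chunk(chunk: tuple[int, int]) -> int:
    """Count primes in the half-open range given as a (start, stop) tuple."""
    start, stop = chunk
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(start: int, stop: int, parts: int) -> list[tuple[int, int]]:
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + step + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def count_primes_parallel(limit: int, workers: int = 3) -> int:
    """Count primes below limit, one chunk per worker process."""
    chunks = split_range(0, limit, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each process has its own interpreter and its own GIL.
        return sum(pool.map(count_primes_in_chunk, chunks))


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing this module.
    print(count_primes_parallel(200_000))
```

Equal-sized chunks aren't equal amounts of work here, as testing bigger numbers takes longer. That is one possible reason why three workers don't give a full 3x speedup.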

The future

You can look at multiple interpreters. PEP 734. And there’s a PEP to have a GIL (“global interpreter lock”) per interpreter. If you start up multiple interpreters, you can get a speedup. (He showed some code with “subinterpreters”, an experimental feature in python 3.13). 907 ms, so faster than single core, though not as fast as multiprocessing. But note that the subinterpreter work is in an early stage of development, lots of improvements are possible. Especially passing information along between the subinterpreters is tricky.

Optional GIL. Partially, the future is already here as python 3.13 supports this. Apparently osx/windows builds of 3.13 can contain a python3.13t executable in addition to the regular python one, the ‘t’ variant runs without the GIL.
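If you're not sure which variant you're running, you can ask the interpreter itself. A small sketch: the helper name `gil_status` is mine, and `sys._is_gil_enabled()` only exists from python 3.13 onwards, so the code falls back to "GIL enabled" on older versions.

```python
import sys
import sysconfig


def gil_status() -> str:
    """Describe whether this interpreter is running with the GIL."""
    # Py_GIL_DISABLED is 1 for free-threaded builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; older versions always have the GIL.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    if not free_threaded_build:
        return "regular build, GIL enabled"
    if is_gil_enabled:
        return "free-threaded build, GIL re-enabled at runtime"
    return "free-threaded build, GIL disabled"


print(gil_status())
```

The middle case can happen because a free-threaded build can still turn the GIL back on at runtime, for instance when an extension module doesn't declare support for running without it.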

He tried the multithreading script again, now without GIL. 941 ms. Better than the with-GIL multithreading, but not spectacular.

He showed why the non-GIL option is not standard: it actually makes regular single-threaded code slower: 1900 ms, longer than the original 1670 ms.

 

Reinout van Rees

My name is Reinout van Rees and I program in Python, I live in the Netherlands, I cycle recumbent bikes and I have a model railway.
