When writing code for scientific computation, you often need to improve its performance, i.e. to make it run faster and/or more efficiently in terms of memory and other resource usage. I will summarize several techniques to speed up Python code in this post. Initially, this post will be a simple list with none to very short comments, but I hope I will extend it and add more personal thoughts/comments.
- Use good specialized libraries: for example, if you are going to do a lot of linear algebra computations, use NumPy/SciPy. Always try to vectorize your code when using NumPy (the same advice for Matlab programmers).
- Use PyPy (with JIT enabled) instead of CPython to run your code. Note that as of now, PyPy can only run pure Python code, that means you cannot use C-based libraries (including NumPy) in your code. The advantages of PyPy are that it does not require you to rewrite your code, and it can significantly speed up your code (by order of hundreds to thousands times compared to CPython). It is even faster than vectorized NumPy code when there are a lot of temporary variables in the computation. This is really amazing and should be your first choice if possible.
- If the execution of your code incurs a lot of re-computations of a function with the same inputs (which results in the same outputs), generally you should cache the results. The standard practice is to store the outputs in an array and modify the function so that it will look up in that array to find out whether the results have already been there. The Python library joblib will do the caching for you, automagically. It is really easy to use and does not require you to modify your function. You will appreciate it if you need to write code for dynamic programming.
- Use a tool/library to translate your code to binary code (often via C code). There are a few of them, e.g. Cython, Weave, Pyrex, etc. My favorite is Cython. Basically, you only need to provide static typing for variables in your code (especially loop variables) and Cython will take care of the rest. It does not require much effort and the result is great.
- Write your code in C/C++ and call it from Python. This approach is usually time-consuming and should only be the last resort. In most cases, you can achieve performance near that of carefully written C/C++ code with one or a combination of the approaches above.