You might have heard that PyPy is an implementation of Python in Python, and that it can run Python programs faster than CPython. Then you might ask:
- How the hell is that possible? CPython runs Python programs slower than CPython runs a Python interpreter which runs Python programs?
- What if I use CPython to run PyPy to run PyPy … (n times) … to run a Python program? Will it be infinitely faster than CPython?
In this post, I will try, to the best of my knowledge, to explain PyPy and the ideas behind it, and answer those questions.
So what is PyPy?
PyPy is an implementation of the Python language in a restricted version of the Python language called RPython. RPython is a subset of the full Python language and it is statically typed, in the sense that the type of every variable can be inferred before the program runs. The second property is very important to the success of PyPy. There are several ways to use PyPy:
- Because RPython is a subset of Python, the PyPy interpreter can be run on CPython. So you can use CPython to execute the PyPy interpreter, which in turn runs your Python programs.
- Because RPython is statically typed, code written in it (in particular the PyPy interpreter) can be compiled into machine code or into bytecode for a virtual machine. PyPy includes translators that can translate the PyPy interpreter into C code, CLI code (.NET), Java code, etc. As a result, you can have a PyPy-based Python interpreter running directly on your machine, on the CLI, or on the JVM (Java Virtual Machine). The C-compiled PyPy interpreter is functionally equivalent to CPython, although it is less efficient. You can then use these compiled interpreters to run your Python programs.
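To make the "statically typed subset" idea concrete, here is a minimal sketch in plain Python. The function names are made up for illustration, and the real rules are defined by PyPy's translation toolchain, not by this example. The first function keeps every variable at a single type, which is the style a static type inferencer can handle. The second is perfectly valid Python, but one variable is an int on one path and a str on the other, which is the kind of code a statically typed subset has to reject.

```python
def sum_of_squares(n):
    # Every variable here holds an int for its whole lifetime,
    # so its type can be inferred without running the program.
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total


def mixed_types(flag):
    # Valid in full Python, but `value` is an int on one branch and a str
    # on the other, so no single static type can be assigned to it.
    if flag:
        value = 42
    else:
        value = "forty-two"
    return value


print(sum_of_squares(10))
print(type(mixed_types(True)).__name__, type(mixed_types(False)).__name__)
```

Both functions run fine on any ordinary Python interpreter; the difference only matters to a tool that wants to compile the code ahead of time.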
How can PyPy run Python programs faster than CPython?
To be precise, PyPy alone cannot run your Python programs faster than CPython. This is clearest with the first method above (running PyPy on top of CPython), which is many times slower than CPython. Even with the second method, the compiled PyPy interpreter is less efficient than CPython, so it is still slower.
However, because of how PyPy has been developed, it is not difficult to implement a Just-In-Time (JIT) compiler for PyPy. When the JIT compiler is used with PyPy, it compiles the frequently executed parts of your Python code into machine code in memory at run time, then lets your computer run that code directly. For programs that spend most of their time in such hot code, this is typically much faster than having the code interpreted by CPython. In contrast, it is difficult to develop a JIT compiler for CPython.
Therefore, to be precise, PyPy + JIT compiler can run your Python code faster than CPython can interpret your code.
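If you want to see the difference yourself, a small loop-heavy script is enough. The following is only a sketch: `count_primes` is a made-up workload, the limit is arbitrary, and the timings you get will depend on your machine and interpreter versions. Run the same file once with CPython and once with a JIT-enabled PyPy and compare the printed times.

```python
import platform
import time


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately loop-heavy)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    start = time.perf_counter()
    result = count_primes(100_000)
    elapsed = time.perf_counter() - start
    # python_implementation() reports e.g. "CPython" or "PyPy"
    print(platform.python_implementation(), result, f"{elapsed:.3f}s")
```

Tight numeric loops like this one are the kind of code where a JIT tends to help most; programs dominated by I/O or by calls into C extensions may show little difference.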
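To answer the second question above, using the chain “CPython -> PyPy -> PyPy -> … -> your code” will not make your code run fast. Each interpreter in the chain is itself being interpreted by the one before it, so every extra layer adds overhead and the chain gets slower, not faster.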
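A toy sketch can show where the cost of an interpretation layer comes from. The stack machine below is invented for this post and has nothing to do with PyPy's real bytecode; it just counts how many instructions the interpreter loop has to decode and dispatch in order to compute something the host language could do in one expression.

```python
def run(program):
    """Interpret a tiny stack-machine program.

    Returns (result, dispatch_steps), where dispatch_steps counts how many
    instructions the interpreter loop had to decode and execute.
    """
    stack = []
    steps = 0
    for op, arg in program:
        steps += 1  # one decode-and-dispatch per instruction
        if op == "push":
            stack.append(arg)
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return stack.pop(), steps


# Hypothetical instruction set, invented for this sketch: computes 1 + 2 + 3.
program = [("push", 1), ("push", 2), ("add", None), ("push", 3), ("add", None)]

result, steps = run(program)
print(result, steps)
```

Every one of those dispatch steps is itself several operations in the host interpreter. Put another interpreter underneath, and each of those operations is again decoded and dispatched, so the overhead of the layers compounds instead of cancelling out.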
What are the advantages of PyPy?
- Because PyPy implements the Python language in a high-level language, i.e. RPython, instead of a low-level language like C, it is easier to develop the interpreter, add new features, fix bugs, etc. Its code is also less likely to have bugs and is more robust.
- From a single PyPy source tree, we can have many different interpreters on different platforms (machine code, CLI, Java, and probably more). You don’t have to develop a separate Python interpreter for each platform, as is the case with CPython and Jython.
- Because of its JIT compiler, PyPy can run Python programs faster, in some cases close to the performance of the same programs written in C or Java. This is a very big advantage, especially for computationally expensive programs. One caveat: NumPy is currently incompatible with PyPy, although there is a plan to change this.
- You can achieve fast performance without having to change your current code. This is also a big advantage.
I am sure PyPy will be very hot in the future, and it is definitely worth trying now, or at least keeping an eye on.