PyPy advertises itself as "a fast, compliant alternative implementation of the Python language", and it has a slick speedtest site to back up its claims. Speed is great of course, but what's really interesting to me are the details of its implementation. In the process of building a new Python interpreter, the PyPy team has created a powerful generic toolkit for constructing dynamic language interpreters, and as a result the PyPy project comes in two largely-independent halves.
First there is the PyPy interpreter itself, which is written entirely in Python. To be more specific, it's written in a restricted subset of the language called RPython, which keeps many of the niceties of the full Python language while enabling efficient ahead-of-time compilation. This allows for greater ease and flexibility of development than implementing the interpreter directly in C, as is done with the standard interpreter available from python.org.
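To give a feel for the restrictions, here's a hypothetical sketch of my own (not taken from the PyPy sources). Valid RPython is ordinary Python source, so it runs under a normal interpreter too, but every variable must have a single type that the translator can infer ahead of time:

```python
def fib(n):
    # Fine as RPython: 'a' and 'b' stay ints throughout, so the
    # annotator can infer their types and emit efficient C code.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_bad(n):
    # Fine in full Python, but (roughly speaking) this style would be
    # rejected by RPython's type annotator: 'result' starts life as an
    # int and later becomes a string, so no single static type fits.
    result = 0
    for _ in range(n):
        result = "many"
    return result
```

Both functions behave identically under CPython; the difference only bites when the translator tries to infer static types for ahead-of-time compilation.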
Second, there is the RPython translation toolchain, which provides a dazzling array of different methods and options for turning RPython code into an executable. It can translate RPython into low-level C code for direct compilation, or into higher-level bytecode for the Java and .NET virtual machines. It can plug in any one of several different memory-management schemes, threading implementations, and a host of other options to customize the final executable.
The RPython toolchain also contains the secret to PyPy's speed: the ability to mostly-automatically generate a just-in-time compiler for the hot loops of an RPython program. It's meta-level magic of the deepest sort, and it's exactly the kind of thing that would be needed to get decent performance out of a Python interpreter running on the web.
To the great credit of the PyPy and Emscripten developers, combining these two technologies was almost as easy in practice as it sounds in theory. PyPy's RPython toolchain has extension points that let you easily plug in a custom compiler, or indeed a whole new toolchain. My github fork contains the necessary logic to hook it up to Emscripten:
Emscripten goes out of its way to act like a standard POSIX build chain, asking only that you replace the usual "gcc" invocation with "emcc". I did have to make a few tweaks to its simulated POSIX runtime environment, so you'll need to use my fork until these are merged upstream:
To compile RPython code into a normal executable, you invoke the "rpython" translator program on it. Here's a simple hello-world example that can be run out-of-the-box from the PyPy source repo:
$> python ./rpython/bin/rpython ./rpython/translator/goal/targetnopstandalone.py
[...lots and lots of compiler output...]
$>
$> ./targetnopstandalone-c
debug: hello world
$>
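For the curious, an RPython "target" like targetnopstandalone.py is itself just a small Python module that exposes an entry point. Here's a rough sketch of the shape of such a file (the real file differs in detail, and I've written to sys.stderr here for portability rather than using os.write as the real target does):

```python
import sys

def debug(msg):
    # print a debug message on stderr, matching the
    # "debug: hello world" output seen above
    sys.stderr.write("debug: " + msg + "\n")

def entry_point(argv):
    # after translation, this runs as the program's main(),
    # receiving the command-line arguments
    debug("hello world")
    return 0

def target(driver, args):
    # the translation toolchain calls target() to discover
    # which function is the program's entry point
    return entry_point, None
```

The toolchain imports this module, calls target() to get the entry point, and then type-annotates and compiles everything reachable from it.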
Adding the "--backend=js" option routes the translation through Emscripten instead, producing a JavaScript file that can be run with node.js:

$> python ./rpython/bin/rpython --backend=js ./rpython/translator/goal/targetnopstandalone.py
[...lots and lots of compiler output...]
$>
$> node ./targetnopstandalone-js
debug: hello world
$>
The same invocation works for the full PyPy interpreter, although the translation takes a very long time:

$> python ./rpython/bin/rpython --backend=js --opt=2 ./pypy/goal/targetpypystandalone.py
[...seriously, this will take forever...]
^C
$>
Or you can just grab the end result: pypy.js.
$> node pypy.js -c 'print "HELLO WORLD"'
debug: WARNING: Library path not found, using compiled-in sys.path.
debug: WARNING: 'sys.prefix' will not be set.
debug: WARNING: Make sure the pypy binary is kept inside its tree of files.
debug: WARNING: It is ok to create a symlink to it from somewhere else.
'import site' failed
HELLO WORLD
$>
As you might expect, this first version comes with quite a list of caveats:
- There's no JIT compiler. I explicitly disabled it by passing in the "--opt=2" option above. Producing a JIT compiler will require some platform-specific support code and I haven't really got my head around what that might look like yet.
- There's no filesystem access, which causes debug warnings to be printed at startup. There is work taking place to extend Emscripten with a pluggable virtual filesystem, which should enable native file access at some point in the future.
- Instead, it uses a bundled snapshot of the filesystem to provide the Python standard library. This makes startup very very slow, as the whole snapshot gets unpacked into memory before entering the main loop of the interpreter.
- There's no interactive console. Output works fine, but input not so much. I simply haven't dug into the details of this yet, but it shouldn't be too hard to get something rudimentary working.
- Lots of builtin modules are missing, because they require additional C-level dependencies. For example, the "hashlib" module depends on OpenSSL. I'll work on adding them, one by one.
- I most certainly haven't put a slick browser-based UI on top of it like repl.it.
So no, you probably can't run this in your browser right now. But it is a real Python interpreter and it can execute real Python commands. Getting all that in exchange for a little bit of glue code seems pretty awesome to me.
The big question of course is: how does it perform? To find out I turned to the Python community's favorite (and thoroughly unscientific) benchmark: pystone. This is a pointless little program that exercises the interpreter through a number of loops and reports its speed in "pystones per second". Here are the results from the various Python interpreters I have sitting on my machine; higher numbers are better:
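Pystone itself ships in the standard library's test package on Python 2 (test.pystone). Purely to illustrate the style of measurement, here's a hypothetical pystone-like micro-benchmark of my own; the loop body and names are invented for illustration, not taken from the real benchmark:

```python
import time

def stone_like_loop(iterations):
    # a toy stand-in for pystone's style of workload: integer
    # arithmetic, list indexing, and function calls in a tight loop
    data = [0] * 10
    total = 0
    for i in range(iterations):
        data[i % 10] = i
        total += data[i % 10] % 7
    return total

def benchmark(iterations=100000):
    # time the workload and report iterations per second,
    # analogous to pystone's "pystones per second" figure
    start = time.time()
    stone_like_loop(iterations)
    elapsed = time.time() - start
    # guard against a zero-duration clock reading on fast machines
    return iterations / elapsed if elapsed > 0 else float("inf")
```

Like pystone, a score from this kind of loop says more about raw interpreter dispatch overhead than about real-world program performance.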
| Interpreter | Pystones/second |
|---|---|
| pypy.js, on node | 877 |
| pypy.js, on spidermonkey | 7427 |
| native pypy, no JIT | 53418 |
| native pypy, with JIT | 781250 |
Faster still than pypy.js is the native Python interpreter that came with my system, CPython 2.7.4. This is an important point that sometimes gets forgotten: without its JIT, the PyPy interpreter is often slower than the standard CPython interpreter. That's currently the price it pays for the flexibility of its implementation, but things need not stay that way – the PyPy developers are always on the lookout for ways to speed up their interpreter even in the absence of its JIT.
Unsurprisingly, the speed king here is a native build of PyPy with its JIT-compilation features enabled.
I also tried translating the RPython version of pystone directly into JavaScript, skipping the interpreter entirely:

| Interpreter | Pystones/second |
|---|---|
| rpystone.js, on spidermonkey | 13531802 |
Compared to the interpreted-pystone results above, this number is astonishingly high. So much so that I suspect it's not entirely accurate, and is being skewed by some difference between the RPython version of pystone and the standard one. But that's not really the point anyway.
Will it JIT?
To JIT-compile in this environment, the interpreter would have to emit fresh asm.js code at runtime as a separate module, then call back and forth between that module and the main interpreter module. This kind of inter-asmjs-module linking is a tentative item on the Emscripten roadmap, but it's not clear how much overhead it would entail. If the cost of all this jumping back-and-forth were too high, it could easily swamp any performance benefit that the JITed code might bring.
There's a deeper problem, too: once an asm.js module has been compiled, its code can't be modified, while PyPy's JIT routinely patches its generated machine code in place. It's not yet clear to me exactly how limiting this will be. If it's something that can be worked around at the cost of some efficiency, e.g. by adding additional checks and flag variables into the generated code, then maybe we can still get some value out of the JIT. But if this kind of dynamic code-patching is fundamental to the operation of the JIT then we may be plain out of luck.
Ultimately, someone just needs to try it and see. Assuming I can find the time, I plan to give it a shot.
PyPy with JIT is often reported to be six or more times faster than CPython on some benchmarks. And we've seen that asm.js code can run less than three times slower than native code. Combine those two numbers, and here's my lofty, crazy, good-for-motivation-but-likely-futile goal for the rest of the year:
pypy.js, running in the spidermonkey shell, getting more pystones per second than the native CPython interpreter.
Possible? I've no idea. But it's going to be fun finding out!
Any gamblers in the audience can direct their enquiries to Brendan Eich.