This document contains a high-level architectural overview of RustPython, which makes it a good starting point for getting to know the codebase.
RustPython is an Open Source (MIT-licensed) Python 3 interpreter written in Rust, available as both a library and a shell environment. Using Rust to implement the Python interpreter enables Python to be used as a programming language for Rust applications. Moreover, it allows Python to be immediately compiled in the browser using WebAssembly, meaning that anyone could easily run their Python code in the browser. For a more detailed introduction to RustPython, have a look at this blog post.
RustPython consists of several components which are described in the section below. Take a look at this video for a brief walk-through of the components of RustPython. For a more elaborate introduction to one of these components, the parser, see this blog post.
Have a look at these websites for a demo of RustPython running in the browser using WebAssembly:
If, after reading this, you want to contribute to RustPython, take a look at these sources to get to know how and where to start:
A high-level overview of the workings of RustPython is visible in the figure below, showing how Python source files are interpreted.
Main architecture of RustPython.
The RustPython interpreter can be decoupled into three distinct components: the parser, compiler and VM.
- The parser is responsible for converting the source code into tokens, and deriving an Abstract Syntax Tree (AST) from it.
- The compiler converts the generated AST to bytecode.
- The VM then executes the bytecode given user supplied input parameters and returns its result.
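The three stages above can be sketched with CPython's standard library, which exposes the same parse, compile, and execute steps. This is only an analogy for orientation: `ast.parse`, `compile`, and `exec` are CPython built-ins, not RustPython's Rust API, and the source string and variable names are made up for illustration.

```python
import ast

source = "result = x + 1"

# Stage 1 (parser): source text -> abstract syntax tree.
tree = ast.parse(source, filename="<example>", mode="exec")

# Stage 2 (compiler): AST -> code object holding bytecode.
code = compile(tree, filename="<example>", mode="exec")

# Stage 3 (VM): execute the bytecode with user-supplied input.
namespace = {"x": 41}
exec(code, namespace)

print(type(tree).__name__)  # Module
print(type(code).__name__)  # code
print(namespace["result"])  # 42
```

Each stage consumes only the output of the previous one, which is why the components can be treated as decoupled.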
The main entry point of RustPython is located in src/main.rs and simply forwards a call to run, located in src/lib.rs. This method will call the compiler, which in turn will call the parser, and pass the compiled bytecode to the VM.
For each of the three components, the entry point is as follows:
- Parser: The Parser is located in a separate project, RustPython/Parser. See the documentation there for more information.
- Compiler: compile, located in vm/src/vm/compile.rs, which eventually forwards a call to compiler::compile.
- VM: run_code_obj, located in vm/src/vm/mod.rs. This creates a new frame in which the bytecode is executed.
Here we give a brief overview of each component and its function. For more details for the separate crates please take a look at their respective READMEs.
This component, implemented as the rustpython-compiler/ package, is responsible for translating a Python source file into its equivalent bytecode representation. As an example, the following Python file:
def f(x):
    return x + 1
is compiled to the following bytecode:
  2           0 LoadFast             (0, x)
              1 LoadConst            (1)
              2 BinaryOperation      (Add)
              3 ReturnValue
Note that bytecode is subject to change, and is not a stable interface.
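For comparison, the same function can be disassembled in CPython with the standard `dis` module. This is a sketch for comparison only: CPython's instruction names differ from RustPython's (and vary between CPython versions), so no expected listing is shown here.

```python
import dis


def f(x):
    return x + 1


# Collect the instruction names this interpreter generated for f.
opnames = [instruction.opname for instruction in dis.get_instructions(f)]

# Print the full disassembly; the exact listing depends on the Python version.
dis.dis(f)
```

The overall shape (load the argument, load the constant, apply the binary operation, return) should match the RustPython listing above even though the names differ.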
The Parser is the main sub-component of the compiler. All the functionality required for parsing Python source code to an abstract syntax tree (AST) is implemented here, in two steps:
- Lexical Analysis
- Parsing
The functionality for parsing resides in the RustPython/Parser project. See the documentation there for more information.
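The two steps listed above, lexical analysis and parsing, can be observed separately using CPython's `tokenize` and `ast` modules. This is a sketch under the assumption that CPython's tooling is an acceptable stand-in; the RustPython/Parser project has its own token and AST types, which are not shown here.

```python
import ast
import io
import tokenize

source = "x + 1\n"

# Lexical analysis: split the source text into (kind, text) tokens.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Parsing: derive an AST from the same source.
tree = ast.parse(source, mode="eval")

for kind, text in tokens:
    print(kind, repr(text))
print(ast.dump(tree))
```

The token stream is flat, while the AST records that `+` is a binary operation with `x` and `1` as its operands.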
The Virtual Machine (VM) is responsible for executing the bytecode generated by the compiler. It is implemented in the rustpython-vm/ package. The VM is currently implemented as a stack machine, meaning that it uses a stack to store intermediate results. In the rustpython-vm/ package, additional sub-components are present, for example:
- builtins: the built-in objects of Python, such as int and str.
- stdlib: parts of the standard library that contain built-in modules needed for the VM to function, such as sys.
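To make the stack-machine model mentioned above concrete, here is a toy interpreter for the four instructions from the bytecode example earlier in this document. It is a hypothetical sketch: the `run` function, the tuple encoding of instructions, and the use of variable names instead of slot indices are all assumptions made for illustration and do not reflect how rustpython-vm is actually written.

```python
def run(instructions, local_vars):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    for opcode, argument in instructions:
        if opcode == "LoadFast":
            # Push the value of a local variable.
            stack.append(local_vars[argument])
        elif opcode == "LoadConst":
            # Push a constant.
            stack.append(argument)
        elif opcode == "BinaryOperation":
            # Pop two operands and push the result.
            right = stack.pop()
            left = stack.pop()
            if argument == "Add":
                stack.append(left + right)
            else:
                raise ValueError(f"unsupported operation: {argument}")
        elif opcode == "ReturnValue":
            # The top of the stack is the result.
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise RuntimeError("program ended without ReturnValue")


# Hypothetical encoding of the bytecode for `return x + 1`.
program = [
    ("LoadFast", "x"),
    ("LoadConst", 1),
    ("BinaryOperation", "Add"),
    ("ReturnValue", None),
]

print(run(program, {"x": 41}))  # 42
```

Intermediate results never get names; they live only on the stack between instructions.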
The rustpython-common package contains functionality that is not directly coupled to one of the other RustPython packages. For example, the float_ops.rs file contains operations on floating point numbers which could be used by other projects if needed.
Rust language extensions and macros specific to RustPython. Here we can find the definition of PyModule and PyClass along with useful macros like py_compile!.
This folder contains a very experimental JIT implementation.
Part of the Python standard library that's implemented in Rust. The modules that live here are ones that aren't needed for the VM to function, but are useful for the user. For example, the random module is implemented here.
Python side of the standard library, copied over (with care) from the CPython source code.
CPython test suite, which can be used to compare with CPython in terms of conformance. Many of these files have been modified to fit with the current state of RustPython (when they were added), in one of three ways:
- The test has been commented out completely if the parser could not create a valid code object. If a file is unable to be parsed the test suite would not be able to run at all.
- A test has been marked as unittest.skip("TODO: RustPython <reason>") if it led to a crash of RustPython. Adding a reason is useful to know why the test was skipped but not mandatory.
- A test has been marked as unittest.expectedFailure with a TODO: RustPython <reason> comment left on top of the decorator. This decorator is used if the test can run but the result differs from what is expected.
Note: Working on these tests is a recommended route for getting started with contributing. To get started, please take a look at this blog post.
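The two decorator conventions for modified tests can be illustrated with a small, self-contained test case. The class and test names below are hypothetical and the `<reason>` placeholder is left unfilled on purpose; only `unittest.skip` and `unittest.expectedFailure` are taken from the description of the conventions.

```python
import unittest


class ExampleTests(unittest.TestCase):
    # Skipped entirely: running the body would crash the interpreter.
    @unittest.skip("TODO: RustPython <reason>")
    def test_crashes_interpreter(self):
        raise SystemExit("never reached because the test is skipped")

    # TODO: RustPython <reason>
    @unittest.expectedFailure
    def test_result_differs(self):
        # Runs to completion, but the result is known to be wrong.
        self.assertEqual(1 + 1, 3)

    def test_passes(self):
        self.assertEqual(1 + 1, 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
result = unittest.TestResult()
suite.run(result)

print(len(result.skipped), len(result.expectedFailures), result.wasSuccessful())
```

A useful property of expectedFailure is that the test is reported as an unexpected success once the underlying behavior is fixed, which signals that the decorator can be removed.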
The RustPython executable/REPL (Read-Eval-Print-Loop) is implemented here, which is the interface through which users come in contact with the library. Some things to note:
- The CLI is defined in the run function of src/lib.rs.
- The interface and helper for the REPL are defined in this package, but the actual REPL can be found in vm/src/readline.rs.
Crate for WebAssembly build, which compiles the RustPython package to a format that can be run on any modern browser.
Integration tests and snippets that test for additional edge cases, implementation-specific functionality, and reported bugs.