Porting a Python Application to Rust

PyOxidizer can be used to gradually port a Python application to Rust. That is, the Python code in an application is rewritten in Rust piece by piece.

Overview

When porting a Python application to Rust, the goal is to port Python code - and possibly Python C extension code - to Rust. Parts of the Rust code will presumably need to call into Python code and vice-versa.

When porting code to Rust, there are essentially two flavors of Rust code that will be written and executed:

  1. Vanilla Rust code
  2. Python-flavored Rust code

Vanilla Rust code is standard Rust code. It is what you would write if authoring a Rust-only project.

Python-flavored Rust code is Rust code that interacts with the Python C API. It is regular Rust code, of course, but it is littered with references to PyObject and function calls into the Python C API (although these function calls may be abstracted so you don’t have to use unsafe).

These different flavors of Rust code dictate different approaches to porting. Both flavors/approaches can be used simultaneously when porting an application to Rust.

Vanilla Rust code will supplement the boilerplate Rust code that PyOxidizer uses to define and build a standalone executable embedding Python. See Extending Rust Projects for more.

Python-flavored Rust code typically involves writing Python extension modules in Rust. In this approach, you create Python extension modules implemented in Rust and then make them available to the Python interpreter, which is managed by a Rust project.

Extending Rust Projects

When building an application from a standalone pyoxidizer.bzl file, PyOxidizer creates and builds a temporary, boilerplate Rust project behind the scenes. This Rust project has just enough code to initialize and run an embedded Python interpreter. That’s the extent of the Rust code.

PyOxidizer also supports persistent Rust projects. In this mode, you have full control over the Rust project and can add custom Rust code to it as you desire, including Rust code that runs independently of the Python interpreter.

Supplementing the Rust code contained in your executable gives you the power to run arbitrary Rust code however you see fit. Here are some common scenarios this can enable:

  • Implementing argument parsing in Rust instead of Python. This could allow you to parse out the sub-command being invoked and dispatch to pure Rust code paths if possible, falling back to running Python code only if necessary.
  • Running a forking server, which doesn’t start a Python interpreter until an event occurs.
  • Starting a thread with a high-performance application component implemented in Rust. For example, you could run a thread servicing a high-performance logging subsystem or HTTP server implemented in Rust and have that thread interact with a Python interpreter via a pipe or some other handle.
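The first scenario above can be sketched with nothing but the Rust standard library. This is a minimal illustration, not code generated by PyOxidizer: the Action enum, the dispatch function, and the version sub-command are all hypothetical names chosen for the example. The idea is that the first argument selects a pure Rust code path when one exists, and everything else is handed to the Python interpreter.

```rust
/// What the executable should do for a given command line.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum Action {
    /// Handled entirely in Rust; no Python interpreter is started.
    PrintVersion,
    /// Anything else falls back to the embedded Python interpreter,
    /// which receives the original arguments.
    RunPython(Vec<String>),
}

/// Decide how to handle the arguments that follow the program name.
fn dispatch(args: &[String]) -> Action {
    match args.first().map(String::as_str) {
        // Assumed sub-command that Rust can service on its own.
        Some("version") => Action::PrintVersion,
        // Unknown or missing sub-command: let Python deal with it.
        _ => Action::RunPython(args.to_vec()),
    }
}
```

In main(), the arguments could be collected with std::env::args().skip(1) and passed to dispatch before any interpreter is created, so the interpreter start-up cost is only paid on the RunPython path.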

Getting Started

To extend a Rust project with custom Rust code, you’ll first want to materialize the boilerplate Rust project used by PyOxidizer:

$ pyoxidizer init-rust-project myapp

See Rust Projects for details on the files materialized by this command.

If you are using version control, now would be a good time to add the created files to version control. e.g.:

$ git add myapp
$ git commit -m 'create boilerplate PyOxidizer project'

From here, your next steps are to modify the Rust project to do something new and different.

The auto-generated src/main.rs file contains the main() function used as the entrypoint for the Rust executable. The default file will simply instantiate a Python interpreter from a configuration, run that interpreter, then exit the process.

To extend your application with custom Rust code, simply add custom code to main(). e.g.

fn main() {
    println!("hello from Rust!");

    // Code auto-generated by `pyoxidizer init-rust-project` goes here.
    // ...
}

That is literally all there is to it!
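Custom code in main() is not limited to printing. As a slightly larger sketch, the function below starts a background thread that collects log lines over a standard library channel, in the spirit of the logging subsystem scenario mentioned earlier. The start_logger name and the "[log]" prefix are assumptions made for the example, and the channel stands in for whatever pipe or handle your application would actually use.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawn a worker thread that collects log lines sent over a channel.
/// Returns the sending half and a handle that yields every line received.
/// (Hypothetical helper, not part of PyOxidizer.)
fn start_logger() -> (mpsc::Sender<String>, thread::JoinHandle<Vec<String>>) {
    let (tx, rx) = mpsc::channel::<String>();
    let handle = thread::spawn(move || {
        let mut lines = Vec::new();
        // Iterating the receiver blocks until a message arrives and
        // ends once every Sender has been dropped.
        for line in rx {
            lines.push(format!("[log] {}", line));
        }
        lines
    });
    (tx, handle)
}
```

main() could call start_logger() before running the interpreter, send messages through the returned Sender, then drop the Sender and join the handle on shutdown so the thread exits cleanly.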

The most robust way to build your custom Rust project is pyoxidizer build. It is also possible to use cargo build.

What Can Go Wrong

pyoxidizer Not Found or Rust Code Version Mismatch

When using cargo build, the pyoxidizer executable will be invoked behind the scenes. This requires that executable to be on PATH and for the version to be compatible with the Rust code you are trying to build. (The Rust APIs do change from time to time.)

If the pyoxidizer executable is not on PATH or its version doesn’t match the Rust code, you can forcefully tell the Rust build system which pyoxidizer executable to use:

$ PYOXIDIZER_EXE=/path/to/pyoxidizer cargo build

thread 'main' panicked at 'jemalloc is not available in this build configuration'

If you see this error, the Python interpreter configuration says to use jemalloc as the memory allocator, but the Rust project was built without jemalloc support. The most likely cause is that the default features of the Rust project in Cargo.toml don’t include jemalloc.

You can resolve this issue by either disabling jemalloc in the Python configuration or by enabling jemalloc in Rust.

To disable jemalloc, open your pyoxidizer.bzl file and find the definition of raw_allocator. You can set it to raw_allocator="system" so Python uses the system memory allocator instead of jemalloc.

To enable jemalloc, you have a few options.

First, you could build the Rust project with jemalloc support:

$ cargo build --features jemalloc

Or you could modify Cargo.toml so the jemalloc feature is enabled by default:

[features]
default = ["build-mode-pyoxidizer-exe", "jemalloc"]

jemalloc is typically a faster allocator than the system allocator. So if you care about performance, you may want to use it.

Implementing Python Extension Modules in Rust

If you want to port a Python application to Rust, chances are that you will need to have Rust and Python code interact with each other. A common way to do this is to implement Python extensions in Rust so that Rust code will be invoked as a Python interpreter is running.

There are two ways Rust-implemented Python extension modules can be consumed by PyOxidizer:

  1. Define them via Python packaging tools (e.g. via a setup.py file for your Python package).
  2. Define them in Rust code and register them as a built-in extension module.

Python Built Rust Extension Modules

If you’ve defined a Rust Python extension module via a Python package build tool (e.g. inside a setup.py), PyOxidizer should automatically detect said extension module as part of packaging the corresponding Python package: there is no need to take special action to tell PyOxidizer it is a Rust extension, as this is all handled by Python packaging tools invoked as part of processing your pyoxidizer.bzl file.

See Packaging User Guide for more.

Authoring Python extension modules in Rust is arguably outside the scope of this documentation. Searching the web for Rust Python extension should set you on the right track.

Built-in Rust Extension Modules

A Python extension module is defined as a PyInit_<name> function which is called to initialize an extension module. Typically, Python extension modules are compiled as standalone shared libraries, which are then loaded into a process, after which their PyInit_<name> function is called.

But Python has an additional mechanism for defining extension modules: built-ins. A built-in extension module is simply an extension module whose PyInit_<name> function is already present in the process address space. Typically, these are extensions that are part of the Python distribution itself and are compiled directly into libpython.

When you instantiate a Python interpreter, you give it a list of the available built-in Python extension modules. And PyOxidizer’s pyembed crate allows you to supplement the default list with custom extensions.
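Conceptually, the list of built-ins is a table that maps module names to initialization function pointers. The sketch below is a toy model of that idea using only the Rust standard library. It is not CPython's or pyembed's actual data structure: BuiltinModule, InitFn, init_my_module, and find_builtin are hypothetical names, and a real initialization function would return a *mut PyObject rather than a *mut c_void.

```rust
use std::os::raw::c_void;

/// Signature shared by every initialization function in this sketch.
type InitFn = extern "C" fn() -> *mut c_void;

/// One entry in the table of built-in modules: a name plus a pointer to a
/// function that already exists in the process address space.
struct BuiltinModule {
    name: &'static str,
    init: InitFn,
}

/// Stand-in for the module object a real initialization function creates.
static MY_MODULE_OBJECT: u8 = 0;

/// Toy initialization function; returns a non-null pointer on success.
extern "C" fn init_my_module() -> *mut c_void {
    &MY_MODULE_OBJECT as *const u8 as *mut c_void
}

/// Look up a module by name, as an importer for built-ins would.
fn find_builtin(table: &[BuiltinModule], name: &str) -> Option<InitFn> {
    table.iter().find(|m| m.name == name).map(|m| m.init)
}
```

Supplementing the default list then amounts to appending another name and function pointer pair to the table before the interpreter starts. No shared library needs to be located or loaded, because the function is already linked into the executable.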

To use built-in extension modules implemented in Rust, you’ll need to implement said extension module in Rust, either as part of your application’s Rust crate or as part of a different crate. Either way, you’ll need to extend the boilerplate Rust project code (see Extending Rust Projects) and tell it about additional built-in extension modules. See Adding Extension Modules At Run-Time for instructions on how to do this.

The tricky part here is implementing your Rust extension module.

You probably want to use the cpython or PyO3 Rust crates for interfacing with the CPython API, as these provide an interface that is more ergonomic and doesn’t require use of unsafe { }. Use of these crates is beyond the scope of the PyOxidizer documentation.

If you attempt to use the cpython or PyO3 macros for defining a Python extension module, you’ll likely run into problems because these assume that extension modules are standalone shared libraries, which isn’t the case for built-in extension modules!

If you attempt to use a separate Rust crate to define your extension module, you may run into Python symbol issues at link time because the build system for the cpython and PyO3 crates will use their own logic for locating a Python interpreter and that interpreter may not have a configuration that is compatible with the one embedded in your PyOxidizer binary!

At the end of the day, all you need to register a built-in extension module with PyOxidizer is an extern "C" fn () -> *mut python3_sys::PyObject. Here is the boilerplate for defining a Python extension module in Rust (this uses the cpython crate).

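use cpython::{PyErr, PyModule, PyObject, PyResult, Python, PythonObject};
use python3_sys as pyffi;

// Placeholder for per-module state. Replace with whatever your module needs.
struct ModuleState {}

static mut MODULE_DEF: pyffi::PyModuleDef = pyffi::PyModuleDef {
    m_base: pyffi::PyModuleDef_HEAD_INIT,
    m_name: std::ptr::null(),
    m_doc: std::ptr::null(),
    m_size: std::mem::size_of::<ModuleState>() as isize,
    m_methods: std::ptr::null_mut(),
    m_slots: std::ptr::null_mut(),
    m_traverse: None,
    m_clear: None,
    m_free: None,
};

// Placeholder: add functions, types, and constants to the module here.
fn module_init(_py: Python, _m: &PyModule) -> PyResult<()> {
    Ok(())
}

#[allow(non_snake_case)]
pub extern "C" fn PyInit_my_module() -> *mut pyffi::PyObject {
    // Python holds the GIL when it calls a module initialization function.
    let py = unsafe { Python::assume_gil_acquired() };

    unsafe {
        if MODULE_DEF.m_name.is_null() {
            // These are C strings, so they must be NUL-terminated.
            MODULE_DEF.m_name = "my_module\0".as_ptr() as *const _;
            MODULE_DEF.m_doc = "usage docs\0".as_ptr() as *const _;
        }
    }

    // Take a raw pointer without creating a reference to the mutable static.
    let module = unsafe { pyffi::PyModule_Create(std::ptr::addr_of_mut!(MODULE_DEF)) };

    if module.is_null() {
        return module;
    }

    // Wrap the raw pointer in a safe PyModule handle.
    let module = match unsafe { PyObject::from_owned_ptr(py, module).cast_into::<PyModule>(py) } {
        Ok(m) => m,
        Err(e) => {
            PyErr::from(e).restore(py);
            return std::ptr::null_mut();
        }
    };

    match module_init(py, &module) {
        // Hand ownership of the module object back to Python.
        Ok(()) => module.into_object().steal_ptr(),
        Err(e) => {
            e.restore(py);
            std::ptr::null_mut()
        }
    }
}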
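One detail worth calling out: the m_name and m_doc fields of PyModuleDef are C strings, so the pointers stored there must reference NUL-terminated bytes. A Rust string literal is not NUL-terminated unless the terminator is written out explicitly. The sketch below shows one way to validate static byte strings with std::ffi::CStr before handing them to a C API. The static_c_str helper is a hypothetical name for this example and is not part of the cpython or python3_sys crates.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Convert a static byte string into a C string pointer.
/// Returns None if the input lacks a trailing NUL byte or contains an
/// interior NUL, either of which would confuse a C API.
fn static_c_str(bytes: &'static [u8]) -> Option<*const c_char> {
    CStr::from_bytes_with_nul(bytes).ok().map(CStr::as_ptr)
}
```

With such a helper, a value like b"my_module\0" is accepted, while b"my_module" is rejected instead of silently producing a pointer that C code would read past.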

If you want a concrete example of what this looks like and how to do things like define Python types and have Python functions implemented in Rust, do a search for PyInit_oxidized_importer in the source code of the pyembed crate (which is part of the PyOxidizer repository) and go from there.

The documentation for authoring Python extension modules and using the Python C API is well beyond the scope of this document. A good place to start is the official documentation.