Working with Python Extension Modules

Python extension modules are native machine code that exposes functionality to a Python interpreter via Python modules.

PyOxidizer has varying levels of support for extension modules. This is because some PyOxidizer configurations break assumptions about how Python interpreters typically run.

This document attempts to capture all the nuances of working with Python extension modules with PyOxidizer.

Extension Module Flavors

Python extension modules exist as either built-in or standalone. A built-in extension module is statically linked into libpython and a standalone extension module is a shared library that is dynamically loaded at run-time.

Typically, built-in extension modules only exist in Python distributions (and are part of the Python standard library by definition) and Python package maintainers only ever produce standalone extension modules (e.g. as .so or .pyd files).

Python distributions typically contain a mix of built-in and standalone extension modules. For example, the _ast extension module is built-in and the _ssl extension module is standalone.
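To see which flavor a module has in the interpreter you are currently running, you can compare its name against sys.builtin_module_names and inspect the origin recorded in its import spec. The following is a minimal sketch: the helper name extension_flavor is hypothetical, results differ between Python distributions, and a PyOxidizer-built binary may report different origins because it uses its own importer.

```python
import importlib.machinery
import importlib.util
import sys


def extension_flavor(name):
    """Classify a top-level module as 'built-in', 'standalone', or 'other'.

    Hypothetical helper for illustration; it only inspects import metadata
    and never loads the module.
    """
    # Built-in extension modules are compiled into the interpreter/libpython.
    if name in sys.builtin_module_names:
        return "built-in"
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return "other"
    origin = getattr(spec, "origin", None)
    # Standalone extension modules live in files such as .so or .pyd.
    if origin and origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "standalone"
    return "other"


for module_name in ("sys", "_ast", "_ssl", "json"):
    print(module_name, "->", extension_flavor(module_name))
```

Pure Python modules, frozen modules, and modules that cannot be found are all reported as "other" by this sketch.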

Important

Because PyOxidizer enables you to build your own binaries embedding Python and because different Python distributions have different levels of support for extension modules, it is important to familiarize yourself with the types of extension modules and how they can be used.

Extension Module Restrictions

PyOxidizer imposes a handful of restrictions on how extension modules work. These restrictions are typically a side-effect of limitations of the Python distribution being used/targeted. These restrictions are documented in the sections below.

Known Incompatibility with Cython

PyOxidizer currently has a known incompatibility with Cython. Specifically, PyOxidizer fails to find object files that Cython builds. This can lead to missing symbols and build/link time errors.

This is tracked by https://github.com/indygreg/PyOxidizer/issues/567.

musl libc Linux Distributions Only Support Built-in Extension Modules

The Python distributions built against musl libc (build target *-linux-musl) only support built-in extension modules.

This is because musl libc binaries are statically linked and statically linked Linux binaries are incapable of calling dlopen() to load a shared library.

This means Python binaries built in this configuration cannot load standalone Python extension modules existing as separate files (typically .so files). As a consequence, PyOxidizer cannot consume Python wheels or other Python resource sources containing pre-built Python extension modules.
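Before targeting musl libc, it can help to check whether a wheel you depend on ships pre-built extension modules. The sketch below is one rough way to do that, assuming that files ending in .so or .pyd are extension modules (bundled non-extension shared libraries would match too). The function name prebuilt_extensions and the archive contents are hypothetical.

```python
import io
import zipfile

# File suffixes that commonly mark pre-built extension modules (assumption).
PREBUILT_SUFFIXES = (".so", ".pyd")


def prebuilt_extensions(wheel_file):
    """Return archive members that look like pre-built extension modules."""
    # Wheels are zip archives, so zipfile can list their members directly.
    with zipfile.ZipFile(wheel_file) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.endswith(PREBUILT_SUFFIXES)
        )


# Build a tiny in-memory archive standing in for a wheel (hypothetical names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("foo/__init__.py", "")
    archive.writestr("foo/_speedups.cpython-39-x86_64-linux-gnu.so", b"")

print(prebuilt_extensions(buffer))
```

A non-empty result suggests the package would have to be compiled from source for a musl libc target rather than installed from that wheel.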

In order for PyOxidizer to support a Python extension module built for musl libc, it must compile that extension module from source, link the resulting object files / static library directly into the built binary, and expose that extension module as a built-in. This is done using the mechanism described in the Building with a Custom Distutils section.

Windows Static Distributions Only Support Built-in Extension Modules

The Windows standalone_static distribution flavor only supports built-in extension modules and doesn’t support loading shared library extension modules.

See the previous section for the implications of this restriction.

Rebuilding Python extension modules is often more complicated on Windows than on Linux, because many Windows environments don’t have the correct version of Visual Studio or the required library dependencies installed. If you want a turnkey experience for Windows packaging, it is recommended to use the standalone_dynamic distribution flavor.

Loading Extension Modules from in-memory Location

When you attempt to add a PythonExtensionModule Starlark instance to the in-memory resource location, the request may or may not work depending on the state of the extension module and support from the Python distribution.

The in-memory resource location is interpreted by PyOxidizer as “load this extension from memory, without having a standalone file.” PyOxidizer will try its hardest to satisfy this request.

If the object files / static library of an extension module are known to PyOxidizer, these will be statically linked into the built binary and the extension module will be exposed as a built-in extension module.

If only a shared library is available for the extension module, the outcome depends on the platform: PyOxidizer only supports loading shared libraries from memory on Windows standalone_dynamic distributions. On all other platforms, the request to load a shared library extension module from memory is rejected.

Some extensions and shared libraries are known to not work when loaded from memory using the custom shared library loader used by PyOxidizer. For this reason, PythonPackagingPolicy.allow_in_memory_shared_library_loading exists to control this behavior.
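The rules above can be summarized as a small decision function. This is only a model of the behavior described in this section, not PyOxidizer’s implementation: the function name, its parameters, and the returned strings are hypothetical, and treating allow_in_memory_shared_library_loading as a simple on/off gate is an assumption. It also ignores any fallback location that may be configured.

```python
def in_memory_outcome(
    has_object_files,
    has_shared_library,
    windows_standalone_dynamic,
    allow_in_memory_shared_library_loading,
):
    """Model what happens when an extension module targets the in-memory location.

    Hypothetical summary of the documented rules, for illustration only.
    """
    # Object files / static libraries are linked in and exposed as a built-in.
    if has_object_files:
        return "link statically as built-in"
    # Shared libraries load from memory only on Windows standalone_dynamic,
    # and (by assumption) only when the packaging policy allows it.
    if (
        has_shared_library
        and windows_standalone_dynamic
        and allow_in_memory_shared_library_loading
    ):
        return "load shared library from memory"
    return "rejected"


print(in_memory_outcome(True, True, False, False))
print(in_memory_outcome(False, True, True, True))
print(in_memory_outcome(False, True, False, True))
```

The three calls cover a module with known object files, a shared-library-only module on Windows standalone_dynamic, and the same module on any other platform.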

Important

Because the in-memory location for extension modules can be brittle, it is recommended to set a resources policy or add_location_fallback to allow extension modules to exist as standalone files. This will provide maximum compatibility with built Python extension modules and will reduce the complexity of packaging 3rd party extension modules.

Extension Module Library Dependencies

PyOxidizer doesn’t currently support resolving additional library dependencies from discovered extension modules outside of the Python distribution. For example, if your extension module foo.so has a run-time dependency on bar.so, PyOxidizer doesn’t yet detect this and doesn’t realize that bar.so needs to be handled.

This means that if you add a PythonExtensionModule Starlark type and this extension module depends on an additional library, PyOxidizer will likely not realize this and fail to distribute that additional library dependency with your application.

If your Python extensions depend on additional libraries, you may need to manually add these files to your installation via custom Starlark code.

Note that if your shared library exists as a file in a Python package (a directory with __init__.py somewhere in the hierarchy), PyOxidizer’s resource scanning may detect the shared library as a PythonPackageResource and package this resource. However, the packaged resource won’t be flagged as a shared library, so the run-time importer won’t identify the shared library dependency and won’t take steps to ensure it is available/loaded before the extension is loaded. Loading the shared library is therefore left to the operating system’s default rules, which in turn means that the shared library file must exist on the filesystem, next to a file-based extension module.

Building with a Custom Distutils

If PyOxidizer is not able to reuse an existing shared library extension module or the build configuration is forcing an extension to be built as a built-in, PyOxidizer attempts to compile the extension module from source so that it can be statically linked as a built-in.

The way PyOxidizer achieves this is a bit crude, but often effective.

When PyOxidizer invokes pip or setup.py to build a package, it installs a modified version of distutils into the invoked Python’s sys.path. This modified distutils changes the behavior of some key build steps (notably how C extensions are compiled) such that the build emits artifacts that PyOxidizer can statically link into a custom binary.
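The general mechanism at play is ordinary module shadowing: when two directories on sys.path provide a module with the same name, the one listed first wins. The sketch below demonstrates only that mechanism, using a stand-in module named buildtool_demo (hypothetical). It does not reproduce PyOxidizer’s modified distutils or the exact way PyOxidizer arranges the path.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as stock_dir, \
        tempfile.TemporaryDirectory() as patched_dir:
    # Two copies of the same module name, standing in for stock and modified code.
    pathlib.Path(stock_dir, "buildtool_demo.py").write_text("FLAVOR = 'stock'\n")
    pathlib.Path(patched_dir, "buildtool_demo.py").write_text("FLAVOR = 'patched'\n")

    # The directory inserted last ends up first on sys.path and is searched first.
    sys.path.insert(0, stock_dir)
    sys.path.insert(0, patched_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("buildtool_demo")
        flavor = module.FLAVOR
    finally:
        # Undo the path changes so the demo leaves no trace in this process.
        sys.path.remove(patched_dir)
        sys.path.remove(stock_dir)
        sys.modules.pop("buildtool_demo", None)

print(flavor)  # prints "patched"
```

This also hints at why builds that spawn separate Python processes can misbehave: a child process only sees the replacement module if it inherits the same search path.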

For example, on Linux, PyOxidizer copies the intermediate object files produced by the build and links them into the binary containing the generated libpython. PyOxidizer completely ignores the shared library that is or would typically be produced.

If setup.py scripts follow the traditional pattern of using distutils.core.Extension to define extension modules, things tend to just work (assuming extension modules are supported by PyOxidizer for the target platform). However, if setup.py scripts monkeypatch distutils themselves, rely on custom build steps or types to compile extension modules, or invoke separate Python processes to interact with distutils, things may break.

The easiest way to avoid the pitfalls of a custom distutils build is to not attempt to produce a statically linked binary: use a standalone_dynamic distribution flavor that supports loading extension modules from files.

Until PyOxidizer provides a way to specify additional object files or static libraries to link into a binary, there’s no easy workaround aside from giving up on a statically linked binary. Better support will hopefully be present in future versions of PyOxidizer.