feat: add native C extension support for numpy, pandas, Pillow, rapidfuzz, wordcloud#645

Open
ppenna wants to merge 1 commit into nanvix/v3.12.3 from feature/native-extensions

@ppenna ppenna commented May 14, 2026

Summary

Add builtin shim modules and build infrastructure for cross-compiling Python C extensions as statically-linked modules for Nanvix (i686).

Extensions added

| Package | Modules | Description |
|---|---|---|
| numpy | 1 | `_np_multiarray_umath`: core array/math engine |
| pandas | 44 | `_pd_*`: tslibs, parsers, window, ops, etc. |
| Pillow | 3 | `_pil_imaging`, `_pil_imagingmath`, `_pil_imagingmorph` |
| rapidfuzz | 6 | `_rf_*`: fuzz, distance metrics, feature detection |
| wordcloud | 1 | via existing lxml build infrastructure |

Architecture

Each C extension is wrapped in a _*_builtin.c shim that bridges the PyInit_X function to a flat builtin name for CPython's inittab. Build helpers (.nanvix/*.py) generate Setup.local entries and link flags.
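
As a rough illustration of the shim pattern described above, the following Python sketch renders the C source for a single shim. The shim body, the helper name `render_builtin_shim`, and the upstream symbol `PyInit__multiarray_umath` are assumptions made for illustration; the actual shim contents are not shown in this PR description and may differ.

```python
"""Hypothetical generator for a `_*_builtin.c` shim (illustrative only)."""

# Assumed shape of a shim: declare the upstream init function and forward
# to it from an init function carrying the flat builtin name.
SHIM_TEMPLATE = """\
/* Assumed shape of a builtin shim; the real files in Modules/ may differ. */
#include "Python.h"

/* Init function exported by the statically linked extension archive. */
extern PyObject *PyInit_{upstream}(void);

/* Flat entry point that CPython's inittab resolves for "{flat}". */
PyMODINIT_FUNC
PyInit_{flat}(void)
{{
    return PyInit_{upstream}();
}}
"""


def render_builtin_shim(flat: str, upstream: str) -> str:
    """Return C source bridging PyInit_<upstream> to PyInit_<flat>."""
    return SHIM_TEMPLATE.format(flat=flat, upstream=upstream)


if __name__ == "__main__":
    # "_multiarray_umath" is an assumed upstream module name.
    print(render_builtin_shim("_np_multiarray_umath", "_multiarray_umath"))
```

The flat name is what would appear in the first column of a Setup.local entry, so the shim and the generated entry have to agree on it.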

Result

CPython now has 135 built-in modules (up from 91), producing a 33MB python.elf.
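
One way to check the built-in module count from inside the interpreter is `sys.builtin_module_names`, which is part of the standard library. The sketch below groups names by the prefixes listed in the table above; the `_wc_` prefix is taken from the wordcloud Setup.local entry quoted in the review, and the helper name `count_by_prefix` is made up for this example.

```python
import sys

# Prefixes from the "Extensions added" table, plus "_wc_" for wordcloud.
PREFIXES = ("_np_", "_pd_", "_pil_", "_rf_", "_wc_")


def count_by_prefix(names, prefixes=PREFIXES):
    """Count how many module names start with each known prefix."""
    counts = {prefix: 0 for prefix in prefixes}
    for name in names:
        for prefix in prefixes:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


if __name__ == "__main__":
    names = sys.builtin_module_names
    print(f"{len(names)} built-in modules")
    for prefix, count in count_by_prefix(names).items():
        print(f"{prefix}*: {count}")
```

On a stock host interpreter every prefix count is expected to be zero; the per-prefix numbers are only meaningful when run under the Nanvix build.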

Benchmarks (host-side wall clock)

| Test | Time |
|---|---|
| Hello world (baseline) | ~0.26s |
| NumPy import | ~0.73s |
| NumPy compute (10k ops) | ~0.75s |
| Pandas import | ~2.09s |
| Pandas compute (1k DataFrame) | ~2.11s |
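
The PR does not say how these timings were collected. A minimal sketch of a host-side wall-clock harness, assuming each test is a one-line snippet passed to an interpreter with `-c`, could look like the following. The command line is a placeholder: it uses the host's `sys.executable`, not the Nanvix launcher, and the second test case is a standard-library stand-in.

```python
import subprocess
import sys
import time


def wall_clock(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Placeholder cases; swap in "import numpy", "import pandas", etc.
    cases = {
        "Hello world (baseline)": "print('hello')",
        "stdlib import (stand-in)": "import json",
    }
    for label, snippet in cases.items():
        elapsed = wall_clock([sys.executable, "-c", snippet])
        print(f"{label}: ~{elapsed:.2f}s")
```

Subtracting the baseline from each row gives a rough estimate of what the import or compute step itself costs, since the baseline already includes process start-up.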

Related PRs

  • nanvix/nanvix-python — build integration & tests
  • nanvix/pandas — cross-compilation scripts
  • nanvix/numpy — cross-compilation scripts
  • nanvix/Pillow — cross-compilation scripts

Copilot AI left a comment

Pull request overview

Adds CPython-side glue to expose numpy, pandas, Pillow, rapidfuzz, and wordcloud C/C++ extensions as statically-linked builtin modules for the Nanvix i686 target. Each extension gets a _*_builtin.c shim that re-exports its PyInit_* under a flat name matching Modules/Setup.local, plus a .nanvix/<pkg>.py helper that generates the Setup.local entries (and for some packages, stages the corresponding Python sources into the sysroot). .nanvix/build.py now aggregates these into a single Setup.local generator.

Changes:

  • Add 55 new _*_builtin.c shim files in Modules/ that bridge each upstream PyInit_* to a flat builtin name.
  • Add .nanvix/{numpy_mod,pandas_mod,pillow,rapidfuzz,wordcloud}.py build helpers and refactor .nanvix/lxml.py to expose a generate_setup_local_lines API used by a new combined generator in .nanvix/build.py.
  • Remove nanvix.lock from .nanvix/.gitignore.

Summary per file

| File | Description |
|---|---|
| `Modules/_np_multiarray_umath_builtin.c` | numpy core builtin shim |
| `Modules/_pd_*_builtin.c` (44 files) | pandas builtin shims for _libs, tslibs, window |
| `Modules/_pil_imaging{,math,morph}_builtin.c` | Pillow builtin shims |
| `Modules/_rf_*_builtin.c` (7 files) | rapidfuzz builtin shims |
| `.nanvix/numpy_mod.py` | Generates numpy Setup.local lines; defines (uncalled) staging helper |
| `.nanvix/pandas_mod.py` | Generates pandas Setup.local lines for 44 modules |
| `.nanvix/pillow.py` | Generates Pillow Setup.local lines |
| `.nanvix/rapidfuzz.py` | Generates rapidfuzz Setup.local lines and (uncalled) staging + Python shim writer |
| `.nanvix/wordcloud.py` | Generates wordcloud Setup.local entry referencing a missing C shim; (uncalled) staging helper |
| `.nanvix/lxml.py` | Refactored to expose generate_setup_local_lines |
| `.nanvix/build.py` | Aggregates Setup.local lines from all extension helpers |
| `.nanvix/.gitignore` | Drops nanvix.lock from ignored entries |
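
To make the aggregation step and the "missing C shim" finding concrete, here is a hedged sketch of how a combined generator might concatenate each helper's lines and then check that every referenced `.c` source exists. The wordcloud template is quoted from the `.nanvix/wordcloud.py` excerpt in this review; the function bodies, the names `combine_setup_local` and `missing_shim_sources`, and the assumption that bare source names resolve against `Modules/` are illustrative and not taken from `.nanvix/build.py`.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Template quoted from the .nanvix/wordcloud.py excerpt in this review.
_WORDCLOUD_LINES = """\
# wordcloud Cython extension module (statically linked via pre-built archive).
_wc_query_integral_image _wc_query_integral_image_builtin.c -L{sysroot}/lib -lquery_integral_image
"""


def wordcloud_lines(sysroot: Path) -> str:
    """Assumed helper body: substitute the sysroot into the template."""
    return _WORDCLOUD_LINES.format(sysroot=sysroot)


def _entries(text: str) -> List[List[str]]:
    """Split Setup.local text into token lists, skipping blanks and comments."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rows.append(stripped.split())
    return rows


def combine_setup_local(
    sysroot: Path, generators: Iterable[Callable[[Path], str]]
) -> str:
    """Concatenate each helper's lines, rejecting duplicate module names."""
    seen = set()
    sections = []
    for generate in generators:
        text = generate(sysroot).strip("\n")
        for tokens in _entries(text):
            if tokens[0] in seen:
                raise ValueError(f"duplicate builtin module name: {tokens[0]}")
            seen.add(tokens[0])
        if text:
            sections.append(text)
    return "\n\n".join(sections) + "\n"


def missing_shim_sources(setup_local: str, modules_dir: Path) -> List[str]:
    """List .c files named in the text that are absent from modules_dir."""
    missing = []
    for tokens in _entries(setup_local):
        for token in tokens[1:]:
            if token.endswith(".c") and not (modules_dir / token).is_file():
                missing.append(token)
    return missing


if __name__ == "__main__":
    combined = combine_setup_local(Path("sysroot"), [wordcloud_lines])
    print(combined, end="")
    print("missing:", missing_shim_sources(combined, Path("Modules")))
```

Running a check like `missing_shim_sources` before invoking the CPython build would surface an entry such as the wordcloud one early, instead of as a compile error.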

Copilot's findings

  • Files reviewed: 63/63 changed files
  • Comments generated: 3

Comment thread .nanvix/wordcloud.py
Comment on lines +14 to +22

_SETUP_LOCAL_LINES = """\
# wordcloud Cython extension module (statically linked via pre-built archive).
_wc_query_integral_image _wc_query_integral_image_builtin.c -L{sysroot}/lib -lquery_integral_image
"""


def generate_setup_local_lines(sysroot: Path) -> str:
    """Return Setup.local lines for wordcloud modules."""

Comment thread .nanvix/rapidfuzz.py
Comment on lines +35 to +62
def stage_rapidfuzz_runtime(repo_root: Path, sysroot: Path) -> None:
    """Copy rapidfuzz Python files from buildroot into the sysroot.

    Looks for rapidfuzz in ``.nanvix/buildroot/python-packages/rapidfuzz/``.
    Then writes Python shim modules that bridge the flat builtin names
    to the expected ``rapidfuzz.*`` / ``rapidfuzz.distance.*`` import paths.
    """
    rf_src = repo_root / ".nanvix" / "buildroot" / "python-packages" / "rapidfuzz"
    if not rf_src.is_dir():
        print(
            f"[rapidfuzz] Python package not found at {rf_src}; "
            "skipping runtime staging."
        )
        return

    py_lib = sysroot / "lib" / config.PYTHON_LIB_DIR
    if not py_lib.is_dir():
        raise RuntimeError(f"Python runtime library directory is missing: {py_lib}")

    dst = py_lib / "rapidfuzz"
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(rf_src, dst)

    # Write Python shims that bridge flat builtin names to package paths
    _write_shims(dst)

    print(f"[rapidfuzz] Staged {rf_src} -> {dst}")
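
The body of `_write_shims` is not part of the excerpt above. As a sketch of one way such Python shims could bridge a flat builtin name to a package import path, the example below writes a small module that replaces itself in `sys.modules` with the builtin. The function name `write_shims`, the mapping, and the flat names `_rf_fuzz_cpp` and `_rf_distance_metrics_cpp` are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical mapping: package-relative module path -> flat builtin name.
EXAMPLE_SHIMS = {
    "fuzz_cpp": "_rf_fuzz_cpp",
    "distance/metrics_cpp": "_rf_distance_metrics_cpp",
}

# Body written into each shim file. On import, the shim swaps its own
# sys.modules entry for the flat builtin module.
_SHIM_BODY = '''\
"""Auto-generated shim for the flat builtin module {flat!r}."""
import importlib
import sys

sys.modules[__name__] = importlib.import_module({flat!r})
'''


def write_shims(pkg_dir: Path, shims: Dict[str, str]) -> List[Path]:
    """Write one shim .py file per mapping entry and return the paths."""
    written = []
    for rel, flat in sorted(shims.items()):
        target = pkg_dir / f"{rel}.py"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(_SHIM_BODY.format(flat=flat), encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for path in write_shims(Path(tmp) / "rapidfuzz", EXAMPLE_SHIMS):
            print(path.relative_to(tmp))
```

Replacing the `sys.modules` entry means `import rapidfuzz.fuzz_cpp` would hand back the builtin module object itself, so no attribute forwarding is needed; a plain `from <flat> import *` shim is an alternative that keeps the shim module distinct.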

Comment thread .nanvix/.gitignore
black.toml
env.json
nanvix.lock
pyrightconfig.json
…fuzz, wordcloud

Add builtin shim modules and build infrastructure for cross-compiling
Python C extensions as statically-linked modules for Nanvix (i686):

- numpy: 1 module (_np_multiarray_umath)
- pandas: 44 modules (_pd_* covering _libs, tslibs, window, parsers, etc.)
- Pillow: 3 modules (_pil_imaging, _pil_imagingmath, _pil_imagingmorph)
- rapidfuzz: 6 modules (_rf_* covering fuzz, distance metrics, etc.)
- wordcloud: uses existing lxml build infrastructure

Build helpers added:
- numpy_mod.py: numpy extension build configuration
- pandas_mod.py: pandas extension build configuration
- pillow.py: Pillow extension build configuration
- rapidfuzz.py: rapidfuzz extension build configuration
- wordcloud.py: wordcloud extension build configuration

Each C extension is wrapped in a _*_builtin.c shim that bridges
the PyInit function to a flat builtin name for CPython's inittab.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ppenna ppenna force-pushed the feature/native-extensions branch from e9aec55 to 3560d6b Compare May 14, 2026 15:21
2 participants