Skip to content

feat: add native C extension integration and test suite for 23 Python packages#211

Open
ppenna wants to merge 1 commit into
mainfrom
feature/native-extensions
Open

feat: add native C extension integration and test suite for 23 Python packages#211
ppenna wants to merge 1 commit into
mainfrom
feature/native-extensions

Conversation

@ppenna
Copy link
Copy Markdown
Contributor

@ppenna ppenna commented May 14, 2026

Summary

Integrate cross-compiled C extensions (numpy, pandas, Pillow, rapidfuzz, wordcloud) into the Nanvix Python build system and add comprehensive functional tests for all 23 target packages.

Changes

Build system (.nanvix/z.py)

  • Add _install_pandas_python() to copy pandas 3.x source into sysroot
  • Invoke pandas/numpy Python file installation during build and test phases

Patches for runtime compatibility

  • numpy/_init_.py: meta-path finder redirecting dotted imports to flat builtins
  • pandas/_init_.py: meta-path finder resolving Cython circular imports (44 modules)
  • pandas/api/, pandas/core/: compatibility shims for 3.x source
  • PIL/: Nanvix compatibility for Image, ImageFont, ImageDraw, ImageFilter
  • matplotlib/, scipy/, cryptography/, psutil/, pypdfium2/: stub modules (graceful degradation)

Functional tests (tests/func/)

  • test_118 through test_140: import/smoke tests for all 23 packages
  • test_150-154: benchmarks (hello, numpy import/compute, pandas import/compute)

Dependencies

  • Updated site-packages-extra.txt with all pure-Python packages

Test results

Results: 39 passed, 1 failed (plotnine — needs matplotlib C modules), 1 skipped (seaborn)

Related PRs

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends Nanvix’s Python build/test pipeline to support a broader set of packages by combining (a) copied Python “wrapper” trees for statically-linked native extensions (e.g., lxml/rapidfuzz/wordcloud/Pillow/numpy/pandas) with (b) pure-Python shim packages under patches/, and adds functional smoke/benchmark tests for the newly targeted ecosystem packages.

Changes:

  • Expand .nanvix/z.py to install shim packages from patches/ and to copy Python wrapper files for several statically-linked native extensions into site-packages during build/test.
  • Add functional tests for many additional packages (imports + minimal API checks) plus benchmark-style tests for hello/numpy/pandas.
  • Update requirements/site-packages-extra.txt to include additional pure-Python dependencies needed by the new tests/packages.
Show a summary per file
File Description
.nanvix/z.py Installs shim packages from patches/ and copies Python wrappers for statically-linked extension packages into the sysroot during build/test.
.nanvix/.gitignore Updates ignored build artifacts under .nanvix/.
requirements/site-packages-extra.txt Adds/pins extra pure-Python dependencies for document/media/visualization packages.
tests/func/test_118_numpy.py Functional smoke test for native NumPy availability/basic ops.
tests/func/test_120_scipy.py Functional smoke test for SciPy shim surface (stats).
tests/func/test_121_matplotlib.py Functional smoke test for matplotlib shim surface (pyplot basics).
tests/func/test_122_pillow.py Functional smoke test for statically-linked Pillow C extension + basic Image.new.
tests/func/test_123_seaborn.py Functional smoke test for seaborn import with graceful degradation.
tests/func/test_124_plotnine.py Functional smoke test for plotnine import/version.
tests/func/test_125_altair.py Functional smoke test for altair import/version.
tests/func/test_126_psutil.py Functional smoke test for psutil shim API surface.
tests/func/test_127_cryptography.py Functional smoke test for cryptography shim (Fernet + hashes).
tests/func/test_128_rapidfuzz.py Functional smoke test for rapidfuzz (C++ backend when available) and basic functions.
tests/func/test_129_wordcloud.py Functional smoke test for wordcloud import + native extension presence.
tests/func/test_130_pypdfium2.py Functional smoke test for pypdfium2 shim API surface.
tests/func/test_131_pydantic.py Functional smoke test for pydantic v1 model validation.
tests/func/test_132_pdfminer_six.py Functional smoke test for pdfminer.six import/high-level APIs.
tests/func/test_133_pdfplumber.py Functional smoke test for pdfplumber import/version.
tests/func/test_134_python_docx.py Functional smoke test for python-docx with lxml dependency gating.
tests/func/test_135_reportlab.py Functional smoke test for reportlab constants.
tests/func/test_136_pyperclip.py Functional smoke test for pyperclip stub import/copy/paste availability.
tests/func/test_137_moviepy.py Functional smoke test for moviepy import-only scenario.
tests/func/test_138_ffmpeg_python.py Functional smoke test for ffmpeg-python import-only scenario.
tests/func/test_139_pytesseract.py Functional smoke test for pytesseract import-only scenario.
tests/func/test_140_lxml.py Functional smoke test for lxml etree parsing/building with statically-linked C modules.
tests/func/test_150_bench_hello.py Benchmark-style “hello” test emitting PASS token.
tests/func/test_151_bench_numpy_import.py Benchmark-style NumPy import test emitting PASS token.
tests/func/test_152_bench_numpy_compute.py Benchmark-style NumPy compute test emitting PASS token.
tests/func/test_153_bench_pandas_import.py Benchmark-style pandas import test emitting PASS token.
tests/func/test_154_bench_pandas_compute.py Benchmark-style pandas compute test emitting PASS token.
patches/scipy/__init__.py Adds a SciPy shim package root with a stubbed version.
patches/scipy/stats.py Adds SciPy stats API stubs used by downstream packages.
patches/scipy/optimize.py Adds SciPy optimize API stubs.
patches/scipy/interpolate.py Adds SciPy interpolate API stubs.
patches/scipy/spatial.py Adds SciPy spatial API stubs.
patches/pypdfium2/__init__.py Adds a pypdfium2 shim to satisfy import-time expectations.
patches/psutil/__init__.py Adds a psutil shim with static system/process info.
patches/PIL/ImageFont.py Updates the PIL shim font metrics implementation and adds TransposedFont.
patches/PIL/ImageFilter.py Adds a PIL ImageFilter stub module.
patches/PIL/ImageDraw.py Adds a PIL ImageDraw stub module.
patches/PIL/Image.py Adds PIL constants expected by downstream code.
patches/pandas/__init__.py Adds a minimal pandas shim (DataFrame/Series + misc compat).
patches/pandas/api/__init__.py Adds pandas api stub package root.
patches/pandas/api/types.py Adds pandas api.types stubs used by downstream packages.
patches/pandas/core/__init__.py Adds pandas core stub package root.
patches/pandas/core/groupby/__init__.py Adds a pandas groupby stub export for DataFrameGroupBy.
patches/numpy/__init__.py Adds a large pure-Python numpy shim implementing many common APIs.
patches/matplotlib/__init__.py Adds a matplotlib shim package root (backend config + rcParams + submodule exposure).
patches/matplotlib/pyplot.py Adds a no-op matplotlib.pyplot shim with Figure/Axes + many stubbed operations.
patches/matplotlib/transforms.py Adds matplotlib transforms stubs needed for import-time wiring.
patches/matplotlib/ticker.py Adds matplotlib ticker formatter/locator stubs.
patches/matplotlib/scale.py Adds matplotlib scale stubs.
patches/matplotlib/path.py Adds matplotlib path stubs.
patches/matplotlib/patches.py Adds matplotlib patches stubs.
patches/matplotlib/markers.py Adds matplotlib markers stubs.
patches/matplotlib/lines.py Adds matplotlib lines stubs.
patches/matplotlib/legend.py Adds matplotlib legend stubs.
patches/matplotlib/gridspec.py Adds matplotlib gridspec stubs.
patches/matplotlib/figure.py Adds matplotlib figure stub.
patches/matplotlib/dates.py Adds matplotlib dates stubs.
patches/matplotlib/colors.py Adds matplotlib colors stubs.
patches/matplotlib/collections.py Adds matplotlib collections stubs.
patches/matplotlib/cm.py Adds matplotlib cm stubs/registry.
patches/matplotlib/cbook.py Adds matplotlib cbook utility stubs.
patches/matplotlib/axis.py Adds matplotlib axis stubs.
patches/matplotlib/axes.py Adds matplotlib axes stub.
patches/matplotlib/artist.py Adds matplotlib artist stubs.
patches/cryptography/__init__.py Adds cryptography shim package root with basic exceptions/version.
patches/cryptography/fernet.py Adds a minimal Fernet implementation for import compatibility.
patches/cryptography/hazmat/__init__.py Adds cryptography.hazmat stub package root.
patches/cryptography/hazmat/backends/__init__.py Adds backend stubs (default backend).
patches/cryptography/hazmat/primitives/__init__.py Adds primitive/hash stubs (Hash wrapper).
patches/cryptography/hazmat/primitives/ciphers/__init__.py Adds cipher/algorithm/mode stubs.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 73/74 changed files
  • Comments generated: 16

Comment thread .nanvix/.gitignore
Comment on lines 4 to 5
sysroot/
buildroot/
Comment thread .nanvix/z.py
Comment on lines 535 to +553
def _install_pil_shim(self, site_pkg: Path) -> None:
"""Copy the pure-Python PIL shim into site-packages.
"""Copy all pure-Python shim packages from patches/ into site-packages.

Replaces Pillow's C extension with lightweight header-only
parsing that python-pptx needs for image handling.
Each first-level directory under patches/ (PIL, numpy, pandas, etc.)
is treated as a shim package that replaces a C extension dependency
with a lightweight pure-Python stub.
"""
pil_src = self.repo_root / "patches" / "PIL"
pil_dst = site_pkg / "PIL"
if not pil_src.is_dir():
log.warning("patches/PIL not found; skipping PIL shim installation")
patches_dir = self.repo_root / "patches"
if not patches_dir.is_dir():
log.warning("patches/ not found; skipping shim installation")
return
if pil_dst.exists():
shutil.rmtree(pil_dst)
shutil.copytree(pil_src, pil_dst)
log.info(f"installed PIL shim into {pil_dst}")
for shim_src in sorted(patches_dir.iterdir()):
if not shim_src.is_dir() or shim_src.name.startswith("."):
continue
shim_dst = site_pkg / shim_src.name
if shim_dst.exists():
shutil.rmtree(shim_dst)
shutil.copytree(shim_src, shim_dst)
log.info(f"installed {shim_src.name} shim into {shim_dst}")
Comment thread .nanvix/z.py
Comment on lines +690 to +705
# Write Python shim with fallback for when numpy shim lacks buffer protocol
(dst / "query_integral_image.py").write_text(
'''"""Bridge to native C extension with pure-Python fallback."""
try:
from _wc_query_integral_image import query_integral_image as _c_impl
except ImportError:
_c_impl = None


def query_integral_image(integral_image, size_x, size_y, random_state):
"""Query integral image for free rectangles.

Uses the native C extension when numpy arrays support the buffer
protocol (i.e. real numpy). Falls back to pure Python otherwise.
"""
if _c_impl is not None:
Comment thread .nanvix/z.py Outdated
Comment on lines +797 to +799
# Remove the pure-Python numpy shim if it exists (replaced by native)
patches_np = site_pkg / "numpy"
# The bridge shim should already be in place from the buildroot copy
@@ -0,0 +1 @@
print("bench_hello: PASS")
Comment on lines +10 to +12
# Basic import verification
assert hasattr(pdfminer, '__version__') or True # pdfminer.six may not set __version__

Comment thread patches/PIL/ImageFont.py
Comment on lines 13 to +31
@@ -22,8 +23,12 @@ def getbbox(self, text, mode="", direction="", features=None, language=None, anc
return (0, 0, int(w), int(self.size * 1.2))

def getsize(self, text, *args, **kwargs):
w = self.getlength(text)
return (int(w), int(self.size * 1.2))
w = int(self.getlength(text))
h = int(self.size * 1.2)
return (w, h), (0, 0)

def getmetrics(self):
return (int(self.size * 0.8), int(self.size * 0.2))
Comment thread patches/numpy/__init__.py
Comment on lines +655 to +671
def where(condition, x=None, y=None):
condition = asarray(condition)
if x is None and y is None:
indices = [i for i, v in enumerate(condition._data) if v]
return (array(indices),)
x = asarray(x)
y = asarray(y)
result = ndarray(condition.shape, x.dtype)
result._data = [xv if c else yv for c, xv, yv in zip(condition._data, x._data, y._data)]
return result

def concatenate(arrays, axis=0):
data = []
for a in arrays:
a = asarray(a)
data.extend(a._data)
return array(data)
Comment thread patches/scipy/stats.py
Comment on lines +103 to +107
def mode(a, axis=0):
class _Mode:
mode = a[0] if a else 0
count = 1
return _Mode()
Comment on lines +42 to +51
try:
data = base64.urlsafe_b64decode(token)
except Exception:
raise InvalidToken("Invalid base64")
if not data or data[0:1] != b"\x80":
raise InvalidToken("Invalid version")
# Extract payload (simplified)
payload = data[1:-32]
if len(payload) < 24:
raise InvalidToken("Token too short")
… packages

Integrate cross-compiled C extensions (numpy, pandas, Pillow, rapidfuzz,
wordcloud) into the Nanvix Python build system and add comprehensive
functional tests for all 23 target packages.

Build system changes (.nanvix/z.py):
- Add _install_pandas_python() to copy pandas source into sysroot
- Invoke pandas/numpy Python file installation during build and test
- Update .gitignore for build artifacts

Patches for packages requiring runtime fixes:
- numpy/__init__.py: meta-path finder for flat builtins
- pandas/__init__.py: meta-path finder resolving Cython circular imports
- pandas/api/, pandas/core/: compatibility shims for 3.x source
- PIL/Image.py, ImageFont.py, ImageDraw.py, ImageFilter.py: Nanvix compat
- matplotlib/, scipy/, cryptography/, psutil/, pypdfium2/: stub modules

Functional tests (tests/func/):
- test_118 through test_140: import/smoke tests for all 23 packages
- test_150-154: benchmarks (hello, numpy import/compute, pandas import/compute)

Requirements:
- Updated site-packages-extra.txt with all pure-Python dependencies

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ppenna ppenna force-pushed the feature/native-extensions branch from 18c8c90 to c67324a Compare May 14, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants