Skip to content

chore: upgrade to Python 3.14 and modernize dependencies#1103

Open
PascalRepond wants to merge 1 commit intorero:stagingfrom
PascalRepond:rep-py314
Open

chore: upgrade to Python 3.14 and modernize dependencies#1103
PascalRepond wants to merge 1 commit intorero:stagingfrom
PascalRepond:rep-py314

Conversation

@PascalRepond
Copy link
Copy Markdown
Contributor

@PascalRepond PascalRepond commented Apr 28, 2026

  • Bump runtime to Python 3.14 (Docker base, CI, INSTALL, pyproject) and Node to 24
  • Loosen Invenio upper bounds and refresh uv.lock
  • Replace pysftp/paramiko<4 with sftpretty
  • Drop pytz in favor of stdlib zoneinfo / datetime.UTC
  • Replace deprecated datetime.utcnow() calls
  • Remove now-unneeded shims (setuptools<82, mock, appnope)
  • Add a CI concurrency group to cancel superseded runs
  • Refresh pip-audit exception list
  • Drop the obsolete version key from docker-compose.full.yml
  • Add a CLAUDE.md guide for AI-assisted development
  • Bump external services (PostgreSQL 17, RabbitMQ 4.3, Grobid 0.9.0)

@PascalRepond PascalRepond force-pushed the rep-py314 branch 2 times, most recently from e808fc4 to 86766b3 Compare April 28, 2026 11:42
@PascalRepond PascalRepond marked this pull request as ready for review April 28, 2026 11:53
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

Walkthrough

This PR updates the SONAR project to Python 3.14 and Node.js 24, upgrades CI actions and Docker base images accordingly, and modernizes datetime handling by removing the pytz dependency in favor of Python's standard library datetime.UTC and zoneinfo.ZoneInfo. It adds a new runtime patch module that wraps Marshmallow schema serialization methods with context and result handling, enhances Flask-Security integration with Flask-Principal role-based needs, switches FTP functionality from pysftp to sftpretty, significantly updates pyproject.toml dependencies, and includes a new developer guide (CLAUDE.md).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Additional Notes

Risk areas requiring careful attention:

  • pyproject.toml: Multiple dependency constraint changes with removed upper bounds and bumped minimum versions across Invenio packages; verify no version conflicts or incompatibilities arise
  • sonar/patches.py: New monkey-patching of Marshmallow BaseSchema methods with context wrapping; requires understanding of potential side effects and interaction with existing serialization flows
  • Exception handling syntax: Files like sonar/modules/shibboleth_authenticator/views/client.py change from except (ValueError, BadData): to except ValueError, BadData: — verify this syntax is compatible with the target Python 3.14
  • Timezone migration: Widespread replacement of pytz with datetime.UTC and zoneinfo.ZoneInfo; confirm all conversion patterns preserve intended behavior, especially in serialization and localization contexts
  • sonar/ext.py: New signal handler registration on identity_loaded event could affect authentication/authorization flows — test with various user role configurations
  • FTP dependency swap: Change from pysftp to sftpretty may have API or behavior differences; verify SNLRepository integration still functions correctly
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main objectives: upgrading Python to 3.14 and modernizing dependencies. It accurately reflects the primary changes across the entire changeset.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is highly specific and directly related to the changeset, detailing version upgrades, dependency modernization, and infrastructure updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
sonar/patches.py (2)

15-48: Add focused regression tests for this monkey patch.

This runtime patch changes core dump/load behavior; please add tests for nested schema calls with and without context, ensuring parent context is preserved after child serialization.

Based on learnings: Follow test-driven development methodology with tests accompanying each commit ensuring functionality works as intended.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sonar/patches.py` around lines 15 - 48, Add focused regression tests that
exercise the _wrap monkey-patch around Schema methods to ensure parent context
is preserved when nested schemas call dump/load with and without an explicit
context: write tests that call BaseSchema.dump/BaseSchema.dumps and
BaseSchema.load/BaseSchema.loads via nested Schema instances (using Schema.dump
and Schema.load behaviors) where a parent schema sets context_schema keys, a
Nested child calls schema.dump() internally both with context=None and with an
explicit context, and assert that after child serialization the parent's
context_schema values remain unchanged; include cases that trigger the
LookupError branch and the context override branch, and verify result_wrapper
behavior is preserved.

27-29: Use explicit None handling instead of or {} for context.

kwargs.pop("context") or {} also rewrites valid falsy values (e.g., an empty custom mapping). Prefer only defaulting on None.

♻️ Proposed change
-        if "context" in kwargs:
-            context = kwargs.pop("context") or {}
+        if "context" in kwargs:
+            context = kwargs.pop("context")
+            if context is None:
+                context = {}
             token = context_schema.set(context)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sonar/patches.py` around lines 27 - 29, Replace the current null-coalescing
pattern kwargs.pop("context") or {} with explicit None handling: assign the
popped value to a temporary (e.g., ctx = kwargs.pop("context")), then set
context = {} only if ctx is None (otherwise keep ctx, even if falsy), and pass
that context into context_schema.set(context); update the lines referencing
kwargs.pop("context") and context_schema.set(context) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Around line 5-8: Documentation mismatch: CLAUDE.md lists "Elasticsearch 7" but
INSTALL.md still instructs contributors that "docker compose up" starts
Elasticsearch 6, causing version confusion; pick the intended target (preferably
Elasticsearch 7), update the version string wherever it appears (the
"Elasticsearch 7" phrase in CLAUDE.md or the "docker compose up" /
docker-compose service image tags referenced in INSTALL.md) so both files match,
and if switching INSTALL.md to 7 update the docker-compose service image tag(s)
and any mapping/client compatibility notes referenced in INSTALL.md to use the
Elasticsearch 7 image and settings.

In `@INSTALL.md`:
- Around line 38-42: Remove the duplicated "cd sonar" in the setup flow: ensure
the initial snippet that runs the installer and enters the repo (the block with
"curl -LsSf ... | sh" and "cd sonar") is the only place that changes directory,
and delete or change the later bootstrap section's redundant "cd sonar" so
readers are not instructed to enter sonar twice; update the bootstrap
instructions to assume the user is already in the sonar directory or add a
comment clarifying when to run "cd sonar" if you choose to keep it there.

In `@pyproject.toml`:
- Around line 80-81: Remove the restrictive urllib3 pin "urllib3<2.0.0" and
instead require a fixed-safe range that includes the security fixes (e.g.,
"urllib3>=2.6.3,<3.0.0") so the five CVEs are not suppressed; if the
Elasticsearch client version (elasticsearch 7.13.4) prevents upgrading urllib3,
update the Elasticsearch client dependency to a maintained version that supports
urllib3 >=2.6.3, and remove the suppressions in scripts/test (lines referencing
urllib3 suppressions) once the dependency is upgraded.

In `@sonar/jsonschemas/json_schema_base.py`:
- Line 62: The except clause in json_schema_base.py currently uses Python 2
syntax ("except JSONSchemaNotFound, AttributeError:"); change it to valid Python
3 syntax by handling multiple exceptions as a tuple (e.g., "except
(JSONSchemaNotFound, AttributeError):") or split into separate except blocks;
update the except that wraps the JSON schema loading logic (refer to the
JSONSchemaNotFound symbol and the surrounding try/except in json_schema_base.py)
so the module imports under Python 3.14 without syntax errors.

In `@sonar/modules/shibboleth_authenticator/views/client.py`:
- Line 117: Update the except clause that uses Python 2 syntax to a Python
3-compatible tuple form: replace the wrong "except ValueError, BadData:" usage
with an exception tuple handling both ValueError and BadData (i.e., except
(ValueError, BadData):) in the exception block in client.py so the Shibboleth
callback code (the try/except around the parsing/validation logic that
references BadData) will parse under Python 3 while preserving the existing
error handling behavior.

---

Nitpick comments:
In `@sonar/patches.py`:
- Around line 15-48: Add focused regression tests that exercise the _wrap
monkey-patch around Schema methods to ensure parent context is preserved when
nested schemas call dump/load with and without an explicit context: write tests
that call BaseSchema.dump/BaseSchema.dumps and BaseSchema.load/BaseSchema.loads
via nested Schema instances (using Schema.dump and Schema.load behaviors) where
a parent schema sets context_schema keys, a Nested child calls schema.dump()
internally both with context=None and with an explicit context, and assert that
after child serialization the parent's context_schema values remain unchanged;
include cases that trigger the LookupError branch and the context override
branch, and verify result_wrapper behavior is preserved.
- Around line 27-29: Replace the current null-coalescing pattern
kwargs.pop("context") or {} with explicit None handling: assign the popped value
to a temporary (e.g., ctx = kwargs.pop("context")), then set context = {} only
if ctx is None (otherwise keep ctx, even if falsy), and pass that context into
context_schema.set(context); update the lines referencing kwargs.pop("context")
and context_schema.set(context) accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a3e88095-c574-458b-af3f-d2fe3725d3db

📥 Commits

Reviewing files that changed from the base of the PR and between fa3bacf and 86766b3.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • .github/workflows/continuous-integration-test.yml
  • .gitignore
  • CLAUDE.md
  • Dockerfile.base
  • INSTALL.md
  • docker-compose.full.yml
  • pyproject.toml
  • scripts/test
  • sonar/__init__.py
  • sonar/ext.py
  • sonar/jsonschemas/json_schema_base.py
  • sonar/modules/deposits/api.py
  • sonar/modules/documents/dumpers.py
  • sonar/modules/documents/urn.py
  • sonar/modules/shibboleth_authenticator/views/client.py
  • sonar/modules/validation/api.py
  • sonar/patches.py
  • sonar/snl/ftp/__init__.py
  • sonar/stats_event_builders.py
  • sonar/theme/views.py
  • tests/ui/test_views.py
💤 Files with no reviewable changes (1)
  • docker-compose.full.yml

Comment thread CLAUDE.md
Comment thread INSTALL.md
Comment thread pyproject.toml
Comment thread sonar/jsonschemas/json_schema_base.py
Comment thread sonar/modules/shibboleth_authenticator/views/client.py
@PascalRepond
Copy link
Copy Markdown
Contributor Author

Waiting for a bugfix in invenio-rest for the tests to pass: inveniosoftware/invenio-rest#157

- Bump runtime to Python 3.14 (Docker base, CI, INSTALL, pyproject)
  and Node to 24
- Loosen Invenio upper bounds and refresh uv.lock
- Replace `pysftp`/`paramiko<4` with `sftpretty`
- Drop `pytz` in favor of stdlib `zoneinfo` / `datetime.UTC`
- Replace deprecated `datetime.utcnow()` calls
- Remove now-unneeded shims (`setuptools<82`, `mock`, `appnope`)
- Add a CI concurrency group to cancel superseded runs
- Refresh pip-audit exception list
- Drop the obsolete `version` key from `docker-compose.full.yml`
- Add a CLAUDE.md guide for AI-assisted development
- Bump external services (PostgreSQL 17, RabbitMQ 4.3, Grobid 0.9.0)

Co-Authored-by: Pascal Repond <pascal.repond@rero.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant