feat(worker): ingest_callback contract for API-triggered sync (INT-312) #432
ldrozdz93 wants to merge 19 commits into
Introduces IngestError / IngestUnavailable / IngestRejected to be raised by the worker-supplied ingest callback when pipeline writes fail. The hierarchy is the contract used by integrations to differentiate transient vs. permanent failures. Refs INT-386, INT-312 See: https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/worker-backend-ingest-callback.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
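A minimal sketch of how an integration might use this hierarchy to separate transient from permanent failures. It assumes both subclasses derive directly from `IngestError`, with `IngestRejected` permanent and `IngestUnavailable` transient, as the PR description states. The class bodies and the `should_retry` helper are illustrative assumptions, not the worker's actual code.

```python
class IngestError(Exception):
    """Base class for failures raised by the worker-supplied ingest callback."""


class IngestRejected(IngestError):
    """Permanent failure: resending the same payload is not expected to help."""


class IngestUnavailable(IngestError):
    """Transient failure: the caller may retry later."""


def should_retry(exc: Exception) -> bool:
    """Hypothetical helper: retry only failures classified as transient."""
    return isinstance(exc, IngestUnavailable)
```

Catching the base `IngestError` covers both subclasses, so a caller that does not care about the distinction can handle all ingest failures in one place.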
Extends Backend.__init__ with keyword-only ingest_callback, policy, plus open **kwargs (forward-compat door). PolicyRunner.setup now builds a per-policy ingest callback closure and passes both to backend construction. The closure chunks+ingests entities (mirroring run()), tracks each off-schedule emission as a pseudo-run in RunStore, and translates response/transport errors into the new IngestError subclasses. run() is unchanged and remains single-shot per cron tick. Refs INT-386, INT-312 See: https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/worker-backend-ingest-callback.md https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/policy-distribution-refinement.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Bind IngestError in `except ... as exc` instead of sys.exc_info() and drop the now-unused `import sys`.
- Log unexpected exceptions via `logger.exception` before translating them to IngestUnavailable, so a worker-side programming bug isn't silently masked as a transient failure.
- Annotate the closure signature (`error: Exception | None = None`, `-> None`) and rename `**kw` → `**kwargs` to match the docstring and Backend.__init__ contract. Comment the forward-compat door.
- Guard against an integration calling the callback before the diode client is initialised (e.g. from inside `Backend.setup()`): record a FAILED pseudo-run and raise IngestUnavailable with a clear message instead of letting an AttributeError surface as an unexpected exception.
- Extract `_send_entities` so `_build_ingest_callback` stays under the C901 complexity limit.
- Add a unit test for the new guard.

Refs INT-386, INT-312
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k helper

Addresses 6 reviewer comments on PR #415:
- Init-order race (Copilot/Codex/jajeffries): the closure dereferences `self.run_store`, `self.metadata`, `self._diode_client`, none of which are attached until after `backend.setup()` returns. Replaces the partial `_diode_client is None` guard with a single `self._callback_ready` flag, flipped True at the end of `setup()` and back to False at the start of `stop()`. Single source of truth for "callback is safe to invoke".
- Pre-try work (Copilot): `list(entities)` and `apply_run_id_to_entities` now run INSIDE the try, so failures there reliably mark the pseudo-run FAILED instead of leaving it RUNNING. `entities_list = []` is pre-bound so `len()` is defined in the except handlers when `list(entities)` itself fails.
- Chunk-threshold constant (Copilot): extracts `MAX_INGEST_MESSAGE_BYTES = 3 * 1024 * 1024` at module scope; replaces both float literals in `_send_entities` and `run()`.
- Helper consistency (leoparente): `run()` now calls `_send_entities` instead of inlining the chunking branch. Helper returns the chunk count for the existing log line. Side effect: `run()` now raises `IngestRejected` (not `RuntimeError`) on response errors; outer `except Exception` catches it unchanged.

Backend.__init__ docstring documents the "do not invoke from __init__/setup()" contract.

Adds 4 new tests, renames one existing test to match the new guard:
- test_ingest_callback_raises_when_not_ready (renamed)
- test_ingest_callback_records_failure_on_apply_run_id_error
- test_ingest_callback_records_failure_on_iterable_error
- test_setup_sets_callback_ready_flag
- test_stop_clears_callback_ready_flag

ruff clean, pytest 114 passed.

Refs INT-386, INT-312
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…spect on construction

- ADR-0008: Drop policy= kwarg from Backend.__init__ (no consumer in v1).
- ADR-0009: Add **kwargs forward-compat door to Backend.run.
- ADR-0007: PolicyRunner uses inspect.signature to detect whether the backend class accepts ingest_callback. Legacy backends with __init__(self) continue to construct zero-arg with no coordinated upgrade.

49 tests pass including 3 new parametrized introspection cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
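The ADR-0007 introspection could look roughly like the sketch below. It is an assumption about the shape of the check, not the code from this commit: the helper names `accepts_ingest_callback` and `construct` and the two demo classes are hypothetical, and treating an open `**kwargs` as "accepts" is a choice made here for illustration.

```python
import inspect


def accepts_ingest_callback(backend_class: type) -> bool:
    """Return True if calling backend_class(ingest_callback=...) is valid."""
    params = inspect.signature(backend_class).parameters
    if "ingest_callback" in params:
        return True
    # An open **kwargs would also accept the keyword without raising.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class LegacyBackend:
    """Hypothetical legacy backend with a zero-arg constructor."""

    def __init__(self):
        pass


class ModernBackend:
    """Hypothetical backend following the new constructor contract."""

    def __init__(self, *, ingest_callback=None, **kwargs):
        self.ingest_callback = ingest_callback


def construct(backend_class: type, callback):
    """Pass the callback only to backends whose signature can take it."""
    if accepts_ingest_callback(backend_class):
        return backend_class(ingest_callback=callback)
    return backend_class()
```

Inspecting the class rather than `__init__` directly sidesteps `object.__init__`, whose own signature advertises `*args, **kwargs` even though instantiation with extra arguments fails.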
…setup callback, shared _execute_run, IngestError mapping)

- Delete _construct_backend signature introspection; construct backends directly (Backend.__init__ already accepts ingest_callback + **kwargs).
- Build + attach ingest_callback AFTER deps are assigned; drop the _callback_ready flag, its early-call guard, and the stop() reset.
- Extract _execute_run(client, produce_entities, *, source) shared by the scheduled run() and the ingest callback. The entity producer is invoked lazily INSIDE the run's try (after create_run), so a crashing eager backend is still recorded as a FAILED run rather than vanishing.
- Map the callback's unexpected-exception catch-all to base IngestError (not the retry-friendly IngestUnavailable). IngestUnavailable is retained in worker.exceptions (published contract consumed downstream).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
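The "producer inside the try" point is easiest to see in a small sketch. Everything here is a simplified stand-in: `FakeRunStore`, the status strings, and the `execute_run` signature are assumptions for illustration and differ from the real `_execute_run(client, produce_entities, *, source)`.

```python
from typing import Callable, Iterable


class FakeRunStore:
    """Hypothetical in-memory stand-in for the worker's RunStore."""

    def __init__(self) -> None:
        self.status = {}

    def create_run(self) -> int:
        run_id = len(self.status) + 1
        self.status[run_id] = "RUNNING"
        return run_id


def execute_run(
    store: FakeRunStore,
    produce_entities: Callable[[], Iterable],
    send: Callable[[list], None],
) -> int:
    # The run is created before any backend code executes.
    run_id = store.create_run()
    try:
        # The producer is called here, inside the try, so a backend that
        # crashes while producing still leaves a FAILED record behind.
        entities = list(produce_entities())
        send(entities)
    except Exception:
        store.status[run_id] = "FAILED"
        raise
    store.status[run_id] = "COMPLETED"
    return run_id
```

Had the caller evaluated the producer before calling `execute_run`, the exception would propagate before `create_run()` and no run would be recorded at all.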
Pull request overview
This PR adds worker-side support for an API-triggered ingest contract by introducing an ingest_callback hook on backends, an exception hierarchy for ingest failures, and refactoring PolicyRunner to reuse a shared run+ingest path for both scheduled runs and callback-triggered ingests.
Changes:
- Add `worker.exceptions` (`IngestError`, `IngestRejected`, `IngestUnavailable`) and unit tests for the hierarchy.
- Extend `Backend` to accept/store an optional `ingest_callback` plus forward-compat `**kwargs` (and allow `run(**kwargs)`).
- Refactor `PolicyRunner` to (a) attach an ingest callback after setup dependencies exist and (b) centralize run creation, ingest chunking, and run status updates in `_execute_run()`/`_send_entities()`; add tests for the new callback behavior.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| worker/worker/policy/runner.py | Adds ingest callback construction and shared run execution/ingest helpers; refactors scheduled run() to use them. |
| worker/worker/exceptions.py | Introduces ingestion-related exception hierarchy for consumers to classify failures. |
| worker/worker/backend.py | Adds ingest_callback + **kwargs forward-compat “doors” on backend construction and run(). |
| worker/tests/test_exceptions.py | Adds tests validating exception hierarchy and messaging. |
| worker/tests/test_backend.py | Adds tests for new Backend.__init__ and Backend.run(**kwargs) behavior. |
| worker/tests/policy/test_runner.py | Adds regression + new tests for callback ingestion behavior and refactored run semantics. |
- _build_ingest_callback: drop the unused policy_name parameter (it was always self.name; internals already use self.name), removing the attribution inconsistency the reviewer flagged.
- backend.py: docstring now states the worker constructs with no args and assigns ingest_callback afterwards (matching PolicyRunner.setup).
- exceptions.py: broaden the module docstring: the hierarchy is raised by worker ingestion in general (scheduled run() path + ingest callback), not just the callback.
- test_runner.py: fix the multi-chunk test comment to match the scenario (both chunks sent; the second chunk's response carries errors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…kend once with ingest_callback Backends can expose their identity via a `describe()` classmethod, so the worker reads metadata (name/app_name/app_version) WITHOUT constructing an instance, builds the ingest callback, and constructs the backend once with the callback already in hand — removing the post-construction attach step. Backends that only implement the instance `setup()` are still supported: the worker falls back to a throwaway instance to read their metadata, then builds the real callback-bearing instance. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
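A sketch of the describe()-first flow with the legacy fallback, under stated assumptions: the `Metadata` dataclass fields follow the name/app_name/app_version triple mentioned in the commit, while `backend_metadata`, `build_backend`, the `hasattr`-style detection, and both demo backends are hypothetical and may not match how `PolicyRunner` actually decides.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    name: str
    app_name: str
    app_version: str


class DescribedBackend:
    """Hypothetical backend exposing identity without instantiation."""

    instances = 0

    def __init__(self, *, ingest_callback=None, **kwargs):
        type(self).instances += 1
        self.ingest_callback = ingest_callback

    @classmethod
    def describe(cls) -> Metadata:
        return Metadata("described", "demo", "0.1")


class LegacyBackend:
    """Hypothetical backend that only implements the instance setup()."""

    instances = 0

    def __init__(self, *, ingest_callback=None, **kwargs):
        type(self).instances += 1
        self.ingest_callback = ingest_callback

    def setup(self) -> Metadata:
        return Metadata("legacy", "demo", "0.1")


def backend_metadata(backend_class) -> Metadata:
    describe = getattr(backend_class, "describe", None)
    if callable(describe):
        return describe()  # no instance constructed
    return backend_class().setup()  # legacy fallback: throwaway instance


def build_backend(backend_class, make_callback):
    metadata = backend_metadata(backend_class)
    # The callback is built from metadata first, then handed to the constructor.
    backend = backend_class(ingest_callback=make_callback(metadata))
    return metadata, backend
```

With this shape a described backend is instantiated once, while a legacy backend is instantiated twice: once to read metadata and once for the callback-bearing instance.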
…struction Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…GE_BYTES _send_entities now always calls create_message_chunks, which owns the gRPC message-size threshold (max_chunk_size_mb=3.0, a safe margin below the 4 MB ceiling) and returns a single chunk when the payload already fits. This removes the worker's duplicate of that constant and the estimate_message_size pre-check. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- @deprecated (PEP 702, typing_extensions) on Backend.setup() so IDEs and type checkers flag uses; calling the base method also warns at runtime.
- Backend.__init_subclass__ emits a DeprecationWarning at class-definition time when a subclass overrides setup() without implementing describe(), which is the signal integration authors see in their own test runs.
- PolicyRunner._backend_metadata logs an operator-facing warning when the legacy throwaway-setup() fallback is used, naming the backend class.
- typing-extensions declared as a runtime dependency.

The fallback (and these warnings) are scheduled for removal in worker v2.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
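The class-definition warning can be sketched with `__init_subclass__` alone. This is an illustrative reduction, assuming a base class whose `describe()` and `setup()` are placeholders; the warning text and the identity-based override checks are assumptions, and the PEP 702 `@deprecated` decorator is left out to keep the example dependency-free.

```python
import warnings


class Backend:
    """Simplified stand-in for the worker's Backend base class."""

    @classmethod
    def describe(cls):
        raise NotImplementedError

    def setup(self):
        raise NotImplementedError

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Compare against the base implementations to detect overrides.
        overrides_setup = cls.setup is not Backend.setup
        implements_describe = (
            cls.describe.__func__ is not Backend.describe.__func__
        )
        if overrides_setup and not implements_describe:
            warnings.warn(
                f"{cls.__name__} overrides setup() without describe(); "
                "setup() is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
```

Because the check runs when the subclass body finishes executing, the warning fires at import time of the integration module, before any instance is created.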
…resolve Source `project.version` was `1.0.0`. CI overwrites it at release time (worker-release.yaml runs `toml set --toml-path pyproject.toml project.version` and stamps `worker/version.py` separately), so the published wheel is unaffected. The placeholder only matters for **editable / local installs from source** — e.g. controller-integrations' `e2e_tests --use orb-agent --local-orb-worker`, which `pip install -e`s this checkout. pip reads `project.version` from source (`1.0.0`) and resolves it against backend-util's pin `netboxlabs-orb-worker>=1.3.0`; `1.0.0 < 1.3.0` triggers `ResolutionImpossible` and the install fails. Bumping the placeholder to `1.99.0.dev0` satisfies any reasonable `>=X.Y` floor (PEP 440 ordering: `1.99.0.dev0 >= 1.3.0` is true) while remaining an obvious not-a-real-release marker. The build still stamps the true release version on top of it, so nothing downstream of a real build changes.
…ducer only The debug line inherited develop's wording but, after the _execute_run consolidation, its timer wrapped run creation, chunking, client ingest and run-store writes too. _execute_run now measures exactly the producer call (list(produce_entities())) and reports it via a _RunOutcome NamedTuple, so run() logs a truthful number again. The backend_execution_latency metric is unchanged — it covered the full execution on develop and still does. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
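A minimal sketch of timing only the producer call and returning the measurement in a NamedTuple. The field names of `RunOutcome` and the `execute_run` signature are hypothetical; the commit names a `_RunOutcome` NamedTuple but does not list its fields.

```python
import time
from typing import Callable, Iterable, NamedTuple


class RunOutcome(NamedTuple):
    """Hypothetical result type; field names are assumptions."""

    entity_count: int
    producer_seconds: float


def execute_run(
    produce_entities: Callable[[], Iterable],
    send: Callable[[list], None],
) -> RunOutcome:
    start = time.perf_counter()
    entities = list(produce_entities())  # only this call is timed
    producer_seconds = time.perf_counter() - start
    send(entities)  # chunking/ingest time is deliberately excluded
    return RunOutcome(len(entities), producer_seconds)
```

Stopping the timer before `send` is what keeps the logged number limited to backend work, independent of how long ingest or run-store writes take.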
The 2026-06-10 GitHub incident failed the CodeQL upload and coverage-comment steps on the previous push; CodeQL default-setup runs cannot be re-run via the API, so an empty commit re-fires them. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d.run Activates the forward-compat door Backend.run already declares: when a backend's run() signature accepts **kwargs, the worker passes source="scheduled" and the run-store run_id (the same id stamped on the produced entities). Legacy two-argument signatures are detected via inspect and keep getting the bare call, with an operator-facing warning — the same compat posture as the describe()/setup() fallback. Downstream, controller-integrations' backends and the @with_trigger_api wrapper already declare and forward **kwargs, so the context rides in without a coordinated upgrade. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Diode's ingester rejects an empty batch ("entities is empty"), so a run
that legitimately produces zero entities — e.g. an incremental sync with
no changes since its watermark — was recorded as FAILED. _send_entities
now returns without calling client.ingest when the list is empty, so both
the scheduled path and the ingest-callback path record a clean COMPLETED
run with entity_count=0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
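The empty-batch guard can be sketched as below. `RecordingClient` is a hypothetical stand-in for the Diode client, and the single-chunk list is a placeholder for the SDK's chunking, which is not reproduced here.

```python
class RecordingClient:
    """Hypothetical client that records each ingest call."""

    def __init__(self) -> None:
        self.calls = []

    def ingest(self, entities: list) -> None:
        if not entities:
            # Mirrors the rejection described in the commit message.
            raise ValueError("entities is empty")
        self.calls.append(list(entities))


def send_entities(client: RecordingClient, entities: list) -> int:
    """Send entities and return the number of chunks sent."""
    if not entities:
        return 0  # nothing to send: skip the client entirely
    chunks = [entities]  # placeholder for SDK-side chunking
    for chunk in chunks:
        client.ingest(entities=chunk)
    return len(chunks)
```

Returning before the client call lets the surrounding run finish normally with an entity count of zero.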
…o Backend.run" This reverts commit ac389b2deccd9837e9012575864ab48148889a39. Passing source/run_id into Backend.run is unrelated to this PR's scope; the **kwargs forward-compat door on run() stays, passive, for the future. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db8c0b2273
    # Every dependency the closure reads (run_store, metadata, _diode_client)
    # is now assigned, so the callback is built first and the backend is
    # constructed ONCE with it — no post-construction attach step.
    backend = backend_class(ingest_callback=self._build_ingest_callback())
Run setup on the backend instance that is scheduled
When a legacy backend only implements setup(), _backend_metadata() gets metadata by calling backend_class().setup(), but this line schedules a separate freshly constructed instance whose setup() was never invoked. Before this change, the same instance was constructed, set up, and later run, so any existing backend that initializes instance fields or connections in setup() and reads them in run() will now fail or run uninitialized despite the fallback claiming legacy support. Please reuse the setup-bearing instance for legacy backends or call setup() on the scheduled instance.
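One of the two remedies suggested above, calling `setup()` on the instance that is actually scheduled, might look like this sketch. `LegacyBackend` and `construct_scheduled` are hypothetical names, and the `hasattr` check for `describe` is an assumed way of recognising a legacy backend; it is not necessarily how the PR resolves the comment.

```python
class LegacyBackend:
    """Hypothetical legacy backend that initialises state in setup()."""

    def __init__(self, *, ingest_callback=None, **kwargs):
        self.ingest_callback = ingest_callback
        self.connection = None

    def setup(self):
        self.connection = "connected"
        return {"name": "legacy"}

    def run(self):
        if self.connection is None:
            raise RuntimeError("setup() was never called on this instance")
        return [self.connection]


def construct_scheduled(backend_class, callback):
    backend = backend_class(ingest_callback=callback)
    if not hasattr(backend_class, "describe"):
        # Legacy path: initialise the instance that will really run,
        # not just the throwaway one used to read metadata.
        backend.setup()
    return backend
```

Without the `setup()` call, `run()` on the freshly constructed instance would raise, which is the regression the review describes.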
maybe this is important /\
I would add an nbl_custom_legacy to the tests to be sure that it still works
Summary
Worker-side support for the API-triggered-sync contract (INT-312). Backends expose their identity via a `describe()` classmethod, and the `PolicyRunner` constructs each backend with an `ingest_callback` so an integration can ship entities outside the scheduled `run()` cycle (e.g. from an HTTP-triggered sync). A `worker.exceptions` hierarchy classifies ingest failures.
Changes
- `Backend.describe()`: new classmethod returning the backend's `Metadata` without constructing an instance. Backends that only implement the legacy instance `setup()` keep working: the worker reads their metadata from a throwaway instance.
- `Backend.setup()` is formally deprecated (fallback removal planned for v2.0), with a signal per audience: PEP 702 `@deprecated` for IDEs/type checkers, a class-definition `DeprecationWarning` when a subclass overrides `setup()` without `describe()`, and an operator-facing log line when the worker uses the fallback.
- `Backend.__init__` accepts `ingest_callback`.
- `**kwargs` doors on `Backend.__init__` and `Backend.run`: passive forward-compat. The worker passes nothing through them today; future minor releases may add per-tick context (e.g. `source`, `run_id`) without a coordinated upgrade.
- `worker.exceptions`: `IngestError` base, with `IngestRejected` (permanent, do not retry) and `IngestUnavailable` (transient, retry-friendly).
- A run that produces zero entities is recorded as COMPLETED (`entity_count=0`) instead of failing on Diode's empty-batch rejection; an incremental sync with no changes is a valid outcome.
- `PolicyRunner`:
  - reads the backend's metadata first (`describe()`, or the legacy `setup()` fallback), builds the Diode client and the ingest callback from it, then constructs the backend once with the callback. Every dependency the callback uses is ready before the instance exists, so the callback is valid from construction;
  - the ingest callback accepts `entities`/`error` and maps unexpected exceptions to the base `IngestError`.
- `worker/pyproject.toml` source `project.version` raised from `1.0.0` to `1.99.0.dev0`. Editable / `pip install -e` consumers (e.g. controller-integrations' e2e harness running with `--use orb-agent --local-orb-worker`) read this value directly and fail with `ResolutionImpossible` against downstream pins like `netboxlabs-orb-worker>=1.3.0`. The release workflow stamps the real version on top of `project.version` before `python -m build`, so the published wheel is unaffected.

Consumer
The downstream `@with_trigger_api` decorator (netboxlabs/controller-integrations, `trigger-api-util`) calls `self.ingest_callback(entities=...)` from an HTTP-triggered run; the integrations' `run()` methods already declare `**kwargs`.

Test summary
Unit suite extended around the new contract: describe()-first construction and the legacy
setup() fallback, callback validation and failure paths, SDK-delegated chunking, and the
empty-delta no-op on both run paths.