feat(worker): ingest_callback contract for API-triggered sync (INT-312) #432

Draft
ldrozdz93 wants to merge 19 commits into develop from int-312/worker-ingest-callback

Conversation

ldrozdz93 commented Jun 1, 2026

Contributor

Summary

Worker-side support for the API-triggered-sync contract (INT-312). Backends expose their
identity via a describe() classmethod, and the PolicyRunner constructs each backend
with an ingest_callback so an integration can ship entities outside the scheduled
run() cycle (e.g. from an HTTP-triggered sync). A worker.exceptions hierarchy
classifies ingest failures.
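
A minimal sketch of the construction order described above, assuming simplified stand-ins rather than the real worker classes. `Metadata`, `ExampleBackend`, `on_http_trigger` and `construct` are hypothetical names used only for illustration, and the two-argument `run(policy_name, policy)` signature is an assumption:

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Optional


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for the worker's Metadata type."""
    name: str
    app_name: str
    app_version: str


class Backend:
    """Simplified stand-in for worker.backend.Backend (not the real class)."""

    def __init__(self, *, ingest_callback: Optional[Callable[..., None]] = None,
                 **kwargs: Any) -> None:
        # **kwargs is the forward-compat door; nothing is passed through it here.
        self.ingest_callback = ingest_callback

    @classmethod
    def describe(cls) -> Metadata:
        raise NotImplementedError

    def run(self, policy_name: str, policy: Any, **kwargs: Any) -> Iterable[Any]:
        raise NotImplementedError


class ExampleBackend(Backend):
    @classmethod
    def describe(cls) -> Metadata:
        # Identity is available without constructing an instance.
        return Metadata(name="example", app_name="example-app", app_version="0.1.0")

    def run(self, policy_name: str, policy: Any, **kwargs: Any) -> Iterable[Any]:
        return [{"device": "sw1"}]

    def on_http_trigger(self, entities: Iterable[Any]) -> None:
        # Hypothetical hook: ship entities outside the scheduled run() cycle.
        if self.ingest_callback is not None:
            self.ingest_callback(entities=entities)


def construct(backend_class: type,
              build_callback: Callable[[Metadata], Callable[..., None]]) -> Backend:
    metadata = backend_class.describe()   # read metadata first, no instance needed
    callback = build_callback(metadata)   # callback dependencies exist before the backend
    return backend_class(ingest_callback=callback)  # construct once, callback in hand
```

Because the callback is passed at construction, the backend never observes a state in which `ingest_callback` is attached late.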

Changes

  • Backend.describe() — new classmethod returning the backend's Metadata without
    constructing an instance. Backends that only implement the legacy instance setup()
    keep working: the worker reads their metadata from a throwaway instance.
  • Backend.setup() is formally deprecated (fallback removal planned for v2.0), with a
    signal per audience: PEP 702 @deprecated for IDEs/type checkers, a class-definition
    DeprecationWarning when a subclass overrides setup() without describe(), and an
    operator-facing log line when the worker uses the fallback.
  • Backend.__init__ accepts ingest_callback.
  • **kwargs doors on Backend.__init__ and Backend.run — passive forward-compat:
    the worker passes nothing through them today; future minor releases may add per-tick
    context (e.g. source, run_id) without a coordinated upgrade.
  • worker.exceptions: IngestError base, with IngestRejected (permanent — do not
    retry) and IngestUnavailable (transient — retry-friendly).
  • Zero-entity runs complete as no-ops (entity_count=0) instead of failing on
    Diode's empty-batch rejection — an incremental sync with no changes is a valid outcome.
  • PolicyRunner:
    • reads the backend's metadata first (describe(), or the legacy setup() fallback),
      builds the Diode client and the ingest callback from it, then constructs the backend
      once with the callback — every dependency the callback uses is ready before the
      instance exists, so the callback is valid from construction;
    • the callback validates exactly-one-of entities/error and maps unexpected
      exceptions to the base IngestError.
  • worker/pyproject.toml source project.version raised from 1.0.0 to
    1.99.0.dev0. Editable / pip install -e consumers (e.g. controller-integrations'
    e2e harness running with --use orb-agent --local-orb-worker) read this value
    directly and fail with ResolutionImpossible against downstream pins like
    netboxlabs-orb-worker>=1.3.0. The release workflow stamps the real version on top
    of project.version before python -m build, so the published wheel is unaffected.
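
The callback behaviour listed above (exactly one of entities/error, unexpected exceptions mapped to the base IngestError) could look roughly like the sketch below. The exception classes are local stand-ins for worker.exceptions; `build_ingest_callback` and `send` are hypothetical names, and raising ValueError on a bad call and merely logging a reported `error` are assumptions, not confirmed worker behaviour:

```python
import logging
from typing import Any, Callable, Iterable, List, Optional

logger = logging.getLogger("worker.sketch")


class IngestError(Exception):
    """Stand-in for worker.exceptions.IngestError (base class)."""


class IngestRejected(IngestError):
    """Permanent failure: do not retry."""


class IngestUnavailable(IngestError):
    """Transient failure: retry-friendly."""


def build_ingest_callback(send: Callable[[List[Any]], None]) -> Callable[..., None]:
    """Return a callback that validates its arguments and classifies failures."""

    def ingest_callback(*, entities: Optional[Iterable[Any]] = None,
                        error: Optional[Exception] = None, **kwargs: Any) -> None:
        # Exactly one of entities / error must be supplied.
        if (entities is None) == (error is None):
            raise ValueError("pass exactly one of 'entities' or 'error'")  # assumption
        if error is not None:
            logger.error("integration reported a failed sync: %s", error)  # assumption
            return
        try:
            send(list(entities))
        except IngestError:
            raise  # already classified by the send path
        except Exception as exc:
            # A worker-side bug should not be masked as a transient failure.
            logger.exception("unexpected ingest failure")
            raise IngestError(str(exc)) from exc

    return ingest_callback
```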

Consumer

The downstream @with_trigger_api decorator (netboxlabs/controller-integrations,
trigger-api-util) calls self.ingest_callback(entities=...) from an HTTP-triggered run;
the integrations' run() methods already declare **kwargs.
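
As an illustration of how a consumer might use the exception hierarchy, here is a hedged sketch of a retry wrapper around the callback. It is not the @with_trigger_api implementation; `ship_with_retry`, `attempts` and `delay_s` are hypothetical, and the exception classes are local stand-ins for worker.exceptions:

```python
import time
from typing import Any, Callable, Iterable


class IngestError(Exception):
    """Stand-in for worker.exceptions.IngestError."""


class IngestRejected(IngestError):
    """Permanent failure: retrying will not help."""


class IngestUnavailable(IngestError):
    """Transient failure: a later attempt may succeed."""


def ship_with_retry(ingest_callback: Callable[..., None], entities: Iterable[Any],
                    *, attempts: int = 3, delay_s: float = 0.0) -> int:
    """Call the callback, retrying only on IngestUnavailable.

    Returns the number of attempts used. IngestRejected and the base
    IngestError are not caught, so they propagate to the caller at once.
    """
    batch = list(entities)  # materialise once so every retry sends the same data
    for attempt in range(1, attempts + 1):
        try:
            ingest_callback(entities=batch)
            return attempt
        except IngestUnavailable:
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise ValueError("attempts must be at least 1")
```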

Test summary

Unit suite extended around the new contract: describe()-first construction and the legacy
setup() fallback, callback validation and failure paths, SDK-delegated chunking, and the
empty-delta no-op on both run paths.

ldrozdz93 and others added 6 commits May 19, 2026 14:51
Introduces IngestError / IngestUnavailable / IngestRejected to be raised
by the worker-supplied ingest callback when pipeline writes fail. The
hierarchy is the contract used by integrations to differentiate transient
vs. permanent failures.

Refs INT-386, INT-312
See: https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/worker-backend-ingest-callback.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends Backend.__init__ with keyword-only ingest_callback, policy, plus
open **kwargs (forward-compat door). PolicyRunner.setup now builds a
per-policy ingest callback closure and passes both to backend
construction. The closure chunks+ingests entities (mirroring run()),
tracks each off-schedule emission as a pseudo-run in RunStore, and
translates response/transport errors into the new IngestError subclasses.

run() is unchanged and remains single-shot per cron tick.

Refs INT-386, INT-312
See:
https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/worker-backend-ingest-callback.md
https://github.com/netboxlabs/controller-integrations/blob/develop/docs/plans/policy-distribution-refinement.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Bind IngestError in `except ... as exc` instead of sys.exc_info() and
  drop the now-unused `import sys`.
- Log unexpected exceptions via `logger.exception` before translating
  them to IngestUnavailable, so a worker-side programming bug isn't
  silently masked as a transient failure.
- Annotate the closure signature (`error: Exception | None = None`,
  `-> None`) and rename `**kw` → `**kwargs` to match the docstring and
  Backend.__init__ contract. Comment the forward-compat door.
- Guard against an integration calling the callback before the diode
  client is initialised (e.g. from inside `Backend.setup()`): record a
  FAILED pseudo-run and raise IngestUnavailable with a clear message
  instead of letting an AttributeError surface as an unexpected
  exception.
- Extract `_send_entities` so `_build_ingest_callback` stays under the
  C901 complexity limit.
- Add a unit test for the new guard.

Refs INT-386, INT-312

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k helper

Addresses 6 reviewer comments on PR #415:

- Init-order race (Copilot/Codex/jajeffries): the closure dereferences
  `self.run_store`, `self.metadata`, `self._diode_client` — none of
  which are attached until after `backend.setup()` returns. Replaces
  the partial `_diode_client is None` guard with a single
  `self._callback_ready` flag, flipped True at the end of `setup()`
  and back to False at the start of `stop()`. Single source of truth
  for "callback is safe to invoke".
- Pre-try work (Copilot): `list(entities)` and
  `apply_run_id_to_entities` now run INSIDE the try, so failures
  there reliably mark the pseudo-run FAILED instead of leaving it
  RUNNING. `entities_list = []` is pre-bound so `len()` is defined in
  the except handlers when `list(entities)` itself fails.
- Chunk-threshold constant (Copilot): extracts
  `MAX_INGEST_MESSAGE_BYTES = 3 * 1024 * 1024` at module scope;
  replaces both float literals in `_send_entities` and `run()`.
- Helper consistency (leoparente): `run()` now calls `_send_entities`
  instead of inlining the chunking branch. Helper returns the chunk
  count for the existing log line. Side effect: `run()` now raises
  `IngestRejected` (not `RuntimeError`) on response errors; outer
  `except Exception` catches it unchanged.

Backend.__init__ docstring documents the "do not invoke from
__init__/setup()" contract.

Adds 4 new tests, renames one existing test to match the new guard:
- test_ingest_callback_raises_when_not_ready (renamed)
- test_ingest_callback_records_failure_on_apply_run_id_error
- test_ingest_callback_records_failure_on_iterable_error
- test_setup_sets_callback_ready_flag
- test_stop_clears_callback_ready_flag

ruff clean, pytest 114 passed.

Refs INT-386, INT-312

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…spect on construction

- ADR-0008: Drop policy= kwarg from Backend.__init__ (no consumer in v1).
- ADR-0009: Add **kwargs forward-compat door to Backend.run.
- ADR-0007: PolicyRunner uses inspect.signature to detect whether the
  backend class accepts ingest_callback. Legacy backends with __init__(self)
  continue to construct zero-arg with no coordinated upgrade.

49 tests pass including 3 new parametrized introspection cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
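
For reference, signature introspection of the kind this commit describes can be sketched with the standard library as follows. `accepts_ingest_callback` and the three example classes are hypothetical, not the worker's actual helper:

```python
import inspect


def accepts_ingest_callback(backend_class: type) -> bool:
    """Return True if backend_class(ingest_callback=...) would be accepted."""
    init = backend_class.__init__
    if init is object.__init__:
        # No __init__ defined anywhere in the hierarchy: the class takes no arguments,
        # even though object.__init__ itself reports *args/**kwargs.
        return False
    params = inspect.signature(init).parameters
    if "ingest_callback" in params:
        return True
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class LegacyBackend:
    def __init__(self):
        self.ready = True


class ModernBackend:
    def __init__(self, *, ingest_callback=None, **kwargs):
        self.ingest_callback = ingest_callback


class BareBackend:
    pass


def construct(backend_class: type, callback):
    # Legacy classes keep getting the zero-argument call.
    if accepts_ingest_callback(backend_class):
        return backend_class(ingest_callback=callback)
    return backend_class()
```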
…setup callback, shared _execute_run, IngestError mapping)

- Delete _construct_backend signature introspection; construct backends
  directly (Backend.__init__ already accepts ingest_callback + **kwargs).
- Build + attach ingest_callback AFTER deps are assigned; drop the
  _callback_ready flag, its early-call guard, and the stop() reset.
- Extract _execute_run(client, produce_entities, *, source) shared by the
  scheduled run() and the ingest callback. The entity producer is invoked
  lazily INSIDE the run's try (after create_run), so a crashing eager
  backend is still recorded as a FAILED run rather than vanishing.
- Map the callback's unexpected-exception catch-all to base IngestError
  (not the retry-friendly IngestUnavailable). IngestUnavailable is retained
  in worker.exceptions (published contract consumed downstream).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot commented Jun 1, 2026

Coverage

Coverage Report
File                     Stmts  Miss  Cover  Missing
worker
   entity_metadata.py       13     1    92%  23
   main.py                  63     4    94%  206–207, 217, 223
   metrics.py               52     1    98%  102
   package_finder.py        89    11    88%  24–25, 48, 96–97, 132, 145–146, 159, 183–184
   server.py                81     4    95%  34–36, 167, 170
   version.py                7     1    86%  14
worker/policy
   manager.py               49     3    94%  35–36, 52
   runner.py               135     1    99%  145
TOTAL                      620    26    96%

Tests  Skipped  Failures  Errors  Time
119    0 💤     0 ❌      0 🔥    2.026s ⏱️

Copilot AI left a comment

Pull request overview

This PR adds worker-side support for an API-triggered ingest contract by introducing an ingest_callback hook on backends, an exception hierarchy for ingest failures, and refactoring PolicyRunner to reuse a shared run+ingest path for both scheduled runs and callback-triggered ingests.

Changes:

  • Add worker.exceptions (IngestError, IngestRejected, IngestUnavailable) and unit tests for the hierarchy.
  • Extend Backend to accept/store an optional ingest_callback plus forward-compat **kwargs (and allow run(**kwargs)).
  • Refactor PolicyRunner to (a) attach an ingest callback after setup dependencies exist and (b) centralize run creation, ingest chunking, and run status updates in _execute_run() / _send_entities(); add tests for the new callback behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

File                                 Description
worker/worker/policy/runner.py       Adds ingest callback construction and shared run execution/ingest helpers; refactors scheduled run() to use them.
worker/worker/exceptions.py          Introduces ingestion-related exception hierarchy for consumers to classify failures.
worker/worker/backend.py             Adds ingest_callback + **kwargs forward-compat “doors” on backend construction and run().
worker/tests/test_exceptions.py      Adds tests validating exception hierarchy and messaging.
worker/tests/test_backend.py         Adds tests for new Backend.__init__ and Backend.run(**kwargs) behavior.
worker/tests/policy/test_runner.py   Adds regression + new tests for callback ingestion behavior and refactored run semantics.

Comment thread worker/worker/policy/runner.py
Comment thread worker/worker/policy/runner.py
Comment thread worker/worker/backend.py Outdated
Comment thread worker/worker/exceptions.py
Comment thread worker/tests/policy/test_runner.py Outdated
- _build_ingest_callback: drop the unused policy_name parameter (it was always
  self.name; internals already use self.name), removing the attribution
  inconsistency the reviewer flagged.
- backend.py: docstring now states the worker constructs with no args and
  assigns ingest_callback afterwards (matching PolicyRunner.setup).
- exceptions.py: broaden the module docstring — the hierarchy is raised by
  worker ingestion in general (scheduled run() path + ingest callback), not
  just the callback.
- test_runner.py: fix the multi-chunk test comment to match the scenario
  (both chunks sent; the second chunk's response carries errors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread worker/tests/policy/test_runner.py
Comment thread worker/tests/policy/test_runner.py
Comment thread worker/worker/policy/runner.py Outdated
Comment thread worker/worker/policy/runner.py Outdated
ldrozdz93 and others added 5 commits June 2, 2026 15:40
…kend once with ingest_callback

Backends can expose their identity via a `describe()` classmethod, so the
worker reads metadata (name/app_name/app_version) WITHOUT constructing an
instance, builds the ingest callback, and constructs the backend once with
the callback already in hand — removing the post-construction attach step.

Backends that only implement the instance `setup()` are still supported: the
worker falls back to a throwaway instance to read their metadata, then builds
the real callback-bearing instance.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…struction

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…GE_BYTES

_send_entities now always calls create_message_chunks, which owns the gRPC
message-size threshold (max_chunk_size_mb=3.0, a safe margin below the 4 MB
ceiling) and returns a single chunk when the payload already fits. This
removes the worker's duplicate of that constant and the
estimate_message_size pre-check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- @deprecated (PEP 702, typing_extensions) on Backend.setup() so IDEs and
  type checkers flag uses; calling the base method also warns at runtime.
- Backend.__init_subclass__ emits a DeprecationWarning at class-definition
  time when a subclass overrides setup() without implementing describe() —
  the signal integration authors see in their own test runs.
- PolicyRunner._backend_metadata logs an operator-facing warning when the
  legacy throwaway-setup() fallback is used, naming the backend class.
- typing-extensions declared as a runtime dependency.

The fallback (and these warnings) are scheduled for removal in worker v2.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
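
The class-definition warning described in this commit can be approximated with `__init_subclass__` from the standard library alone. This is a sketch under assumptions: the message text and the exact override check are illustrative, and the PEP 702 `@deprecated` decorator is left out to keep the example dependency-free:

```python
import warnings


class Backend:
    """Simplified stand-in for worker.backend.Backend."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        overrides_setup = "setup" in cls.__dict__
        # describe() counts as implemented if any class below Backend defines it.
        has_describe = cls.describe.__func__ is not Backend.describe.__func__
        if overrides_setup and not has_describe:
            warnings.warn(
                f"{cls.__name__} overrides setup() without describe(); "
                "the setup() fallback is scheduled for removal in worker v2.0",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the class definition site
            )

    @classmethod
    def describe(cls):
        raise NotImplementedError

    def setup(self):
        raise NotImplementedError
```

Integration authors would then see the warning in their own test runs as soon as the legacy subclass is defined, before any instance is created.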
…resolve

Source `project.version` was `1.0.0`. CI overwrites it at release time
(worker-release.yaml runs `toml set --toml-path pyproject.toml
project.version` and stamps `worker/version.py` separately), so the
published wheel is unaffected.

The placeholder only matters for **editable / local installs from
source** — e.g. controller-integrations' `e2e_tests --use orb-agent
--local-orb-worker`, which `pip install -e`s this checkout. pip reads
`project.version` from source (`1.0.0`) and resolves it against
backend-util's pin `netboxlabs-orb-worker>=1.3.0`; `1.0.0 < 1.3.0`
triggers `ResolutionImpossible` and the install fails.

Bumping the placeholder to `1.99.0.dev0` satisfies any reasonable
`>=X.Y` floor (PEP 440 ordering: `1.99.0.dev0 >= 1.3.0` is true) while
remaining an obvious not-a-real-release marker. The build still stamps
the true release version on top of it, so nothing downstream of a real
build changes.
Comment thread worker/tests/policy/test_runner.py
ldrozdz93 and others added 2 commits June 10, 2026 17:11
…ducer only

The debug line inherited develop's wording but, after the _execute_run
consolidation, its timer wrapped run creation, chunking, client ingest and
run-store writes too. _execute_run now measures exactly the producer call
(list(produce_entities())) and reports it via a _RunOutcome NamedTuple, so
run() logs a truthful number again. The backend_execution_latency metric is
unchanged — it covered the full execution on develop and still does.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
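
To illustrate the distinction this commit draws, the sketch below times only the producer call and returns the measurement in a NamedTuple. `RunOutcome`, `execute_run`, `send` and the injectable `clock` parameter are assumptions made for the example, not the worker's actual `_execute_run` signature:

```python
import time
from typing import Any, Callable, Iterable, List, NamedTuple


class RunOutcome(NamedTuple):
    entity_count: int
    producer_seconds: float


def execute_run(produce_entities: Callable[[], Iterable[Any]],
                send: Callable[[List[Any]], None],
                *, clock: Callable[[], float] = time.perf_counter) -> RunOutcome:
    """Run the producer, then ship the result; time only the producer."""
    start = clock()
    entities = list(produce_entities())   # the only work inside the timed window
    producer_seconds = clock() - start
    send(entities)                        # chunking/ingest happen outside the timer
    return RunOutcome(entity_count=len(entities), producer_seconds=producer_seconds)
```

A caller can then log `outcome.producer_seconds` as the backend's own execution time, while a separate metric continues to cover the whole run.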
The 2026-06-10 GitHub incident failed the CodeQL upload and coverage-comment
steps on the previous push; CodeQL default-setup runs cannot be re-run via
the API, so an empty commit re-fires them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread worker/tests/policy/test_runner.py Outdated
@ldrozdz93 ldrozdz93 marked this pull request as ready for review June 11, 2026 05:05
ldrozdz93 and others added 5 commits June 11, 2026 07:11
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d.run

Activates the forward-compat door Backend.run already declares: when a
backend's run() signature accepts **kwargs, the worker passes
source="scheduled" and the run-store run_id (the same id stamped on the
produced entities). Legacy two-argument signatures are detected via
inspect and keep getting the bare call, with an operator-facing warning —
the same compat posture as the describe()/setup() fallback. Downstream,
controller-integrations' backends and the @with_trigger_api wrapper
already declare and forward **kwargs, so the context rides in without a
coordinated upgrade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Diode's ingester rejects an empty batch ("entities is empty"), so a run
that legitimately produces zero entities — e.g. an incremental sync with
no changes since its watermark — was recorded as FAILED. _send_entities
now returns without calling client.ingest when the list is empty, so both
the scheduled path and the ingest-callback path record a clean COMPLETED
run with entity_count=0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
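
A rough sketch of the empty-batch short-circuit, using a fake client in place of the Diode client. `FakeClient`, `FakeResponse`, the fixed-size `chunk` helper and the `chunk_size` parameter are hypothetical; the real helper delegates chunking to the SDK:

```python
from typing import Any, List, Sequence


class IngestRejected(Exception):
    """Stand-in for worker.exceptions.IngestRejected."""


class FakeResponse:
    def __init__(self, errors: Sequence[str] = ()) -> None:
        self.errors = list(errors)


class FakeClient:
    """Hypothetical client that records each ingest call and rejects empty batches."""

    def __init__(self) -> None:
        self.calls: List[List[Any]] = []

    def ingest(self, entities: List[Any]) -> FakeResponse:
        if not entities:
            return FakeResponse(errors=["entities is empty"])
        self.calls.append(list(entities))
        return FakeResponse()


def chunk(entities: List[Any], size: int) -> List[List[Any]]:
    # Simplified count-based chunking; the SDK chunks by message size instead.
    return [entities[i:i + size] for i in range(0, len(entities), size)]


def send_entities(client: FakeClient, entities: List[Any], *, chunk_size: int = 2) -> int:
    """Send entities in chunks and return the number of chunks sent."""
    if not entities:
        return 0  # zero-entity run: nothing to ingest, and not a failure
    chunks = chunk(entities, chunk_size)
    for part in chunks:
        response = client.ingest(entities=part)
        if response.errors:
            raise IngestRejected("; ".join(response.errors))
    return len(chunks)
```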
…o Backend.run"

This reverts commit ac389b2deccd9837e9012575864ab48148889a39.

Passing source/run_id into Backend.run is unrelated to this PR's scope;
the **kwargs forward-compat door on run() stays, passive, for the future.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@leoparente

Contributor

@codex review

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db8c0b2273

# Every dependency the closure reads (run_store, metadata, _diode_client)
# is now assigned, so the callback is built first and the backend is
# constructed ONCE with it — no post-construction attach step.
backend = backend_class(ingest_callback=self._build_ingest_callback())

P1: Run setup on the backend instance that is scheduled

When a legacy backend only implements setup(), _backend_metadata() gets metadata by calling backend_class().setup(), but this line schedules a separate freshly constructed instance whose setup() was never invoked. Before this change, the same instance was constructed, set up, and later run, so any existing backend that initializes instance fields or connections in setup() and reads them in run() will now fail or run uninitialized despite the fallback claiming legacy support. Please reuse the setup-bearing instance for legacy backends or call setup() on the scheduled instance.
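
The concern can be reproduced with a small, self-contained sketch. `LegacyBackend`, `construct_as_reviewed` and `construct_calling_setup` are hypothetical names, and the second function shows only one of the two remedies suggested above (calling setup() on the scheduled instance):

```python
class LegacyBackend:
    """Hypothetical legacy backend that initialises state in setup()."""

    def __init__(self, *, ingest_callback=None, **kwargs):
        self.ingest_callback = ingest_callback
        self.session = None

    def setup(self):
        self.session = "connected"   # instance state that run() depends on
        return {"name": "legacy"}

    def run(self):
        if self.session is None:
            raise RuntimeError("setup() was never called on this instance")
        return ["entity"]


def construct_as_reviewed(backend_class, callback):
    # Metadata comes from a throwaway instance; the scheduled instance is fresh.
    backend_class().setup()
    return backend_class(ingest_callback=callback)


def construct_calling_setup(backend_class, callback):
    # One possible fix: also run setup() on the instance that will be scheduled.
    backend_class().setup()
    backend = backend_class(ingest_callback=callback)
    backend.setup()
    return backend
```

With the first function, `run()` raises because `session` was only set on the discarded instance; with the second, the scheduled instance is initialised before it runs.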

Contributor


Maybe this is important ^ (the Codex comment above).

@leoparente

Contributor

I would add an nbl_custom_legacy to the tests to make sure that it still works.

@ldrozdz93 ldrozdz93 requested a review from leoparente June 12, 2026 03:54
@ldrozdz93 ldrozdz93 marked this pull request as draft June 12, 2026 10:29

3 participants