
ENG-2894: kafka observability#4047

Draft
jmsuzuki wants to merge 1 commit into main from ENG-2794-redpanda-connection-churn-observability

Conversation


@jmsuzuki jmsuzuki commented Apr 24, 2026

Adds observability metrics to the local /metrics endpoint:

  • moose_kafka_client_gauge{purpose}
  • moose_function_worker_restarts_total{reason}
  • moose_function_process_diff_updated_total{reason}


Note

Medium Risk
Touches Kafka client construction across multiple producer/consumer/admin code paths by wrapping handles for lifecycle tracking, which could affect resource lifetimes and shutdown behavior if misused. Otherwise changes are additive observability and event forwarding with a kill-switch for the hot-path gauge.

Overview
Adds Phase-0 connection-churn observability to the management server’s GET /metrics, introducing moose_kafka_client_gauge{purpose}, moose_function_worker_restarts_total{reason}, and moose_function_process_diff_updated_total{reason} (with new operator docs).

Kafka client creation is now purpose-tagged via a new KafkaClientHandle<T> wrapper that increments/decrements the gauge on handle clone/drop, and key call sites are updated to pass explicit PURPOSE_* constants (producers, consumers/subscribers, admin clients, health probe, topic size checks). A MOOSE_KAFKA_CLIENT_METRICS_DISABLED env kill-switch disables only the Kafka gauge tracking.
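A minimal sketch of the wrapper pattern, using a plain atomic in place of the real Prometheus gauge family (names and semantics are assumptions based on the description above; per the review discussion, clones share one tracker, so the gauge moves once per underlying client rather than once per clone):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the per-purpose gauge family; the real code
// labels this by `purpose`.
static CLIENT_GAUGE: AtomicI64 = AtomicI64::new(0);

// Shared by every clone of a handle; increments on creation and decrements
// exactly once, when the last clone is dropped.
struct ClientTracker {
    purpose: &'static str,
}

impl ClientTracker {
    fn new(purpose: &'static str) -> Self {
        CLIENT_GAUGE.fetch_add(1, Ordering::SeqCst);
        ClientTracker { purpose }
    }
}

impl Drop for ClientTracker {
    fn drop(&mut self) {
        // In the real code this is where the labeled gauge is decremented.
        let _ = self.purpose;
        CLIENT_GAUGE.fetch_sub(1, Ordering::SeqCst);
    }
}

// Wrapper tying the client's lifetime to the gauge. Deriving Clone only
// applies when T: Clone (e.g. FutureProducer is Clone; StreamConsumer is not).
#[derive(Clone)]
struct KafkaClientHandle<T> {
    inner: T,
    tracker: Arc<ClientTracker>,
}

fn demo() -> Vec<i64> {
    let mut readings = Vec::new();
    let h = KafkaClientHandle {
        inner: (),
        tracker: Arc::new(ClientTracker::new("ingest_producer")),
    };
    readings.push(CLIENT_GAUGE.load(Ordering::SeqCst)); // created -> 1
    let h2 = h.clone();
    readings.push(CLIENT_GAUGE.load(Ordering::SeqCst)); // clone shares tracker -> still 1
    drop(h);
    readings.push(CLIENT_GAUGE.load(Ordering::SeqCst)); // one clone still alive -> still 1
    drop(h2);
    readings.push(CLIENT_GAUGE.load(Ordering::SeqCst)); // last handle dropped -> 0
    readings
}
```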

Adds restart/diff counters by (1) recording worker-exit classifications from the Rust process supervisor, and (2) logging whether function-process diffs were no_change vs forced_always; TypeScript and Python workers now POST restart reasons to /metrics-logs, and the route allow-list forwards these new events. Includes unit tests plus an end-to-end /metrics scrape test for the new series.

Reviewed by Cursor Bugbot for commit f4786f5. Bugbot is set up for automated code reviews on this repo.


vercel Bot commented Apr 24, 2026

The latest updates on your projects.

Project: docs-v2 · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): Apr 24, 2026 0:43am


@linear

linear Bot commented Apr 24, 2026


This stack of pull requests is managed by Graphite. Learn more about stacking.


coderabbitai Bot commented Apr 24, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added OpenMetrics endpoints exposing Kafka client lifecycle metrics (gauge), function worker restart counters, and process update counters.
    • Worker processes now report exit reasons for enhanced observability and troubleshooting.
    • Kafka client metrics can be disabled via MOOSE_KAFKA_CLIENT_METRICS_DISABLED environment variable.
  • Documentation

    • Added operator metrics documentation specifying exposed endpoints, metric schemas, and label definitions.

Walkthrough

Adds comprehensive Kafka client lifecycle metrics instrumentation and worker exit-reason observability. Introduces purpose-tagged Kafka client tracking, global metrics handle propagation, worker exit-reason classification and HTTP reporting via /metrics-logs endpoint, and filtering of metric events from external sinks.

Changes

Cohort / File(s) Summary
Metrics Infrastructure
src/metrics.rs, src/metrics_inserter.rs
New MetricEvent variants for Kafka client lifecycle and function worker restarts/diffs. Prometheus gauge and counter families with label-keyed updates. Global OnceLock-based handle mechanism and env-gated (MOOSE_KAFKA_CLIENT_METRICS_DISABLED) helper functions. Metric events explicitly filtered out from external metric sink payload construction.
Kafka Client Instrumentation
src/infrastructure/stream/kafka/client.rs, src/infrastructure/stream/kafka/models.rs
New KafkaClientHandle<T> wrapper recording create/drop lifecycle via record_kafka_client_created/dropped with purpose labels. Factory functions (create_producer, create_consumer, create_subscriber, create_idempotent_producer) now accept purpose parameter and return wrapped clients. Topic admin operations similarly wrap AdminClient instances with purpose-specific handles.
CLI & Webserver Integration
src/cli.rs, src/cli/local_webserver.rs
Global metrics handle propagation in Dev/Prod paths. New POST /metrics-logs endpoint accepting StreamingFunctionEvent and FunctionWorkerRestart payloads. Kafka producer tagged with PURPOSE_INGEST_PRODUCER. E2e test validating metrics endpoint response and counter/gauge values.
Function Registry & Infrastructure
src/infrastructure/processes/functions_registry.rs, src/framework/core/infrastructure_map.rs
Registry stores estimated_clients per process; records Kafka client creation/drops scaled by worker count. Infrastructure map annotates process diffs with reason (no_change vs forced_always) and invokes record_function_process_diff_updated(reason).
Kafka Operations
src/cli/routines/peek.rs, src/infrastructure/processes/kafka_clickhouse_sync.rs, src/mcp/tools/sample_stream.rs
Consumer/producer creation calls updated to include purpose parameters (PURPOSE_PEEK_CONSUMER, PURPOSE_SYNC_CONSUMER, PURPOSE_SYNC_PRODUCER, PURPOSE_MCP_SAMPLE_CONSUMER). Consumer type signature changed to KafkaClientHandle<StreamConsumer>.
Worker Exit Reporting
src/utilities/system.rs, packages/py-moose-lib/.../streaming_function_runner.py, packages/ts-moose-lib/src/cluster-utils.ts
Rust: child-process exit classification (ok/err_code/signal/wait_err) passed to record_function_worker_restart. Python: atexit hook posts exit reason to /metrics-logs. TypeScript: worker exit event handler posts reason to management service at 127.0.0.1:MOOSE_MANAGEMENT_PORT/metrics-logs with silent error handling.
Documentation
docs/operator/metrics.md
New metrics endpoint specification documenting three metric series (moose_kafka_client_gauge{purpose}, moose_function_worker_restarts_total{reason}, moose_function_process_diff_updated_total{reason}), label schemas, steady-state behavior, and Phase 0/1 interpretation. Documents MOOSE_KAFKA_CLIENT_METRICS_DISABLED and metrics-logs payload allow-list.

Sequence Diagram(s)

sequenceDiagram
    participant W as Worker Process<br/>(Py/TS)
    participant M as Management Server<br/>(/metrics-logs)
    participant R as Metrics Registry<br/>(OpenMetrics)

    rect rgba(100, 150, 200, 0.5)
    note over W,R: Worker Exit Reporting Flow
    W->>W: Process exits/receives signal
    activate W
    W->>W: Classify exit reason<br/>(signal, code, error, ok)
    deactivate W
    
    W->>M: POST /metrics-logs<br/>{FunctionWorkerRestart{reason}}
    activate M
    M->>M: Parse event
    M->>R: Emit MetricEvent
    deactivate M
    
    activate R
    R->>R: Update counter:<br/>moose_function_worker_restarts_total{reason}
    deactivate R
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • callicles
  • LucioFranco

Poem

🔍 Kafka clients wake and sleep,
Metrics count each birth and keep,
Workers whisper exit tales,
Through the /metrics-logs trails—
Observability prevails! 📊

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title 'ENG-2894: kafka observability' accurately and concisely summarizes the main change—adding Kafka observability metrics to the /metrics endpoint.
Description check ✅ Passed Description clearly relates to the changeset, detailing the three new metrics, KafkaClientHandle wrapper, worker restart tracking, and environment controls.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.





@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.



exit_code = 1 if fatal_error.is_set() else 0
_set_worker_exit_reason(
    "py_worker_exit_code_nonzero" if exit_code != 0 else "py_worker_exit_code_0"
)

Python finally block overwrites signal exit reason

High Severity

The finally block unconditionally calls _set_worker_exit_reason based on exit code, overwriting any signal-based reason set earlier by the shutdown handler. When a signal like SIGTERM is received, shutdown correctly sets the reason to py_worker_killed_by_SIGTERM, but then the finally block runs and replaces it with py_worker_exit_code_0 (since signal-based shutdowns don't set fatal_error). This makes the signal classification in the shutdown handler effectively dead code — all signal exits are misreported as clean exits.


Triggered by team rule: Pragmatic Developper



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/framework-cli/src/infrastructure/stream/kafka/client.rs (1)

546-589: 🛠️ Refactor suggestion | 🟠 Major

Public API signatures changed — docstrings didn't follow.

  • create_idempotent_producer now returns KafkaClientHandle<FutureProducer> (breaking change vs. prior FutureProducer); doc still says "A FutureProducer".
  • create_producer, create_consumer, create_subscriber gained a new purpose: &'static str parameter; none of the # Arguments blocks mention it, and none document the allowed label values.

As per coding guidelines ("Document all public APIs and breaking changes" / "Documentation is required for all public APIs in Rust"), please update the rustdoc blocks. Recommend pointing readers at the PURPOSE_* constants rather than letting new callers invent ad-hoc labels, which would fragment the moose_kafka_client_gauge{purpose=...} cardinality.

Also applies to: 704-716, 734-765

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/infrastructure/stream/kafka/client.rs` around lines
546 - 589, Update the rustdoc for the public producer/consumer functions to
reflect the new signatures and expected labels: change the return description of
create_idempotent_producer to state it returns KafkaClientHandle<FutureProducer>
(not FutureProducer), and add a new `purpose: &'static str` parameter
description to create_producer (and likewise to create_consumer and
create_subscriber) explaining that callers must supply one of the predefined
PURPOSE_* constants (reference PURPOSE_IDEMPOTENT_PRODUCER, PURPOSE_* constants)
and that these values are used as the `purpose` label for
moose_kafka_client_gauge; also list/point to the PURPOSE_* constants rather than
encouraging ad-hoc strings and update the Arguments and Returns sections
accordingly for each affected function (create_idempotent_producer,
create_producer, create_consumer, create_subscriber).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/framework-cli/docs/operator/metrics.md`:
- Around line 72-79: The doc is incorrect: kafka_client_tracking_enabled() in
metrics.rs currently treats any set value (including empty) as disabling
metrics; change the implementation in metrics.rs (function
kafka_client_tracking_enabled and reference KAFKA_METRICS_DISABLED_ENV) to
reject empty strings by using a check like
std::env::var(KAFKA_METRICS_DISABLED_ENV).is_ok_and(|v| !v.is_empty()), so the
kill-switch only trips for non-empty values (or alternatively update the doc to
remove "non-empty" if you prefer doc-only alignment).

In `@apps/framework-cli/src/cli/local_webserver.rs`:
- Around line 4274-4368: The test introduces three new public Prometheus series
(moose_kafka_client_gauge with label purpose,
moose_function_worker_restarts_total with label reason, and
moose_function_process_diff_updated_total with label reason) surfaced by
Metrics/Metrics::new and exposed by metrics_route; update the user docs in
apps/framework-docs-v2/content to document each series name, type (gauge or
counter), label meanings (purpose, reason), example metric lines (e.g.
moose_kafka_client_gauge{purpose="smoke_producer"} 1), and any related
configuration or opt-out notes so the new observability surface is discoverable;
also ensure the docs index/TOC includes this metrics page.

In `@apps/framework-cli/src/framework/core/infrastructure_map.rs`:
- Around line 1507-1518: The metric label for the restart reason is misleading:
change the branch that sets reason when process != target_process from
"forced_always" to "changed" (the variables involved are process,
target_process, and reason) so
crate::metrics::record_function_process_diff_updated(reason) emits "changed" for
actual config differences; keep the tracing::debug call but update any tests
that assert the old label to expect "changed" instead of "forced_always" in the
metrics assertions.

In `@apps/framework-cli/src/infrastructure/stream/kafka/client.rs`:
- Around line 35-55: Replace the free-form &'static str purpose labels with a
tiny newtype to prevent accidental new labels: declare pub struct
KafkaClientPurpose(&'static str) with a private constructor (or make the struct
non-exhaustive) and expose only the existing PURPOSE_* values as constants of
type KafkaClientPurpose (e.g., PURPOSE_INGEST_PRODUCER: KafkaClientPurpose).
Change any APIs that currently accept purpose: &'static str to accept
KafkaClientPurpose (look for usages of purpose and functions/methods that take a
&'static str purpose) so callers must use the predefined constants; keep the
inner string accessible only for metrics/label emission when needed. Ensure the
newtype derives Copy/Clone/Debug/Eq/PartialEq/Hash as appropriate.
- Around line 68-72: KafkaClientTracker::drop calls record_kafka_client_dropped
which ultimately routes through Metrics::try_send_metric_event and a global
mpsc; during runtime shutdown the receiver may be closed and drops on worker
threads can silently fail to decrement the gauge. Change the lifecycle path so
drops update a dedicated Gauge handle (or otherwise use an atomic/lock-protected
gauge API) instead of sending over the bounded mpsc: locate
KafkaClientTracker::drop and record_kafka_client_dropped and replace the
try_send-based decrement with a direct call to the Gauge decrement method (or
atomic decrement) provided by the metrics subsystem used by
start_listening_to_metrics, falling back to a no-op if the gauge handle is
unavailable to avoid panics during shutdown.
- Around line 74-78: Replace the derived Clone on KafkaClientHandle with an
explicit implementation: implement Clone only for T: Clone (impl<T: Clone> Clone
for KafkaClientHandle<T>) and add a doc comment on KafkaClientHandle explaining
that cloned handles share the same Arc<KafkaClientTracker> so only one "dropped"
event fires when the last handle is dropped; mention that StreamConsumer,
BaseConsumer, and AdminClient in rdkafka 0.36 do not implement Clone while
FutureProducer does, so the impl bounds keep behavior correct.
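The newtype proposed for the client.rs comments above could look roughly like this (constant names and label strings are illustrative; the PR's actual PURPOSE_* values may differ):

```rust
/// Closed set of gauge labels: callers can only use the predefined constants,
/// keeping moose_kafka_client_gauge{purpose=...} cardinality bounded. The
/// tuple field is private, so new labels can't be minted outside this module.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct KafkaClientPurpose(&'static str);

impl KafkaClientPurpose {
    pub const INGEST_PRODUCER: Self = KafkaClientPurpose("ingest_producer");
    pub const PEEK_CONSUMER: Self = KafkaClientPurpose("peek_consumer");
    pub const SYNC_PRODUCER: Self = KafkaClientPurpose("sync_producer");

    /// Inner string is exposed only where the metric label is emitted.
    pub fn as_label(self) -> &'static str {
        self.0
    }
}
```

Factory functions such as create_producer would then accept `purpose: KafkaClientPurpose` instead of `&'static str`.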

In `@apps/framework-cli/src/metrics.rs`:
- Around line 220-244: Add rustdoc comments for all public helpers to satisfy
the "Document all public APIs" rule: document try_send_metric_event to state
it's fire-and-forget / best-effort (may fail silently), document
kafka_client_created and kafka_client_dropped, function_worker_restart,
function_process_diff_updated and each record_* function with their purpose and
expected input, and document set_global_metrics_handle to state its
once-only/one-time initialization contract (what happens if called multiple
times) and kafka_client_tracking_enabled behavior; include short examples or
usage notes where helpful and call out any failure semantics or concurrency
guarantees for these symbols so callers know expected behavior.
- Around line 220-244: The current try_send_metric_event uses a bounded 32-slot
channel which can drop KafkaClientCreated/Dropped events and desync
moose_kafka_client_gauge; change this by removing the bounded mpsc for lifecycle
events and either (A) switch the MetricEvent channel to an unbounded channel
(use tokio::sync::mpsc::unbounded_channel and update the sender type so
try_send_metric_event uses unbounded_sender.send(...)) or (B, preferred) bypass
the channel for Kafka lifecycle metrics: give KafkaClientTracker direct access
to the Arc<Family<KafkaClientLabels, Gauge>> (moose_kafka_client_gauge) and
increment/decrement it in kafka_client_created and kafka_client_dropped (and
drop try_send_metric_event for these paths), ensuring no try_send is used from
Drop and avoiding lost events during bursts or shutdown.
- Around line 582-584: The function kafka_client_tracking_enabled currently
treats an empty KAFKA_METRICS_DISABLED_ENV as "disabled" because it only checks
std::env::var(...).is_err(); change it to consider the variable disabled only
when it is set to a non-empty value: read the env var with
std::env::var(KAFKA_METRICS_DISABLED_ENV) and return true (enabled) when the var
is unset or is an empty string, and return false (disabled) when the var is
present and non-empty (trim whitespace first to be safe).
- Around line 684-695: The test tracking_disabled_env_suppresses_events mutates
process-global env vars using std::env::set_var/remove_var which are unsafe on
Rust 1.94.0 and causes race conditions; wrap those calls in unsafe blocks and
mark the test with #[serial_test::serial] to prevent parallel execution, or
alternatively refactor kafka_client_tracking_enabled to accept an injectable env
value and call that from the test; ensure you update
tracking_disabled_env_suppresses_events to call unsafe {
std::env::set_var(KAFKA_METRICS_DISABLED_ENV, "1") } and unsafe {
std::env::set_var(...) / std::env::remove_var(...) } and add the serial_test
attribute so record_kafka_client_created and record_kafka_client_dropped run in
isolation.

In `@apps/framework-cli/src/utilities/system.rs`:
- Around line 343-362: The test classify_rust_child_exit_reasons uses
std::os::unix::process::ExitStatusExt and must be compiled only on Unix; add a
Unix-only configuration attribute (e.g., #[cfg(unix)]) to the test (or the
containing module) so Windows builds don't try to compile it. Locate the test
function classify_rust_child_exit_reasons and apply #[cfg(unix)] (or gate the
whole test module) to prevent Windows compilation errors while keeping the test
runnable on Unix.

In `@packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py`:
- Around line 695-703: The shutdown signal handler (shutdown) sets a
signal-based reason via _set_worker_exit_reason but the finally block later
unconditionally overwrites it with a default exit-code reason; change the
finally logic so it only sets the default py_worker_exit_code_... when no signal
reason is already recorded. Concretely, in the finally block (where you
currently call _set_worker_exit_reason for exit codes) query the existing reason
(add or use a getter like _get_worker_exit_reason or make
_set_worker_exit_reason accept an overwrite flag) and skip setting the exit-code
reason if the reason already starts with "py_worker_killed_by_" or is non-empty,
leaving the signal-set reason sticky after running.clear()/shutdown().
- Around line 59-77: Replace the bare try/except/pass in _report_worker_exit
with contextlib.suppress(Exception) to explicitly and narrowly suppress
exceptions; import contextlib if missing, wrap only the requests.post call (and
its timeout) inside the suppress block so atexit.register(_report_worker_exit)
cannot raise, and keep updating _worker_exit_reason via _set_worker_exit_reason
unchanged.

In `@packages/ts-moose-lib/src/cluster-utils.ts`:
- Around line 202-208: The code currently calls reportWorkerExit(...) before
checking this.shutdownInProgress, which causes clean shutdowns to be counted as
restarts (incrementing moose_function_worker_restarts_total); change the flow so
that you first check this.shutdownInProgress and only call reportWorkerExit and
schedule cluster.fork (setTimeout(() => cluster.fork(), RESTART_TIME_MS)) when
shutdownInProgress is false, or alter reportWorkerExit to accept a flag and skip
incrementing the restart metric when shutdownInProgress is true; update the call
site(s) (the block around reportWorkerExit, shutdownInProgress, cluster.fork,
and RESTART_TIME_MS) accordingly.

---

Outside diff comments:
In `@apps/framework-cli/src/infrastructure/stream/kafka/client.rs`:
- Around line 546-589: Update the rustdoc for the public producer/consumer
functions to reflect the new signatures and expected labels: change the return
description of create_idempotent_producer to state it returns
KafkaClientHandle<FutureProducer> (not FutureProducer), and add a new `purpose:
&'static str` parameter description to create_producer (and likewise to
create_consumer and create_subscriber) explaining that callers must supply one
of the predefined PURPOSE_* constants (reference PURPOSE_IDEMPOTENT_PRODUCER,
PURPOSE_* constants) and that these values are used as the `purpose` label for
moose_kafka_client_gauge; also list/point to the PURPOSE_* constants rather than
encouraging ad-hoc strings and update the Arguments and Returns sections
accordingly for each affected function (create_idempotent_producer,
create_producer, create_consumer, create_subscriber).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f25d073e-74fd-45e5-84e7-3bcc2d5e9809

📥 Commits

Reviewing files that changed from the base of the PR and between 2e13cfc and f4786f5.

📒 Files selected for processing (15)
  • apps/framework-cli/docs/operator/metrics.md
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/utilities/system.rs
  • packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py
  • packages/ts-moose-lib/src/cluster-utils.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Build CLI Binaries (linux-arm64-glibc, ubuntu-22-8-core, aarch64-unknown-linux-gnu)
  • GitHub Check: Build CLI Binaries (linux-x64-glibc, ubuntu-22-8-core, x86_64-unknown-linux-gnu)
  • GitHub Check: Build CLI Binaries (darwin-arm64, macos-14-large, aarch64-apple-darwin)
  • GitHub Check: Package and Publish Independant TS Package
  • GitHub Check: Test CLI (blacksmith-4vcpu-ubuntu-2404)
  • GitHub Check: Test CLI (macos-latest-large)
  • GitHub Check: Lints
  • GitHub Check: Check
  • GitHub Check: Cursor Bugbot
  • GitHub Check: Analyze (rust)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run cargo clippy to ensure Rust code passes Clippy's linting standards before each commit

**/*.rs: Use thiserror with #[derive(thiserror::Error)] for error handling in Rust; define errors near fallibility unit (NO global Error type); NEVER use anyhow::Result
Use snake_case for functions/variables, PascalCase for types/traits, SCREAMING_SNAKE_CASE for constants in Rust
Use tuple structs with validation constructors for newtypes in Rust (e.g., struct UserId(String))
Write tests inline with #[cfg(test)] modules in Rust
Documentation is required for all public APIs in Rust
Run cargo clippy --all-targets -- -D warnings pre-commit; no warnings allowed in Rust

Files:

  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/utilities/system.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
apps/framework-cli/**/*.rs

📄 CodeRabbit inference engine (apps/framework-cli/AGENTS.md)

apps/framework-cli/**/*.rs: Always run cargo clippy --all-targets -- -D warnings before commits; fix all warnings - no Clippy warnings may remain (treat warnings as errors)
Use rustfmt --edition 2021 for consistent formatting
Prefer top-of-file use statements over inline ones in function bodies; add use imports for fully-qualified paths; local use for disambiguation (e.g., name collisions) is fine
Write meaningful names for functions, variables, and types
Keep functions focused and modular
Document all public APIs and breaking changes
Use thiserror crate for error handling instead of anyhow::Result
Define errors near their unit of fallibility (no global Error types)
Use #[derive(thiserror::Error)] with #[error()] messages for error structs
Structure error types as: context struct + error enum + #[source] chaining
Define newtypes as tuple structs: struct UserId(u64);
Add validation constructors for newtypes: UserId::new(id: u64) -> Result<Self, Error>
Derive standard traits for newtypes: #[derive(Debug, Clone, PartialEq)]
Implement From/TryFrom for newtype conversions
Use derive_more or nutype crates to reduce boilerplate in newtype definitions
Use const for static values (prefer over static)
Use UPPER_SNAKE_CASE naming for constants
Scope constant visibility: pub(crate) > pub(super) > pub
Group related constants together
Write unit tests for all public functions
Test error conditions and edge cases in unit and integration tests

Files:

  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/utilities/system.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
**/framework-cli/src/**

⚙️ CodeRabbit configuration file

**/framework-cli/src/**: When reviewing changes to Moose CLI:

  1. Check if any user-facing changes were made (commands, flags, configs, apis, etc)
  2. If yes, verify the documentation for THAT SPECIFIC feature is updated in apps/framework-docs-v2/content
  3. If docs for that feature doesn't exist yet, it should be added. If the change removes public apis, the documentation for those should also be removed. Changing unrelated docs doesn't satisfy this requirement.

Files:

  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/utilities/system.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run linting checks before submitting PRs for TypeScript/JavaScript code

**/*.{ts,tsx,js,jsx}: Group imports by external deps, internal modules, types; use named exports from barrel files (index.ts)
Use camelCase for variable and function names, PascalCase for types/classes/components, UPPER_SNAKE_CASE for constants in TypeScript/JavaScript
Prefix unused variables with _ (e.g., _unusedParam) to bypass linting errors in TypeScript/JavaScript
Format with Prettier using experimentalTernaries: true; auto-formats on commit via Husky + lint-staged in TypeScript/JavaScript
ESLint extends Next.js, Turbo, TypeScript recommended; @typescript-eslint/no-explicit-any disabled

Files:

  • packages/ts-moose-lib/src/cluster-utils.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Prefer interfaces for objects, types for unions/intersections; explicit return types on public APIs in TypeScript

Files:

  • packages/ts-moose-lib/src/cluster-utils.ts
**/ts-moose-lib/src/**/*.ts

⚙️ CodeRabbit configuration file

**/ts-moose-lib/src/**/*.ts: When reviewing changes to typescript moose lib:

  1. Check if any public apis were changed (class, method, type, config, etc).
  2. If yes, verify the documentation for THAT SPECIFIC feature is updated in apps/framework-docs-v2/content.
  3. If docs for that feature doesn't exist yet, it should be added. If the change removes public apis, the documentation for those should also be removed. Changing unrelated docs doesn't satisfy this requirement.

Files:

  • packages/ts-moose-lib/src/cluster-utils.ts
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide in Python
Use snake_case for functions/variables, PascalCase for classes, UPPER_SNAKE_CASE for constants in Python
Format Python code with Black (line-length 88); auto-formats on commit via Husky + lint-staged
Use type hints for function signatures and public APIs in Python
Use pytest with fixtures and parametrize decorators for Python tests

Files:

  • packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py
**/py-moose-lib/moose_lib/**/*.py

⚙️ CodeRabbit configuration file

**/py-moose-lib/moose_lib/**/*.py: When reviewing changes to python moose lib:

  1. Check if any public apis were changed (class, method, type, config, etc)
  2. If yes, verify the documentation for THAT SPECIFIC feature is updated in apps/framework-docs-v2/content
  3. If docs for that feature doesn't exist yet, it should be added. If the change removes public apis, the documentation for those should also be removed. Changing unrelated docs doesn't satisfy this requirement.

Files:

  • packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py
🧠 Learnings (39)
📚 Learning: 2026-02-06T01:43:06.078Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3462
File: apps/framework-cli/src/cli/routines/feedback.rs:40-72
Timestamp: 2026-02-06T01:43:06.078Z
Learning: In the Moose framework-cli (Rust), the telemetry functions capture_usage and wait_for_usage_capture use a fire-and-forget pattern: errors are logged internally and not surfaced to users. This behavior is intentional across all commands. Do not change error propagation for individual commands; any modification requires a broad refactor across the CLI telemetry layer. During reviews, treat changes to telemetry error reporting as high risk and scope changes to capture_usage/wait_for_usage_capture to the entire framework-cli, not single commands.

Applied to files:

  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/utilities/system.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
📚 Learning: 2026-02-07T04:42:43.608Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:878-887
Timestamp: 2026-02-07T04:42:43.608Z
Learning: For the Moose CLI under apps/framework-cli, document and implement that Windows support is best-effort only, while macOS (open) and Linux (xdg-open) are the primary targets. Treat Windows-specific code paths (e.g., using cmd /c start) as fallback implementations and ensure they are not required to be as robust as the primary platforms. When adding Windows fallbacks, include clear caveats, avoid blocking on Windows-specific behavior, and add appropriate runtime checks, logs, and TODOs. This guidance applies to Rust source files under apps/framework-cli/src (and adjacent Rust files in the same module) to keep Windows handling clearly scoped and maintainable.

Applied to files:

  • apps/framework-cli/src/metrics_inserter.rs
  • apps/framework-cli/src/cli.rs
  • apps/framework-cli/src/utilities/system.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/mcp/tools/sample_stream.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/metrics.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
📚 Learning: 2026-04-18T22:28:25.491Z
Learnt from: callicles
Repo: 514-labs/moosestack PR: 4024
File: packages/ts-moose-lib/src/consumption-apis/runner.ts:620-635
Timestamp: 2026-04-18T22:28:25.491Z
Learning: In `packages/ts-moose-lib/src/consumption-apis/runner.ts`, the `server.on("error", ...)` handler (which calls `process.exit(1)` on any server error, not just startup `EADDRINUSE`) is intentional. The PR/ENG-2768 design requires the worker to exit non-zero on any socket error so the Rust `RestartingProcess` circuit breaker in `apps/framework-cli/src/utilities/system.rs` can count rapid failures. Narrowing to the listen phase only would silently swallow post-listen socket errors and break the circuit-breaker story. Do not suggest scoping this to a `once` / startup-only handler.

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
  • packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py
📚 Learning: 2026-04-01T07:34:38.401Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: templates/typescript-agent/AGENTS.md:0-0
Timestamp: 2026-04-01T07:34:38.401Z
Learning: Applies to templates/typescript-agent/packages/moosestack-service/app/mcp/tools/**/*.ts : Return user-friendly error messages in MCP tool responses and do not expose internal error details or stack traces

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
📚 Learning: 2026-04-01T07:34:38.401Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: templates/typescript-agent/AGENTS.md:0-0
Timestamp: 2026-04-01T07:34:38.401Z
Learning: Applies to templates/typescript-agent/packages/moosestack-service/app/index.ts : Export new data models, APIs, and MCP tools from `packages/moosestack-service/app/index.ts` so MooseStack discovers them automatically

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
📚 Learning: 2026-04-01T07:34:51.710Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: templates/typescript-agent/packages/moosestack-service/AGENTS.md:0-0
Timestamp: 2026-04-01T07:34:51.710Z
Learning: Applies to templates/typescript-agent/packages/moosestack-service/app/index.ts : Export Moose-discovered primitives in `app/index.ts` as the public service entrypoint

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
📚 Learning: 2026-04-01T07:34:38.401Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: templates/typescript-agent/AGENTS.md:0-0
Timestamp: 2026-04-01T07:34:38.401Z
Learning: Applies to templates/typescript-agent/packages/moosestack-service/app/mcp/tools/**/*.ts : Return errors in MCP tools via `{ content: [...], isError: true }` instead of throwing exceptions

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
📚 Learning: 2026-03-31T03:47:14.870Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: templates/typescript-mcp/AGENTS.md:0-0
Timestamp: 2026-03-31T03:47:14.870Z
Learning: Applies to templates/typescript-mcp/packages/moosestack-service/app/apis/mcp.ts : Return user-friendly error messages in MCP tool responses instead of exposing internal error details or stack traces

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
📚 Learning: 2026-04-18T22:28:28.229Z
Learnt from: callicles
Repo: 514-labs/moosestack PR: 4024
File: apps/framework-cli/src/utilities/native_infra/mod.rs:524-550
Timestamp: 2026-04-18T22:28:28.229Z
Learning: In `apps/framework-cli/src/utilities/native_infra/mod.rs` (514-labs/moosestack), `kill_pid_file` uses `kill -0` to poll liveness after SIGTERM. Zombie state is unreachable here: ClickHouse and Temporal are spawned with `kill_on_drop(false)` and the `Child` handle is never retained across CLI invocations. A subsequent `moose dev`/`moose clean` is a different process, so the kernel reparents the orphaned child to PID 1 (launchd/init), which reaps zombies immediately. Do not suggest replacing `kill -0` with `ps -o stat=` zombie detection — it adds parsing overhead for a scenario that cannot occur.

Applied to files:

  • packages/ts-moose-lib/src/cluster-utils.ts
  • apps/framework-cli/src/utilities/system.rs
📚 Learning: 2026-04-22T18:10:52.989Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-cli/AGENTS.md:0-0
Timestamp: 2026-04-22T18:10:52.989Z
Learning: Applies to apps/framework-cli/**/*.rs : Prefer top-of-file `use` statements over inline ones in function bodies; add `use` imports for fully-qualified paths; local `use` for disambiguation (e.g., name collisions) is fine

Applied to files:

  • apps/framework-cli/src/cli.rs
📚 Learning: 2026-04-22T18:10:52.989Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-cli/AGENTS.md:0-0
Timestamp: 2026-04-22T18:10:52.989Z
Learning: Applies to apps/framework-cli/**/*.rs : Test error conditions and edge cases in unit and integration tests

Applied to files:

  • apps/framework-cli/src/utilities/system.rs
📚 Learning: 2026-04-10T15:30:49.933Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3933
File: apps/devkafka/src/groups.rs:65-87
Timestamp: 2026-04-10T15:30:49.933Z
Learning: In `apps/devkafka/src/groups.rs` (514-labs/moosestack PR `#3933`), `choose_protocol()` and `remove_member()` intentionally rely on nondeterministic HashMap iteration order for protocol selection and leader failover. This is acceptable for a dev-only broker with typically 1-2 consumers. Do not flag nondeterministic protocol selection or leader election in these methods in future reviews.

Applied to files:

  • apps/framework-cli/src/infrastructure/stream/kafka/models.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
📚 Learning: 2026-01-20T15:49:26.025Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3150
File: packages/py-moose-lib/moose_lib/dmv2/materialized_view.py:35-35
Timestamp: 2026-01-20T15:49:26.025Z
Learning: In the Python moose library (packages/py-moose-lib), prefer typing.Union[X, Y] syntax over PEP 604's X | Y for type annotations. Use Union when you want to explicitly convey a union in the type system, especially if readability or tooling friendliness (e.g., older type checkers) is a consideration. Reserve X | Y for simple, modern type hints where you want concise syntax and are sure the project’s type checker version supports PEP 604 without issues.

Applied to files:

  • packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py
📚 Learning: 2026-04-22T18:10:52.989Z
Learnt from: CR
Repo: 514-labs/moosestack PR: 0
File: apps/framework-cli/AGENTS.md:0-0
Timestamp: 2026-04-22T18:10:52.989Z
Learning: Applies to apps/framework-cli/**/*.rs : Document all public APIs and breaking changes

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/framework/core/infrastructure_map.rs
  • apps/framework-cli/src/metrics.rs
📚 Learning: 2026-04-09T15:11:17.424Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3936
File: apps/framework-cli/src/cli/local_webserver.rs:3088-3090
Timestamp: 2026-04-09T15:11:17.424Z
Learning: In 514-labs/moosestack, `moose dev --alpha` (NativeInfraProvider) starts ONLY ClickHouse and Temporal as native binaries — no Docker containers are launched at all (Kafka/Redis native support is not yet implemented). Skipping Docker teardown wholesale in `apps/framework-cli/src/cli/local_webserver.rs` when `project.use_native_infra == true` is intentional and correct for the current scope. This will need revisiting when native Kafka/Redis support is added.

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
📚 Learning: 2026-02-08T22:20:08.757Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:1590-1838
Timestamp: 2026-02-08T22:20:08.757Z
Learning: In apps/framework-cli/src/cli/routines/**/*.rs (Moose CLI): For HTTP-dependent functionality like `fetch_docs`, prefer E2E or integration tests over unit tests. Unit testing HTTP fetching adds little value compared to integration testing and can be flaky due to network dependencies.

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
📚 Learning: 2026-03-24T19:15:00.795Z
Learnt from: DatGuyJonathan
Repo: 514-labs/moosestack PR: 3830
File: apps/framework-cli/src/infrastructure/olap/clickhouse/mod.rs:1638-1644
Timestamp: 2026-03-24T19:15:00.795Z
Learning: In `apps/framework-cli/src/infrastructure/olap/clickhouse/mod.rs`, `rls_bootstrap` (public) and `parse_row_policy_filter` (private) are intentionally covered by E2E tests in `apps/framework-cli-e2e/test/row-level-security.test.ts` (the "Row-Level Security" suite) rather than by inline `#[cfg(test)]` unit tests. Do not request additional unit tests for these functions in this file.

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
📚 Learning: 2026-04-18T22:28:30.509Z
Learnt from: callicles
Repo: 514-labs/moosestack PR: 4024
File: apps/framework-cli/src/cli/local_webserver.rs:600-609
Timestamp: 2026-04-18T22:28:30.509Z
Learning: In `apps/framework-cli/src/cli/local_webserver.rs` (514-labs/moosestack), the consumption-API retry loop in `get_consumption_api_res` is intentionally scoped to connect errors only (via `is_connect_error`). If `execute()` succeeds, the upstream has committed to a response; retrying on body-read failures after headers are received is unsafe because the retried response could differ. Do not suggest extending retries to body/stream errors here.

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
📚 Learning: 2026-04-10T15:29:00.915Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3933
File: apps/framework-cli/src/utilities/native_infra/devkafka.rs:21-39
Timestamp: 2026-04-10T15:29:00.915Z
Learning: In `apps/framework-cli/src/utilities/native_infra/devkafka.rs` (514-labs/moosestack), `health_check` intentionally uses `std::net::TcpStream::connect_timeout` (blocking) because its callers run inside a synchronous polling loop. Do not flag the blocking TCP connect as a concern; the function is not called from within an async context directly.

Applied to files:

  • apps/framework-cli/src/cli/local_webserver.rs
  • apps/framework-cli/src/cli/routines/peek.rs
  • apps/framework-cli/docs/operator/metrics.md
  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/infrastructure/processes/functions_registry.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
📚 Learning: 2026-04-10T15:16:06.497Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3933
File: apps/devkafka/src/broker.rs:134-149
Timestamp: 2026-04-10T15:16:06.497Z
Learning: In `apps/devkafka/src/broker.rs` (514-labs/moosestack), `reap_expired_members()` called inside `spawn_reaper` only performs HashMap iteration and member removal with no unwraps or fallible operations. It cannot panic, so no `catch_unwind` guard is needed. This is an intentional design for a dev-only embedded broker. Do not flag the lack of panic handling in `spawn_reaper`.

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-02-06T01:42:35.875Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3462
File: apps/framework-cli/src/cli/routines/feedback.rs:107-119
Timestamp: 2026-02-06T01:42:35.875Z
Learning: In modules under apps/framework-cli/src/cli/routines (any Rust file under subdirectories) use raw println! for multi-line help text and formatted usage examples. Do not rely on display::show_message_wrapper for help output; reserve it for status messages and user-visible progress updates. Ensure help text remains readable and properly formatted.

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-02-07T04:42:46.739Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:40-48
Timestamp: 2026-02-07T04:42:46.739Z
Learning: In Rust files under apps/framework-cli/src/cli/routines/, prefer using inherent from_str methods for conversions when the conversion is only used in a single place and you are not using .parse() elsewhere. Avoid implementing std::str::FromStr purely for stylistic reasons if the conversion isn’t reused with .parse().

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-02-07T06:10:34.807Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:896-942
Timestamp: 2026-02-07T06:10:34.807Z
Learning: In apps/framework-cli/src/cli/routines/**/*.rs (Rust code for Moose CLI), variable shadowing is acceptable and idiomatic when progressively refining a value (e.g., from &str to String or from raw input to processed output). Consider this a preferred pattern in this codebase, but ensure shadowing is intentional and improves readability rather than obscuring the data flow.

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-02-08T22:20:04.881Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:1590-1838
Timestamp: 2026-02-08T22:20:04.881Z
Learning: For HTTP-dependent functionality under apps/framework-cli/src/cli/routines (e.g., fetch_docs), prefer E2E or integration tests over unit tests. Unit tests for HTTP fetching can be flaky due to network variability and typically add less value than integration tests. When adding tests in this directory, categorize tests as integration/E2E where they exercise real HTTP behavior or client-server interactions; mock or stub HTTP only when necessary to test internal logic, not the HTTP layer itself.

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-02-08T22:31:17.000Z
Learnt from: oatsandsugar
Repo: 514-labs/moosestack PR: 3468
File: apps/framework-cli/src/cli/routines/docs.rs:0-0
Timestamp: 2026-02-08T22:31:17.000Z
Learning: In Moose CLI (apps/framework-cli/src/cli/routines/...), when implementing raw output mode (--raw), ensure public entry points show_toc, fetch_page, search_toc, browse_docs return an empty Message (Message::new("".to_string(), "".to_string())) to avoid polluting piped output. Also wrap show_message! calls with if !raw guards to prevent intermediate messages from appearing in piped output. This pattern applies to all Rust files under apps/framework-cli/src/cli/routines/ and should be checked during reviews.

Applied to files:

  • apps/framework-cli/src/cli/routines/peek.rs
📚 Learning: 2026-04-12T12:58:18.735Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3954
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:2039-2042
Timestamp: 2026-04-12T12:58:18.735Z
Learning: 514-labs/moosestack — apps/framework-cli/src/framework/core/infrastructure_map.rs: InfrastructureMap::diff_dictionaries() uses dicts_equal_ignore_metadata(a, b) to ignore metadata when detecting dictionary updates. Treat metadata-only changes as no-ops in future reviews.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-03-24T20:17:43.845Z
Learnt from: DatGuyJonathan
Repo: 514-labs/moosestack PR: 3830
File: apps/framework-cli/src/cli/display/infrastructure.rs:675-677
Timestamp: 2026-03-24T20:17:43.845Z
Learning: In `apps/framework-cli/src/framework/core/infrastructure_map.rs` (514-labs/moosestack), renaming a `SelectRowPolicy` is always represented as a `Removed` + `Added` pair in `OlapChange`. The `Updated` variant only fires when non-name fields (column, claim, tables) change, so `before.name == after.name` is always true inside `OlapChange::SelectRowPolicy(Change::Updated { .. })`. Do not suggest rename-detection logic in that match arm.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-15T02:10:48.333Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3997
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:2015-2121
Timestamp: 2026-04-15T02:10:48.333Z
Learning: Repo 514-labs/moosestack — apps/framework-cli/src/framework/core/infrastructure_map.rs: diff_dictionaries() re-keys both self and target maps to canonical IDs via OlapDictionary::id(default_database) before diffing. Lifecycle gating: ExternallyManaged blocks CREATE/UPDATE, DeletionProtected blocks DROP. This prevents false Added/Removed when default_database differs.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-12T02:31:35.851Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3954
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:10256-10540
Timestamp: 2026-04-12T02:31:35.851Z
Learning: Repo 514-labs/moosestack — In apps/framework-cli/src/framework/core/infrastructure_map.rs, InfrastructureMap::diff_dictionaries() compares dictionaries by a canonical ID computed with default_database via OlapDictionary::id(default_database), preventing false Added/Removed when only default_database differs. Do not flag such cases as drift.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-14T16:21:40.959Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3997
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:0-0
Timestamp: 2026-04-14T16:21:40.959Z
Learning: Repo 514-labs/moosestack — apps/framework-cli/src/framework/core/infrastructure_map.rs: InfrastructureMap::diff_dictionaries() must gate UPDATE filtering on target_dict.life_cycle (not the current dict.life_cycle). ExternallyManaged blocks CREATE/UPDATE; DeletionProtected blocks DROP only.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-01-26T00:56:27.011Z
Learnt from: DatGuyJonathan
Repo: 514-labs/moosestack PR: 3400
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:1238-1298
Timestamp: 2026-01-26T00:56:27.011Z
Learning: Repo 514-labs/moosestack — Workflows: The CLI’s workflow diff is intended to detect only Temporal schedule–affecting changes. In apps/framework-cli/src/framework/core/infrastructure_map.rs, workflows_config_equal should compare schedule, retries, and timeout only; it must not include tasks. Task code/config changes are picked up automatically when the orchestration worker restarts and should not trigger a WorkflowChange.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-12T02:31:22.486Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3954
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:0-0
Timestamp: 2026-04-12T02:31:22.486Z
Learning: In 514-labs/moosestack (apps/framework-cli/src/framework/core/infrastructure_map.rs), InfrastructureMap::diff_dictionaries must match dictionaries by canonical ID computed via OlapDictionary::id(default_database) for both source and target maps. Do not compare using raw HashMap keys, as keys can differ after default-database rewrites.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-14T18:49:56.425Z
Learnt from: 514Ben
Repo: 514-labs/moosestack PR: 3997
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:4014-4021
Timestamp: 2026-04-14T18:49:56.425Z
Learning: Repo 514-labs/moosestack — apps/framework-cli/src/framework/core/infrastructure_map.rs: dicts_equal_ignore_metadata() now masks external-source credentials on both OlapDictionary clones via mask_dict_credentials() (in addition to clearing metadata) before comparison. mask_credentials_for_json_export() reuses mask_dict_credentials(), ensuring a single source of truth for dictionary credential redaction and preventing spurious UPDATEs after proto/Redis round-trips.

Applied to files:

  • apps/framework-cli/src/framework/core/infrastructure_map.rs
📚 Learning: 2026-04-08T00:08:36.251Z
Learnt from: phiSgr
Repo: 514-labs/moosestack PR: 3930
File: apps/framework-cli/src/cli/local_webserver.rs:3524-3535
Timestamp: 2026-04-08T00:08:36.251Z
Learning: In 514-labs/moosestack, the `/admin/inframap` and related `/admin/*` endpoints (e.g., `/admin/plan`, `/admin/integrate-changes`, `/admin/reality-check`) in `apps/framework-cli/src/cli/local_webserver.rs` are internal CLI-to-server communication endpoints. They are consumed programmatically by CLI routines (e.g., migration, remote-plan), not by end users. Do not flag missing framework-docs-v2 documentation for changes to these admin endpoints — the coding guideline requiring docs updates applies only to user-facing APIs, commands, flags, and configs.

Applied to files:

  • apps/framework-cli/docs/operator/metrics.md
📚 Learning: 2026-04-10T14:30:34.800Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3932
File: apps/framework-cli/src/cli.rs:0-0
Timestamp: 2026-04-10T14:30:34.800Z
Learning: In 514-labs/moosestack, `moose clean` calling `shutdown_embedded_servers()` in `apps/framework-cli/src/cli.rs` only closes in-memory handles in the current process. It cannot stop embedded servers (devredis, devkafka, ClickHouse/Temporal) owned by a separate `moose dev --alpha` process. Cross-process cleanup via PID files or a control socket is a known architectural gap shared across all embedded servers, intentionally deferred and out of scope for individual feature PRs. Do not flag this as a bug requiring immediate fix.

Applied to files:

  • apps/framework-cli/docs/operator/metrics.md
📚 Learning: 2026-04-10T15:16:06.242Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3933
File: apps/devkafka/src/handlers/fetch.rs:42-68
Timestamp: 2026-04-10T15:16:06.242Z
Learning: In 514-labs/moosestack, `apps/devkafka/src/handlers/fetch.rs` — `wait_any_notify` intentionally spawns one task per `Arc<Notify>` rather than using `futures::select_all`. This is an acceptable trade-off for the dev-only embedded Kafka broker (low partition count); adding a `futures` crate dependency just for `select_all` is not desired. Do not suggest replacing the spawn-based approach with `select_all` or similar combinators in future reviews.

Applied to files:

  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
  • apps/framework-cli/src/infrastructure/stream/kafka/client.rs
📚 Learning: 2026-03-04T00:09:14.325Z
Learnt from: phiSgr
Repo: 514-labs/moosestack PR: 3678
File: apps/framework-cli/src/infrastructure/olap/clickhouse/model.rs:483-483
Timestamp: 2026-03-04T00:09:14.325Z
Learning: In apps/framework-cli/src/infrastructure/olap/clickhouse/model.rs, `ClickHouseColumn` and related structs (e.g., `ClickHouseTable`, `ClickHouseIndex`) are internal CLI infrastructure types, not user-facing public APIs. Inline `//` comments are acceptable for their fields; the rustdoc `///` requirement does not apply to these internal structs.

Applied to files:

  • apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs
📚 Learning: 2026-04-10T14:28:37.548Z
Learnt from: LucioFranco
Repo: 514-labs/moosestack PR: 3932
File: apps/devredis/src/lib.rs:18-22
Timestamp: 2026-04-10T14:28:37.548Z
Learning: In 514-labs/moosestack, apps/devredis uses a crate-wide `pub type Error = Box<dyn std::error::Error + Send + Sync>` and `pub type Result<T>` alias intentionally. It is a dev-only embedded Redis server and the boxed error is acceptable here; do not flag it for thiserror/typed-error refactoring.

Applied to files:

  • apps/framework-cli/src/mcp/tools/sample_stream.rs
📚 Learning: 2026-03-04T22:31:33.997Z
Learnt from: cjus
Repo: 514-labs/moosestack PR: 3684
File: apps/framework-cli/src/framework/core/infrastructure_map.rs:0-0
Timestamp: 2026-03-04T22:31:33.997Z
Learning: Repo 514-labs/moosestack — In apps/framework-cli/src/framework/core/infrastructure_map.rs, InfrastructureMap::to_proto serializes select_row_policies and InfrastructureMap::from_proto deserializes them back into the map. Proto/Redis round-trips now preserve row policies; do not flag select_row_policies as being dropped on load in future reviews.

Applied to files:

  • apps/framework-cli/src/mcp/tools/sample_stream.rs
🪛 Ruff (0.15.11)
packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py

[warning] 67-74: Use contextlib.suppress(Exception) instead of try-except-pass

Replace try-except-pass with `with contextlib.suppress(Exception): ...`

(SIM105)


[error] 73-74: try-except-pass detected, consider logging the exception

(S110)


[warning] 73-73: Do not catch blind exception: Exception

(BLE001)


[warning] 700-700: Do not catch blind exception: Exception

(BLE001)
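The SIM105 finding above is mechanical to apply. A minimal Python sketch of the before/after (the `send` helper and reason label are hypothetical stand-ins, not the runner's actual code):

```python
import contextlib


def send(payload: dict) -> None:
    """Hypothetical network call that fails (stand-in for the real POST)."""
    raise ConnectionError("metrics endpoint unreachable")


# Before: the pattern Ruff flags (SIM105 / S110) - a silent try-except-pass.
def post_restart_reason_try(payload: dict) -> None:
    try:
        send(payload)
    except Exception:
        pass


# After: contextlib.suppress states the same intent in one statement.
def post_restart_reason_suppress(payload: dict) -> None:
    with contextlib.suppress(Exception):
        send(payload)


# Both variants swallow the failure instead of raising.
post_restart_reason_try({"reason": "crash"})
post_restart_reason_suppress({"reason": "crash"})
```

Note that `contextlib.suppress` only addresses SIM105; the S110/BLE001 complaints about silently catching bare `Exception` would still apply unless the except is narrowed or the failure is logged.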

🔇 Additional comments (6)
apps/framework-cli/src/mcp/tools/sample_stream.rs (1)

19-19: LGTM — purpose plumbing is consistent.

Import and create_consumer(..., PURPOSE_MCP_SAMPLE_CONSUMER) call match the new 3-arg signature and mirror the peek.rs update.

Also applies to: 323-327

apps/framework-cli/src/utilities/system.rs (1)

16-31: LGTM — classification + emission are clean.

Exit reason derived once from the borrowed result (no move), then should_restart consumes the owned value. Labels align with the metric series.

Also applies to: 175-205
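The classify-once-then-consume split praised above can be modeled in Python for illustration (the reason labels here are hypothetical, not the CLI's actual label set):

```python
from typing import Optional


def classify_exit(code: Optional[int]) -> str:
    """Derive the restart-metric reason once from a worker's exit status."""
    if code is None:
        return "killed_by_signal"  # hypothetical label for signal-terminated
    return "clean_exit" if code == 0 else "nonzero_exit"


def should_restart(reason: str) -> bool:
    """Decide restart from the already-derived reason, not the raw status."""
    return reason != "clean_exit"


reason = classify_exit(1)
restart = should_restart(reason)
```

Deriving the label first and threading it through both the metric emission and the restart decision keeps the two consumers guaranteed to agree on the classification.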

packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py (1)

66-77: Endpoint contract verified as correct. The /metrics-logs endpoint (lines 1350–1377 of local_webserver.rs) properly deserializes {"reason": "..."} into MetricEvent::FunctionWorkerRestart { reason }, explicitly handles it (line 1367), and always returns 200 OK without panicking. The variant is filtered from the sink batch as expected.
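The wire contract described above is just a bare single-field JSON object. A Python sketch of the payload shape (illustrative only; the reason value is hypothetical and the real workers POST this over HTTP to /metrics-logs):

```python
import json


# The worker serializes a bare {"reason": ...} object; the server's
# untagged enum matches it on the single "reason" field, no wrapper tag.
def build_restart_event(reason: str) -> str:
    return json.dumps({"reason": reason})


body = build_restart_event("nonzero_exit")  # hypothetical reason label
decoded = json.loads(body)
```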

apps/framework-cli/src/cli.rs (1)

774-776: No action needed. The execution order is safe: run_local_infrastructure_with_timeout and redis_client setup happen before Metrics::new. The window between Metrics::new (line 757) and set_global_metrics_handle (line 777) contains only start_listening_to_metrics.await, which doesn't spawn tasks or call record_* functions. Additionally, with_global_metrics gracefully handles a missing global handle by becoming a no-op.

> Likely an incorrect or invalid review comment.
packages/ts-moose-lib/src/cluster-utils.ts (1)

44-67: Payload format is correct; no parse failure will occur.

The Rust enum at line 49 uses #[serde(untagged)], and the FunctionWorkerRestart variant (lines 83–85) accepts exactly the structure being posted: { reason: String }. Untagged enums deserialize struct variants without a wrapper tag, so the bare { reason } payload matches the wire format precisely.

> Likely an incorrect or invalid review comment.
apps/framework-cli/src/infrastructure/processes/kafka_clickhouse_sync.rs (1)

34-37: LGTM — purpose tags wired through consistently.

Both the topic-to-topic and topic-to-table paths now thread PURPOSE_SYNC_CONSUMER / PURPOSE_SYNC_PRODUCER through the factories; import set is tight.

Also applies to: 484-490, 596-601

Comment on lines +72 to +79
## Kill-switch — `MOOSE_KAFKA_CLIENT_METRICS_DISABLED`

Setting this env var to a non-empty value disables `kafka_client_gauge`
instrumentation for the process. Used to rule out metric-tracking as a
CPU/memory regression source during rollout. The
`function_worker_restarts_total` and `function_process_diff_updated_total`
counters are **not** gated by this switch — they're cheap and do not
touch the hot Kafka client path.

⚠️ Potential issue | 🟡 Minor

Doc overstates the kill-switch condition.

kafka_client_tracking_enabled() in metrics.rs is std::env::var(KAFKA_METRICS_DISABLED_ENV).is_err(), so the switch trips on any set value — including empty string. Either tighten the check to reject empty (is_ok_and(|v| !v.is_empty())) or drop "non-empty" from the doc.

📝 Proposed doc tweak
-Setting this env var to a non-empty value disables `kafka_client_gauge`
-instrumentation for the process. Used to rule out metric-tracking as a
+Setting this env var to any value (including empty) disables `kafka_client_gauge`
+instrumentation for the process. Used to rule out metric-tracking as a
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change — before:

> ## Kill-switch — `MOOSE_KAFKA_CLIENT_METRICS_DISABLED`
> Setting this env var to a non-empty value disables `kafka_client_gauge`
> instrumentation for the process. Used to rule out metric-tracking as a
> CPU/memory regression source during rollout. The
> `function_worker_restarts_total` and `function_process_diff_updated_total`
> counters are **not** gated by this switch — they're cheap and do not
> touch the hot Kafka client path.

After:

> ## Kill-switch — `MOOSE_KAFKA_CLIENT_METRICS_DISABLED`
> Setting this env var to any value (including empty) disables `kafka_client_gauge`
> instrumentation for the process. Used to rule out metric-tracking as a
> CPU/memory regression source during rollout. The
> `function_worker_restarts_total` and `function_process_diff_updated_total`
> counters are **not** gated by this switch — they're cheap and do not
> touch the hot Kafka client path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/docs/operator/metrics.md` around lines 72 - 79, The doc is
incorrect: kafka_client_tracking_enabled() in metrics.rs currently treats any
set value (including empty) as disabling metrics; change the implementation in
metrics.rs (function kafka_client_tracking_enabled and reference
KAFKA_METRICS_DISABLED_ENV) to reject empty strings by using a check like
std::env::var(KAFKA_METRICS_DISABLED_ENV).is_ok_and(|v| !v.is_empty()), so the
kill-switch only trips for non-empty values (or alternatively update the doc to
remove "non-empty" if you prefer doc-only alignment).

Comment on lines +4274 to +4368
#[tokio::test]
async fn metrics_endpoint_e2e_exposes_new_churn_observability_series() {
use crate::metrics::{Metrics, TelemetryMetadata};
use hyper::service::service_fn;
use std::convert::Infallible;
use std::time::Duration;
use tokio::net::TcpListener;

let (metrics, rx) = Metrics::new(
TelemetryMetadata {
machine_id: "smoke".to_string(),
is_moose_developer: false,
metric_labels: None,
metric_endpoints: None,
is_production: false,
project_name: "smoke".to_string(),
export_metrics: false,
},
None,
);
let metrics = Arc::new(metrics);
metrics.start_listening_to_metrics(rx).await;

metrics.kafka_client_created("smoke_producer");
metrics.kafka_client_created("smoke_producer");
metrics.kafka_client_dropped("smoke_producer");
metrics.function_worker_restart("smoke_reason".to_string());
metrics.function_process_diff_updated("forced_always");
metrics.function_process_diff_updated("forced_always");

let listener = TcpListener::bind("127.0.0.1:0").await.unwrap();
let addr = listener.local_addr().unwrap();

let metrics_for_server = metrics.clone();
let server_task = tokio::spawn(async move {
loop {
let (stream, _) = match listener.accept().await {
Ok(pair) => pair,
Err(_) => return,
};
let io = TokioIo::new(stream);
let metrics_clone = metrics_for_server.clone();
tokio::spawn(async move {
let service = service_fn(move |_req: Request<Incoming>| {
let m = metrics_clone.clone();
async move {
let resp = metrics_route(m).await.unwrap();
Ok::<_, Infallible>(resp)
}
});
let _ = auto::Builder::new(TokioExecutor::new())
.serve_connection(io, service)
.await;
});
}
});

tokio::time::sleep(Duration::from_millis(100)).await;

let url = format!("http://{addr}/metrics");
let client = reqwest::Client::builder()
.timeout(Duration::from_secs(5))
.build()
.unwrap();
let resp = client.get(&url).send().await.expect("GET /metrics failed");
assert_eq!(resp.status().as_u16(), 200);
let body = resp.text().await.expect("read body");

assert!(
body.contains("# TYPE moose_kafka_client_gauge gauge"),
"missing kafka gauge TYPE header in /metrics body:\n{body}"
);
assert!(
body.contains(r#"moose_kafka_client_gauge{purpose="smoke_producer"} 1"#),
"expected gauge value of 1 after 2 creates + 1 drop:\n{body}"
);
assert!(
body.contains("# TYPE moose_function_worker_restarts counter"),
"missing worker restarts TYPE header:\n{body}"
);
assert!(
body.contains(r#"moose_function_worker_restarts_total{reason="smoke_reason"} 1"#),
"expected worker restart counter == 1:\n{body}"
);
assert!(
body.contains("# TYPE moose_function_process_diff_updated counter"),
"missing diff updated TYPE header:\n{body}"
);
assert!(
body.contains(r#"moose_function_process_diff_updated_total{reason="forced_always"} 2"#),
"expected diff counter == 2:\n{body}"
);

server_task.abort();
}

⚠️ Potential issue | 🟡 Minor

Document the new /metrics series.

This adds user-visible metrics surface, but the PR context does not include a matching apps/framework-docs-v2/content update for these new series/labels.

As per coding guidelines, **/framework-cli/src/**: "If any user-facing changes were made ... verify the documentation for THAT SPECIFIC feature is updated in apps/framework-docs-v2/content".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/cli/local_webserver.rs` around lines 4274 - 4368, The
test introduces three new public Prometheus series (moose_kafka_client_gauge
with label purpose, moose_function_worker_restarts_total with label reason, and
moose_function_process_diff_updated_total with label reason) surfaced by
Metrics/Metrics::new and exposed by metrics_route; update the user docs in
apps/framework-docs-v2/content to document each series name, type (gauge or
counter), label meanings (purpose, reason), example metric lines (e.g.
moose_kafka_client_gauge{purpose="smoke_producer"} 1), and any related
configuration or opt-out notes so the new observability surface is discoverable;
also ensure the docs index/TOC includes this metrics page.

Comment on lines +1507 to +1518
let reason = if process == target_process {
"no_change"
} else {
"forced_always"
};
crate::metrics::record_function_process_diff_updated(reason);

tracing::debug!(
"FunctionProcess updated (forced, diff_reason={}): {}",
reason,
id
);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify current reason taxonomy and metric wiring for FunctionProcess diff updates.
fd -i 'metrics.rs' apps/framework-cli/src --exec rg -n "function_process_diff_updated|record_function_process_diff_updated|reason|no_change|forced_always|changed" {}
fd -i 'metrics_inserter.rs' apps/framework-cli/src --exec rg -n "FunctionProcessDiffUpdated|reason|no_change|forced_always|changed" {}
rg -n --type=rust "record_function_process_diff_updated|FunctionProcessDiffUpdated|no_change|forced_always|changed" apps/framework-cli/src

Repository: 514-labs/moosestack

Length of output: 40534


🏁 Script executed:

# View the diff_function_processes function around lines 1507-1518
sed -n '1480,1530p' apps/framework-cli/src/framework/core/infrastructure_map.rs

Repository: 514-labs/moosestack

Length of output: 2036


🏁 Script executed:

# Search for function signature and understand what this function does
rg -B 20 "let reason = if process == target_process" apps/framework-cli/src/framework/core/infrastructure_map.rs | head -60

Repository: 514-labs/moosestack

Length of output: 869


🏁 Script executed:

# Check if there are any comments or documentation about FunctionProcess diff semantics
rg -B 5 -A 5 "diff_function_processes|FunctionProcess.*diff" apps/framework-cli/src/framework/core/infrastructure_map.rs | head -80

Repository: 514-labs/moosestack

Length of output: 1243


🏁 Script executed:

# Check documentation/comments about what "forced_always" means and FunctionProcess restart semantics
rg -B 10 -A 5 "forced.*always|forced.*update.*process|FunctionProcess.*restart|restart.*process" apps/framework-cli/src/framework/core/infrastructure_map.rs | head -120

Repository: 514-labs/moosestack

Length of output: 722


🏁 Script executed:

# Check if there are similar patterns in orchestration/workflow diffs to understand the naming convention
rg -B 3 -A 3 "record_function_process_diff_updated|OrchestrationWorker.*diff|Workflow.*diff.*reason" apps/framework-cli/src/framework/core/infrastructure_map.rs | head -100

Repository: 514-labs/moosestack

Length of output: 330


🏁 Script executed:

# Look at the test to understand the metric intent
sed -n '8249,8300p' apps/framework-cli/src/framework/core/infrastructure_map.rs

Repository: 514-labs/moosestack

Length of output: 2377


🏁 Script executed:

# Look at the full test that uses "forced_always" to understand test expectations
sed -n '658,690p' apps/framework-cli/src/metrics.rs

Repository: 514-labs/moosestack

Length of output: 1337


🏁 Script executed:

# Check the full test_diff_function_process_no_changes_triggers_update to understand intent
sed -n '8249,8290p' apps/framework-cli/src/framework/core/infrastructure_map.rs

Repository: 514-labs/moosestack

Length of output: 2020


🏁 Script executed:

# Check if there are comments about why FunctionProcess is always forced to update
rg -B 20 "diff_function_processes" apps/framework-cli/src/framework/core/infrastructure_map.rs | grep -A 20 "fn diff_function_processes"

Repository: 514-labs/moosestack

Length of output: 94


🏁 Script executed:

# Search for all usages of the reason labels to see if they're hardcoded in tests/monitoring
rg "\"no_change\"|\"forced_always\"" apps/framework-cli/src --type=rust

Repository: 514-labs/moosestack

Length of output: 948


🏁 Script executed:

# Check if there's any documentation about what these metric reasons should mean
rg -B 5 -A 10 "function_process_diff_updated|FunctionProcessDiffUpdated" apps/framework-cli/src/metrics.rs | grep -A 15 "pub fn function_process_diff_updated"

Repository: 514-labs/moosestack

Length of output: 717


🏁 Script executed:

# Check if there are any monitoring dashboards or prometheus config that hardcode "forced_always"
rg "forced_always" apps/framework-cli --type=rust --type=yaml --type=json

Repository: 514-labs/moosestack

Length of output: 856


🏁 Script executed:

# Look at the git history or comments to understand why "forced_always" was chosen
git log --oneline --all -S "forced_always" -- apps/framework-cli/src/framework/core/infrastructure_map.rs | head -5

Repository: 514-labs/moosestack

Length of output: 101


🏁 Script executed:

# Read more context around the test to understand what the intent is
sed -n '8249,8330p' apps/framework-cli/src/framework/core/infrastructure_map.rs

Repository: 514-labs/moosestack

Length of output: 3600


🏁 Script executed:

# Check what the commit f4786f50 was about
git show f4786f50 --stat | head -30

Repository: 514-labs/moosestack

Length of output: 1635


🏁 Script executed:

# Check if there are any other test files that might reference these reason labels
find apps/framework-cli -name "*.rs" -type f -exec grep -l "forced_always\|no_change" {} \;

Repository: 514-labs/moosestack

Length of output: 319


🏁 Script executed:

# Check the broader context of how FunctionProcess updates are actually used/applied
rg -B 5 -A 10 "ProcessChange::FunctionProcess.*Updated" apps/framework-cli/src --type=rust | head -80

Repository: 514-labs/moosestack

Length of output: 5267


🏁 Script executed:

# Verify if the function worker actually restarts based on this diff (is it truly "forced_always"?)
rg -B 3 -A 3 "function_processes|FunctionProcess.*change" apps/framework-cli/src/framework/core/plan.rs | head -100

Repository: 514-labs/moosestack

Length of output: 45


Clarify reason labels to reflect actual vs identical diffs.

Line 1510 labels process != target_process as "forced_always", which obscures that the process config actually differs. For observability, the metric should distinguish whether a restart is due to an actual change or forced by the diff logic despite no change.

Proposed fix
-                let reason = if process == target_process {
-                    "no_change"
-                } else {
-                    "forced_always"
-                };
+                let reason = if process == target_process {
+                    "forced_no_change"
+                } else {
+                    "changed"
+                };

Update the metric tests in apps/framework-cli/src/metrics.rs (lines 673–679) and apps/framework-cli/src/cli/local_webserver.rs (lines 4301–4363) to use "changed" instead of "forced_always".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/framework/core/infrastructure_map.rs` around lines
1507 - 1518, The metric label for the restart reason is misleading: change the
branch that sets reason when process != target_process from "forced_always" to
"changed" (the variables involved are process, target_process, and reason) so
crate::metrics::record_function_process_diff_updated(reason) emits "changed" for
actual config differences; keep the tracing::debug call but update any tests
that assert the old label to expect "changed" instead of "forced_always" in the
metrics assertions.

Comment on lines +35 to +55
pub const PURPOSE_INGEST_PRODUCER: &str = "ingest_producer";
pub const PURPOSE_IDEMPOTENT_PRODUCER: &str = "idempotent_producer";
pub const PURPOSE_SYNC_PRODUCER: &str = "sync_producer";
pub const PURPOSE_SYNC_CONSUMER: &str = "sync_consumer";
pub const PURPOSE_PEEK_CONSUMER: &str = "peek_consumer";
pub const PURPOSE_MCP_SAMPLE_CONSUMER: &str = "mcp_sample_consumer";
pub const PURPOSE_HEALTH_PROBE: &str = "health_probe";
pub const PURPOSE_FETCH_TOPICS_CONSUMER: &str = "fetch_topics_consumer";
pub const PURPOSE_FETCH_TOPICS_ADMIN: &str = "fetch_topics_admin";
pub const PURPOSE_CHECK_TOPIC_SIZE_CONSUMER: &str = "check_topic_size_consumer";
pub const PURPOSE_ADMIN_ADD_PARTITIONS: &str = "admin_add_partitions";
pub const PURPOSE_ADMIN_UPDATE_TOPIC_CONFIG: &str = "admin_update_topic_config";
pub const PURPOSE_ADMIN_CREATE_TOPICS: &str = "admin_create_topics";
pub const PURPOSE_ADMIN_DELETE_TOPICS: &str = "admin_delete_topics";
pub const PURPOSE_ADMIN_DESCRIBE_TOPIC_CONFIG: &str = "admin_describe_topic_config";
// Option A (plan §0.6.7): function worker Kafka handles (TS/Python) are not
// rdkafka, so we approximate their presence in the gauge with one tick per
// parallel worker instance. Treat this as a proxy for "an external worker is
// alive and holding at least one Kafka socket"; actual socket count per
// worker is tracked separately on the language runtime side if/when needed.
pub const PURPOSE_FUNCTION_WORKER_ESTIMATED: &str = "function_worker_estimated";

🧹 Nitpick | 🔵 Trivial

Consider narrowing purpose constants to a newtype to prevent cardinality drift.

Right now any &'static str satisfies purpose: &'static str, so a typo'd literal at a call site silently creates a new gauge series. A tiny newtype (pub struct KafkaClientPurpose(&'static str);) with only the PURPOSE_* constructors — or a non-exhaustive enum — would give you a compile-time allowlist and match the coding guideline preference for newtype wrappers. Not a blocker for the PR, but cheap insurance while the label set is still small.
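A compile-checked sketch of that newtype under the stated assumption: the constant names mirror the PR's `PURPOSE_*` set, but this is illustrative, not the actual `client.rs` code.

```rust
// Opaque wrapper: only the constants below construct values, so a typo'd
// literal at a call site is a compile error instead of a new gauge series.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct KafkaClientPurpose(&'static str);

impl KafkaClientPurpose {
    // Inner string is exposed read-only, for metric label emission.
    pub fn as_str(&self) -> &'static str {
        self.0
    }
}

pub const PURPOSE_INGEST_PRODUCER: KafkaClientPurpose =
    KafkaClientPurpose("ingest_producer");
pub const PURPOSE_SYNC_CONSUMER: KafkaClientPurpose =
    KafkaClientPurpose("sync_consumer");

// Call sites accept the newtype rather than &'static str (hypothetical signature).
fn record_kafka_client_created(purpose: KafkaClientPurpose) -> &'static str {
    purpose.as_str()
}

fn main() {
    assert_eq!(
        record_kafka_client_created(PURPOSE_SYNC_CONSUMER),
        "sync_consumer"
    );
}
```

In the real crate the tuple field would sit in its own module so downstream code cannot construct `KafkaClientPurpose("typo")` directly.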

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/infrastructure/stream/kafka/client.rs` around lines 35
- 55, Replace the free-form &'static str purpose labels with a tiny newtype to
prevent accidental new labels: declare pub struct KafkaClientPurpose(&'static
str) with a private constructor (or make the struct non-exhaustive) and expose
only the existing PURPOSE_* values as constants of type KafkaClientPurpose
(e.g., PURPOSE_INGEST_PRODUCER: KafkaClientPurpose). Change any APIs that
currently accept purpose: &'static str to accept KafkaClientPurpose (look for
usages of purpose and functions/methods that take a &'static str purpose) so
callers must use the predefined constants; keep the inner string accessible only
for metrics/label emission when needed. Ensure the newtype derives
Copy/Clone/Debug/Eq/PartialEq/Hash as appropriate.

Comment on lines +68 to +72
impl Drop for KafkaClientTracker {
fn drop(&mut self) {
record_kafka_client_dropped(self.purpose);
}
}

⚠️ Potential issue | 🟡 Minor

`Drop`-driven `record_kafka_client_dropped` → `try_send` can fire from any thread/runtime context.

KafkaClientTracker::drop ends up calling Metrics::try_send_metric_event through the global handle. If a KafkaClientHandle is dropped after the tokio runtime hosting start_listening_to_metrics has shut down (e.g., during process teardown), try_send on a closed channel returns Err which is let _'d — fine. But be aware that any Drop running on a worker thread during shutdown will also skip the decrement. Combined with the bounded-channel concern raised in metrics.rs, this is why I recommend the dedicated Gauge handle path instead of routing lifecycle events through the mpsc.
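A std-only sketch of the recommended dedicated-gauge path, assuming a single global gauge for brevity (the real subsystem would keep one per `purpose` label; names are hypothetical):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::OnceLock;

// An atomic gauge cell needs no live tokio runtime or channel receiver,
// so a Drop on any worker thread during teardown still lands.
static KAFKA_CLIENT_GAUGE: OnceLock<AtomicI64> = OnceLock::new();

fn gauge() -> &'static AtomicI64 {
    KAFKA_CLIENT_GAUGE.get_or_init(|| AtomicI64::new(0))
}

struct KafkaClientTracker;

impl KafkaClientTracker {
    fn new() -> Self {
        gauge().fetch_add(1, Ordering::Relaxed);
        KafkaClientTracker
    }
}

impl Drop for KafkaClientTracker {
    fn drop(&mut self) {
        // No mpsc send here: the decrement cannot be lost to a closed channel.
        gauge().fetch_sub(1, Ordering::Relaxed);
    }
}

fn main() {
    let a = KafkaClientTracker::new();
    let b = KafkaClientTracker::new();
    drop(a);
    assert_eq!(gauge().load(Ordering::Relaxed), 1);
    drop(b);
    assert_eq!(gauge().load(Ordering::Relaxed), 0);
}
```

The scrape path then reads the atomic directly instead of replaying lifecycle events from the channel.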

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/infrastructure/stream/kafka/client.rs` around lines 68
- 72, KafkaClientTracker::drop calls record_kafka_client_dropped which
ultimately routes through Metrics::try_send_metric_event and a global mpsc;
during runtime shutdown the receiver may be closed and drops on worker threads
can silently fail to decrement the gauge. Change the lifecycle path so drops
update a dedicated Gauge handle (or otherwise use an atomic/lock-protected gauge
API) instead of sending over the bounded mpsc: locate KafkaClientTracker::drop
and record_kafka_client_dropped and replace the try_send-based decrement with a
direct call to the Gauge decrement method (or atomic decrement) provided by the
metrics subsystem used by start_listening_to_metrics, falling back to a no-op if
the gauge handle is unavailable to avoid panics during shutdown.

Comment on lines +684 to +695
#[test]
fn tracking_disabled_env_suppresses_events() {
let prev = std::env::var(KAFKA_METRICS_DISABLED_ENV).ok();
std::env::set_var(KAFKA_METRICS_DISABLED_ENV, "1");
assert!(!kafka_client_tracking_enabled());
record_kafka_client_created("should_not_panic");
record_kafka_client_dropped("should_not_panic");
match prev {
Some(v) => std::env::set_var(KAFKA_METRICS_DISABLED_ENV, v),
None => std::env::remove_var(KAFKA_METRICS_DISABLED_ENV),
}
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm the toolchain rustc version used by this crate to gauge set_var unsafety risk.
fd -t f 'rust-toolchain*' apps/framework-cli -x cat {}
rg -nP '^\s*edition\s*=' apps/framework-cli/Cargo.toml

Repository: 514-labs/moosestack

Length of output: 81


🏁 Script executed:

# Check global rust-toolchain at repo root
find . -maxdepth 1 -name 'rust-toolchain*' -type f -exec cat {} \;

# Check .github/workflows or CI config for Rust version
find .github -name '*.yml' -o -name '*.yaml' | head -5 | xargs grep -l 'rust' 2>/dev/null | head -1 | xargs cat | grep -A 5 -B 5 'rust\|toolchain' | head -30

# Check if Cargo.toml specifies MSRV
rg 'rust-version|msrv' Cargo.toml --ignore-case

Repository: 514-labs/moosestack

Length of output: 378


🏁 Script executed:

# Look for other env::set_var or env::remove_var usage in test code
rg 'env::(set_var|remove_var)' --type rs -B 2 -A 5

Repository: 514-labs/moosestack

Length of output: 89


🏁 Script executed:

# Check the test configuration - look for test settings in Cargo.toml or setup
rg '\[\[test\]\]|\[profile\.test\]' Cargo.toml -A 5

Repository: 514-labs/moosestack

Length of output: 45


🏁 Script executed:

# Fix: rg needs -t rust or just search .rs files directly
rg 'env::(set_var|remove_var)' apps/framework-cli -t rust -B 2 -A 2

Repository: 514-labs/moosestack

Length of output: 8240


🏁 Script executed:

# Check if cargo test runs serially or in parallel
rg 'test-threads|num_test_threads' apps/framework-cli/Cargo.toml

Repository: 514-labs/moosestack

Length of output: 45


🏁 Script executed:

# Look at the metrics.rs test module to understand test isolation
head -n 700 apps/framework-cli/src/metrics.rs | tail -n 50

Repository: 514-labs/moosestack

Length of output: 1906


🏁 Script executed:

# Check if serial_test is already a dependency
rg 'serial_test' apps/framework-cli/Cargo.toml

Repository: 514-labs/moosestack

Length of output: 106


Wrap env::set_var/remove_var calls in unsafe blocks and add test isolation.

On Rust 1.94.0, std::env::set_var and std::env::remove_var are unsafe fn, and this test calls them without unsafe blocks. Additionally, mutating process-global env state without isolation risks races when tests run in parallel. Use #[serial_test::serial] (already in Cargo.toml) or refactor kafka_client_tracking_enabled() to accept an injectable parameter.
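The injectable-parameter option can be sketched as a pure helper that never touches process-global state, so tests need neither `unsafe` nor serialization (function names here are hypothetical refactors, not the current code):

```rust
// Pure core: takes the env value directly. Mirrors the current is_err()
// semantics, where any set value — including "" — disables tracking.
fn tracking_enabled_from(env_value: Option<&str>) -> bool {
    env_value.is_none()
}

// Thin shim used by production code paths.
fn kafka_client_tracking_enabled() -> bool {
    tracking_enabled_from(
        std::env::var("MOOSE_KAFKA_CLIENT_METRICS_DISABLED")
            .ok()
            .as_deref(),
    )
}

fn main() {
    // Tests exercise the pure core without mutating the environment.
    assert!(tracking_enabled_from(None));
    assert!(!tracking_enabled_from(Some("1")));
    assert!(!tracking_enabled_from(Some(""))); // current semantics
    let _ = kafka_client_tracking_enabled();
}
```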

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/metrics.rs` around lines 684 - 695, The test
tracking_disabled_env_suppresses_events mutates process-global env vars using
std::env::set_var/remove_var which are unsafe on Rust 1.94.0 and causes race
conditions; wrap those calls in unsafe blocks and mark the test with
#[serial_test::serial] to prevent parallel execution, or alternatively refactor
kafka_client_tracking_enabled to accept an injectable env value and call that
from the test; ensure you update tracking_disabled_env_suppresses_events to call
unsafe { std::env::set_var(KAFKA_METRICS_DISABLED_ENV, "1") } and unsafe {
std::env::set_var(...) / std::env::remove_var(...) } and add the serial_test
attribute so record_kafka_client_created and record_kafka_client_dropped run in
isolation.

Comment on lines +343 to +362
#[test]
fn classify_rust_child_exit_reasons() {
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

let ok: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(0));
assert_eq!(classify_rust_child_exit(&ok), "rust_child_exit_ok");

let err_code: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(256));
assert_eq!(
classify_rust_child_exit(&err_code),
"rust_child_exit_err_code"
);

let signal: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(9));
assert_eq!(classify_rust_child_exit(&signal), "rust_child_exit_signal");

let wait_err: std::io::Result<ExitStatus> = Err(std::io::Error::other("wait failure"));
assert_eq!(classify_rust_child_exit(&wait_err), "rust_child_wait_err");
}

🧹 Nitpick | 🔵 Trivial

Test is Unix-only; gate the module/test.

std::os::unix::process::ExitStatusExt won't compile on Windows. While framework-cli Windows support is best-effort, an ungated #[test] still breaks cargo test on Windows. Add #[cfg(unix)].

🔧 Proposed fix
+    #[cfg(unix)]
     #[test]
     fn classify_rust_child_exit_reasons() {
         use std::os::unix::process::ExitStatusExt;
📝 Committable suggestion


Suggested change
#[test]
fn classify_rust_child_exit_reasons() {
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;
let ok: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(0));
assert_eq!(classify_rust_child_exit(&ok), "rust_child_exit_ok");
let err_code: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(256));
assert_eq!(
classify_rust_child_exit(&err_code),
"rust_child_exit_err_code"
);
let signal: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(9));
assert_eq!(classify_rust_child_exit(&signal), "rust_child_exit_signal");
let wait_err: std::io::Result<ExitStatus> = Err(std::io::Error::other("wait failure"));
assert_eq!(classify_rust_child_exit(&wait_err), "rust_child_wait_err");
}
#[cfg(unix)]
#[test]
fn classify_rust_child_exit_reasons() {
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;
let ok: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(0));
assert_eq!(classify_rust_child_exit(&ok), "rust_child_exit_ok");
let err_code: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(256));
assert_eq!(
classify_rust_child_exit(&err_code),
"rust_child_exit_err_code"
);
let signal: std::io::Result<ExitStatus> = Ok(ExitStatus::from_raw(9));
assert_eq!(classify_rust_child_exit(&signal), "rust_child_exit_signal");
let wait_err: std::io::Result<ExitStatus> = Err(std::io::Error::other("wait failure"));
assert_eq!(classify_rust_child_exit(&wait_err), "rust_child_wait_err");
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/framework-cli/src/utilities/system.rs` around lines 343 - 362, The test
classify_rust_child_exit_reasons uses std::os::unix::process::ExitStatusExt and
must be compiled only on Unix; add a Unix-only configuration attribute (e.g.,
#[cfg(unix)]) to the test (or the containing module) so Windows builds don't try
to compile it. Locate the test function classify_rust_child_exit_reasons and
apply #[cfg(unix)] (or gate the whole test module) to prevent Windows
compilation errors while keeping the test runnable on Unix.

Comment on lines +59 to +77
_worker_exit_reason: dict[str, str] = {"reason": "py_worker_exit_code_0"}


def _set_worker_exit_reason(reason: str) -> None:
_worker_exit_reason["reason"] = reason


def _report_worker_exit() -> None:
try:
requests.post(
f"http://localhost:{moose_management_port}/metrics-logs",
json={"reason": _worker_exit_reason["reason"]},
timeout=0.5,
)
except Exception:
pass


atexit.register(_report_worker_exit)

🧹 Nitpick | 🔵 Trivial

Minor: use contextlib.suppress and scope the guard.

Ruff flags the bare try/except/pass (SIM105/S110). Swallowing here is intentional (atexit must not raise), but contextlib.suppress(Exception) reads better and the BLE001 noise stops.

🔧 Proposed fix
+import contextlib
...
 def _report_worker_exit() -> None:
-    try:
-        requests.post(
+    with contextlib.suppress(Exception):
+        requests.post(
             f"http://localhost:{moose_management_port}/metrics-logs",
             json={"reason": _worker_exit_reason["reason"]},
             timeout=0.5,
         )
-    except Exception:
-        pass
🧰 Tools
🪛 Ruff (0.15.11)

[warning] 67-74: Use contextlib.suppress(Exception) instead of try-except-pass

Replace try-except-pass with with contextlib.suppress(Exception): ...

(SIM105)


[error] 73-74: try-except-pass detected, consider logging the exception

(S110)


[warning] 73-73: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py`
around lines 59 - 77, Replace the bare try/except/pass in _report_worker_exit
with contextlib.suppress(Exception) to explicitly and narrowly suppress
exceptions; import contextlib if missing, wrap only the requests.post call (and
its timeout) inside the suppress block so atexit.register(_report_worker_exit)
cannot raise, and keep updating _worker_exit_reason via _set_worker_exit_reason
unchanged.

Comment on lines 695 to 703
def shutdown(signum, frame):
"""Handle shutdown signals gracefully"""
log("Received shutdown signal, cleaning up...")
try:
name = signal.Signals(signum).name
except Exception:
name = f"signal_{signum}"
_set_worker_exit_reason(f"py_worker_killed_by_{name}")
running.clear() # This will trigger the main loop to exit

⚠️ Potential issue | 🟠 Major

Signal-based exit reason is clobbered in finally.

Flow on SIGTERM/SIGINT/etc.:

  1. shutdown() sets reason → py_worker_killed_by_SIGTERM.
  2. running.clear() → main loop exits.
  3. finally block unconditionally sets reason → py_worker_exit_code_0 (since fatal_error is typically not set on signal-triggered shutdown).
  4. atexit reports py_worker_exit_code_0 — the signal context is lost.

Only overwrite when no signal reason was recorded, or treat a set signal reason as sticky.

🔧 Proposed fix
-        exit_code = 1 if fatal_error.is_set() else 0
-        _set_worker_exit_reason(
-            "py_worker_exit_code_nonzero" if exit_code != 0 else "py_worker_exit_code_0"
-        )
+        exit_code = 1 if fatal_error.is_set() else 0
+        # Preserve signal-based reason if one was already set by shutdown().
+        if not _worker_exit_reason["reason"].startswith("py_worker_killed_by_"):
+            _set_worker_exit_reason(
+                "py_worker_exit_code_nonzero" if exit_code != 0 else "py_worker_exit_code_0"
+            )

Also applies to: 768-773

🧰 Tools
🪛 Ruff (0.15.11)

[warning] 695-695: Missing return type annotation for private function shutdown

Add return type annotation: None

(ANN202)


[warning] 695-695: Unused function argument: frame

(ARG001)


[warning] 700-700: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/py-moose-lib/moose_lib/streaming/streaming_function_runner.py`
around lines 695 - 703, The shutdown signal handler (shutdown) sets a
signal-based reason via _set_worker_exit_reason but the finally block later
unconditionally overwrites it with a default exit-code reason; change the
finally logic so it only sets the default py_worker_exit_code_... when no signal
reason is already recorded. Concretely, in the finally block (where you
currently call _set_worker_exit_reason for exit codes) query the existing reason
(add or use a getter like _get_worker_exit_reason or make
_set_worker_exit_reason accept an overwrite flag) and skip setting the exit-code
reason if the reason already starts with "py_worker_killed_by_" or is non-empty,
leaving the signal-set reason sticky after running.clear()/shutdown().

Comment on lines +202 to 208
try {
reportWorkerExit(code, signal);
} catch {}

if (!this.shutdownInProgress) {
setTimeout(() => cluster.fork(), RESTART_TIME_MS);
}

⚠️ Potential issue | 🟠 Major

Don't count clean shutdown exits as restarts.

This reports before the shutdownInProgress guard, so deliberate SIGTERM/SIGINT exits also increment moose_function_worker_restarts_total.

Proposed fix
-      try {
-        reportWorkerExit(code, signal);
-      } catch {}
-
       if (!this.shutdownInProgress) {
+        try {
+          reportWorkerExit(code, signal);
+        } catch {}
         setTimeout(() => cluster.fork(), RESTART_TIME_MS);
       }
📝 Committable suggestion


Suggested change
try {
reportWorkerExit(code, signal);
} catch {}
if (!this.shutdownInProgress) {
setTimeout(() => cluster.fork(), RESTART_TIME_MS);
}
if (!this.shutdownInProgress) {
try {
reportWorkerExit(code, signal);
} catch {}
setTimeout(() => cluster.fork(), RESTART_TIME_MS);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ts-moose-lib/src/cluster-utils.ts` around lines 202 - 208, The code
currently calls reportWorkerExit(...) before checking this.shutdownInProgress,
which causes clean shutdowns to be counted as restarts (incrementing
moose_function_worker_restarts_total); change the flow so that you first check
this.shutdownInProgress and only call reportWorkerExit and schedule cluster.fork
(setTimeout(() => cluster.fork(), RESTART_TIME_MS)) when shutdownInProgress is
false, or alter reportWorkerExit to accept a flag and skip incrementing the
restart metric when shutdownInProgress is true; update the call site(s) (the
block around reportWorkerExit, shutdownInProgress, cluster.fork, and
RESTART_TIME_MS) accordingly.
