diff --git a/docs/about-nemo-relay/release-notes/highlights.mdx b/docs/about-nemo-relay/release-notes/highlights.mdx
index cef089033..86108f901 100644
--- a/docs/about-nemo-relay/release-notes/highlights.mdx
+++ b/docs/about-nemo-relay/release-notes/highlights.mdx
@@ -9,67 +9,75 @@ SPDX-License-Identifier: Apache-2.0 */}

 This page summarizes the notable capabilities in the current release documentation set.

-## NeMo Relay 0.3
-
-NeMo Relay 0.3 adds first-party guardrails, richer trajectory export, adaptive
-plugin workflows, and clearer integration support for agent runtimes.
-
-### Breaking Changes
-
-- The project was renamed to NeMo Relay across documentation, package guidance,
-  and CLI-facing surfaces.
-- The core runtime registry surface was narrowed. Applications should use the
-  documented middleware, intercept, subscriber, and plugin APIs instead of broad
-  registry access.
-- Native subscriber delivery is now non-blocking. Event construction remains
-  synchronous, but subscriber callbacks and exporter work are queued on a
-  process-wide background dispatcher. Applications and tests that depend on
-  subscriber side effects must call the subscriber flush API before reading
-  captured events, files, or exported trace output.
-
-### Guardrails
-
-- Added the built-in `nemo_guardrails` plugin contract for installing NeMo
-  Guardrails behavior through NeMo Relay plugin configuration.
-- Added a remote NeMo Guardrails backend for deployments that call an external
-  guardrails service.
-- Added CLI editor support for guardrails plugin configuration.
-- Added guardrail scopes for conditional guardrails so trace output shows
-  guardrail execution boundaries more clearly.
-- Integrated security guardrails around managed agent calls.
+## NVIDIA NeMo Relay 0.4

-### Observability
+NVIDIA NeMo Relay 0.4 adds host plugin installation, first-party PII redaction, local
+Guardrails execution, pricing-aware observability, streaming raw-event export,
+and stronger coding-agent trace fidelity.
+
+### Compatibility Notes
+
+- The Python-only NeMo Guardrails example plugin was removed. Applications
+  should use the built-in `nemo_guardrails` component instead.
+- Pricing estimates require an inline, file-backed, or embedding-provided
+  pricing source. NeMo Relay does not ship provider price data by default.
+- Local NeMo Guardrails mode requires Python 3.11 or newer and
+  `nemoguardrails==0.22.0` in the runtime environment.
+
+### CLI And Plugin Configuration

-- Exposed canonical ATOF event JSON across bindings so applications can consume
-  the same event shape from Rust, Python, and Node.js.
-- Upgraded ATIF exporters to ATIF v1.7 with nested subagent support.
-- Added first-class S3-compatible storage export for ATIF traces.
-- Added streaming LLM chunk marks for more precise streaming trace inspection.
-- Fixed LLM start event ordering so managed LLM start events are emitted before
-  execution intercepts.
-- Fixed ATIF tool-observation correlation.
+- Added `nemo-relay install`, `uninstall`, and `doctor --plugin` flows for
+  Claude Code and Codex host plugins.
+- Added generated local marketplace support so Claude Code and Codex can load
+  NeMo Relay through native host plugin mechanisms.
+- Layered code-driven plugin configuration over materialized global, project,
+  and user plugin files while preserving documented file precedence.
+- Added pricing source management commands for validating catalogs, adding
+  file-backed sources, and resolving model pricing.

-### Plugins And Adaptive Runtime
+### Guardrails And Redaction
+
+- Added a local Python-backed backend for the built-in `nemo_guardrails`
+  component.
+- Expanded Guardrails configuration docs for local and remote backend behavior,
+  supported codecs, request defaults, tool boundaries, and streaming behavior.
+- Added the first-party PII redaction plugin crate with deterministic local
+  backend support and Python, Node.js, and WebAssembly helper surfaces.
+
+### Observability

-- Enabled the adaptive plugin for the CLI and OpenClaw workflows.
-- Added CLI editor support for adaptive plugin configuration.
-- Added a Python context manager for plugin initialization and teardown.
+- Added ATOF streaming endpoints for HTTP POST, WebSocket, and long-lived
+  NDJSON uploads.
+- Added ATIF HTTP storage export alongside S3-compatible storage destinations.
+- Added model-pricing lookup and cost layering for managed LLM responses.
+- Propagated normalized LLM cost into ATIF step metrics, OpenInference
+  attributes, and OpenTelemetry attributes.
+- Set OpenTelemetry span status from NeMo Relay execution results.
+- Preserved sanitized LLM request payloads in annotations and start-event
+  output.

 ### Integrations

-- Improved LangChain serialization for wrapped requests and responses.
-- Preserved OpenClaw tool call replay visibility with the upgrade to OpenClaw 2026.5.26
-- Updated the launch banner to use `NEMO RELAY`.
+- Improved Hermes hook injection, routed-provider observability, wrapped ATIF
+  fidelity, subagent lineage, and error-path export consistency.
+- Added Codex observability contract coverage and suppressed noisy Claude Code
+  lifecycle events.
+- Added nested subagent session lineage and more consistent cost, provenance,
+  replay, placeholder, and hook-only fallback exports for OpenClaw.
+- Improved LangChain input serialization and fixed plugin context-manager
+  deadlock behavior.
+- Added flattened OpenInference LLM attributes for annotations and replay.
+- Annotated Deep Agents model responses and refreshed LangGraph callback
+  coverage.

 ### Documentation And Tooling

-- Switched the documentation site to Fern and consolidated Fern publishing.
-- Added broken-link validation for Fern documentation.
-- Added agent runtime primer, trace incident runbook, plugin-building, migration,
-  and adaptive tuning guidance.
-- Added built-in guardrails plugin documentation.
-- Added CI path filters, CLI draft release assets, dependency updates, and ATIF
-  S3 storage test coverage.
+- Updated installation documentation for 0.4 packages and host plugin setup.
+- Clarified Hermes integration paths and coding-agent plugin source manifests.
+- Added NVSkills CI request workflow and regenerated NeMo Relay skill eval
+  datasets.
+- Updated dependency attribution files and isolated Rust test behavior for CI
+  and SonarQube cleanup.

 The complete changelog and release artifacts can be viewed on
 [GitHub Releases](https://github.com/NVIDIA/NeMo-Relay/releases).
diff --git a/docs/about-nemo-relay/release-notes/index.mdx b/docs/about-nemo-relay/release-notes/index.mdx
index 12e534f74..e22b0764d 100644
--- a/docs/about-nemo-relay/release-notes/index.mdx
+++ b/docs/about-nemo-relay/release-notes/index.mdx
@@ -12,18 +12,21 @@ tag-specific notes.

 ## Current Release

-NeMo Relay 0.3 focuses on the renamed NeMo Relay runtime, plugin-driven agent
-observability, guardrails integration, adaptive behavior for coding agents, and
-ATIF v1.7 trace export.
+NeMo Relay 0.4 focuses on host plugin installation, first-party PII redaction,
+local NeMo Guardrails execution, pricing-aware LLM response annotations,
+streaming raw ATOF export, remote ATIF storage, and stronger coding-agent trace
+fidelity.

 The most important compatibility notes are:

-- The project and package-facing documentation now use the NeMo Relay name.
-- The core runtime registry surface has been narrowed. Code that registered
-  runtime behavior through older broad registry entry points should move to the
-  documented middleware, subscriber, and plugin APIs.
-- Node.js 24 or newer is now the minimum supported Node.js version.
-- Native subscriber delivery is now non-blocking. Code that depends on
+- The Python-only NeMo Guardrails example plugin has been removed. Use the
+  built-in `nemo_guardrails` component for supported Guardrails integration.
+- Pricing estimates require configured pricing sources. NeMo Relay does not
+  ship provider price data by default.
+- Local NeMo Guardrails mode depends on Python 3.11 or newer and an available
+  `nemoguardrails` runtime.
+- Node.js 24 or newer remains the minimum supported Node.js version.
+- Native subscriber delivery remains asynchronous. Code that depends on
   subscriber callback side effects, exporter output, or deterministic test
   assertions must call the subscriber flush API.

@@ -35,29 +38,33 @@ PR-by-PR changelog, release artifacts, and tag-specific history, use

 This release includes:

-- Built-in NeMo Guardrails plugin support, including a remote backend and CLI
-  editor support.
-- Canonical ATOF event JSON exposure across Rust, Python, and Node.js bindings.
-- ATIF v1.7 exporter updates for nested subagents, tool-observation
-  correlation, and first-class S3-compatible storage export.
-- Adaptive plugin support for the CLI and OpenClaw integration, including
-  editor support for adaptive configuration.
-- Streaming LLM chunk marks and improved guardrail scope emission.
-- Python plugin context-manager ergonomics.
-- Non-blocking native subscriber delivery with flush APIs across Rust, Python,
-  Node.js, FFI/Go, and WebAssembly parity.
-- LangChain serialization fixes and OpenClaw trace replay visibility fixes.
-- Fern documentation publishing, link validation, and updated guidance for
-  agent runtime concepts, trace incident response, plugin building, and adaptive
-  tuning.
+- Built-in NeMo Guardrails plugin support, including remote and local
+  Python-backed backends.
+- CLI installation, uninstallation, and doctor flows for Claude Code and Codex
+  host plugins.
+- Code-driven plugin configuration layered over materialized global, project,
+  and user plugin files.
+- First-party PII redaction plugin helpers across Rust, Python, Node.js, and
+  WebAssembly surfaces.
+- Model-pricing catalogs and LLM cost annotations that propagate into ATIF,
+  OpenInference, and OpenTelemetry output.
+- ATOF streaming endpoints for HTTP POST, WebSocket, and NDJSON collectors.
+- ATIF remote storage support for HTTP endpoints in addition to S3-compatible
+  storage.
+- Trace fidelity fixes for Hermes, OpenClaw, LangChain, LangGraph, and Deep
+  Agents integrations.

 ## Feature Documentation

-For the major 0.3 additions, start with:
+For the major 0.4 additions, start with:

+- [Plugin Installation](/nemo-relay-cli/plugin-installation)
 - [Observability Plugin](/observability-plugin/about)
+- [Agent Trajectory Observability Format (ATOF)](/observability-plugin/atof)
 - [Agent Trajectory Interchange Format (ATIF)](/observability-plugin/atif)
+- [Provider Response Codecs And Pricing](/integrate-into-frameworks/provider-response-codecs)
 - [NeMo Guardrails Plugin](/nemo-guardrails-plugin/about)
-- [Adaptive Plugin](/adaptive-plugin/about)
+- [PII Redaction API Reference](/reference/api/python-library-reference/pii-redaction)
 - [OpenClaw Plugin Guide](/supported-integrations/openclaw-plugin)
+- [Hermes CLI Guide](/nemo-relay-cli/hermes)
 - [LangChain Integration Guide](/supported-integrations/langchain)
diff --git a/docs/about-nemo-relay/release-notes/known-issues.mdx b/docs/about-nemo-relay/release-notes/known-issues.mdx
index 550ad78ae..8ead12a6f 100644
--- a/docs/about-nemo-relay/release-notes/known-issues.mdx
+++ b/docs/about-nemo-relay/release-notes/known-issues.mdx
@@ -9,10 +9,10 @@ SPDX-License-Identifier: Apache-2.0 */}

 This page lists current limitations and support notes for the release documentation set.

-## NeMo Relay 0.3
+## NeMo Relay 0.4

-These notes apply to the NeMo Relay 0.3 release. The following known issues
-and limitations apply to NeMo Relay 0.3:
+These notes apply to the NeMo Relay 0.4 release. The following known issues
+and limitations apply to NeMo Relay 0.4:

 - Go, WebAssembly, and the raw C FFI surface are experimental and source-first.
 - Generated API pages cover Rust, Python, and Node.js. Experimental bindings do
@@ -20,6 +20,11 @@ and limitations apply to NeMo Relay 0.3:
 - The NeMo Relay CLI is experimental. Coding agent observability support varies
   due to capabilities of hooks. Any encountered problems should be filed as
   bugs.
+- Host plugin mode depends on each coding-agent host's plugin, hook, and
+  provider-routing behavior. Hooks alone cannot produce complete LLM request
+  and response spans.
+- Complete first-request capture in Codex plugin mode depends on Codex firing
+  an installed hook before the first provider request.
 - Node.js 24 or newer is required for Node.js binding and package workflows.
 - OpenClaw support uses public hook-backed telemetry with partial security and
   optimization support. Security is limited to pre-tool conditional guardrails,
@@ -27,8 +32,23 @@ and limitations apply to NeMo Relay 0.3:
   a managed execution path.
 - The NeMo Guardrails plugin remote backend depends on the availability,
   latency, and policy behavior of the configured remote service.
+- The NeMo Guardrails plugin local backend starts a Python worker subprocess.
+  That worker environment must provide Python 3.11 or newer and
+  `nemoguardrails==0.22.0`; `SUPPORTED_NEMOGUARDRAILS_VERSION` in
+  `crates/core/src/plugins/nemo_guardrails/local_worker.py` is the
+  authoritative pin.
+- The PII redaction plugin currently provides deterministic local backend
+  support. Local-model backend configuration is reserved for future expansion.
+- Pricing estimates depend on configured pricing sources and the freshness of
+  the source catalog. Unknown model pricing and missing token data leave cost
+  absent instead of defaulting to zero.
+- ATOF streaming endpoints depend on collector availability. Failed endpoints
+  are skipped or retried without blocking file output or other configured
+  endpoints.
 - S3-compatible ATIF export requires valid storage credentials and endpoint
   configuration in the runtime environment.
+- Remote ATIF storage requires valid destination credentials and endpoint
+  configuration in the runtime environment.
 - `LLMRequest` objects in the Python binding should be treated as immutable.
   Request middleware that changes content should return a new request object.
 - Native subscriber callbacks are delivered asynchronously. Flush subscribers
@@ -36,15 +56,27 @@ and limitations apply to NeMo Relay 0.3:
   exporter output. Deregistering a subscriber affects future emissions, but
   callbacks from already-queued event snapshots may still run.

-### Fixed in NeMo Relay 0.3
+### Fixed in NeMo Relay 0.4

-- Managed LLM start events are emitted before execution intercepts.
-- Coding-agent trace scopes are aligned with NeMo Relay agent scope semantics.
-- ATIF tool observations are correlated with their matching tool calls.
-- OpenClaw tool call replay visibility is preserved.
+- ATIF shutdown no longer deadlocks queued subscribers.
+- Sanitized LLM requests are resolved from annotations for observability output.
+- Structured ATIF tool results and Hermes tool-result observations are
+  preserved more reliably.
+- Hermes routed-provider spans, wrapped ATIF fidelity, subagent lineage, and
+  error-path export consistency are covered and corrected.
+- OpenClaw observability output is more consistent for nested subagents, model
+  timing diagnostics, hook-backed provenance, placeholder replay, and
+  hook-only fallback exports.
 - LangChain serialization handles wrapped integration payloads more reliably.
+- Plugin context-manager teardown avoids the previous deadlock path.
+- Node.js `withScope` callbacks receive a real `ScopeHandle`.
+- Deep Agents model responses are annotated for downstream observability.

 ### Fixed in Earlier Releases

+- Managed LLM start events are emitted before execution intercepts.
+- Coding-agent trace scopes are aligned with NeMo Relay agent scope semantics.
+- ATIF tool observations are correlated with their matching tool calls.
+- OpenClaw tool call replay visibility is preserved.
 - Enabled TLS support for OTLP HTTP export.
 - Preserved Go scope stacks across OS threads.