110 changes: 59 additions & 51 deletions docs/about-nemo-relay/release-notes/highlights.mdx
@@ -9,67 +9,75 @@ SPDX-License-Identifier: Apache-2.0 */}
This page summarizes the notable capabilities in the current release
documentation set.

## NeMo Relay 0.3

NeMo Relay 0.3 adds first-party guardrails, richer trajectory export, adaptive
plugin workflows, and clearer integration support for agent runtimes.

### Breaking Changes

- The project was renamed to NeMo Relay across documentation, package guidance,
and CLI-facing surfaces.
- The core runtime registry surface was narrowed. Applications should use the
documented middleware, intercept, subscriber, and plugin APIs instead of broad
registry access.
- Native subscriber delivery is now non-blocking. Event construction remains
synchronous, but subscriber callbacks and exporter work are queued on a
process-wide background dispatcher. Applications and tests that depend on
subscriber side effects must call the subscriber flush API before reading
captured events, files, or exported trace output.

### Guardrails

- Added the built-in `nemo_guardrails` plugin contract for installing NeMo
Guardrails behavior through NeMo Relay plugin configuration.
- Added a remote NeMo Guardrails backend for deployments that call an external
guardrails service.
- Added CLI editor support for guardrails plugin configuration.
- Added guardrail scopes for conditional guardrails so trace output shows
guardrail execution boundaries more clearly.
- Integrated security guardrails around managed agent calls.
## NVIDIA NeMo Relay 0.4

### Observability
NVIDIA NeMo Relay 0.4 adds host plugin installation, first-party PII redaction, local
Guardrails execution, pricing-aware observability, streaming raw-event export,
and stronger coding-agent trace fidelity.

### Compatibility Notes

- The Python-only NeMo Guardrails example plugin was removed. Applications
should use the built-in `nemo_guardrails` component instead.
- Pricing estimates require an inline, file-backed, or embedding-provided
pricing source. NeMo Relay does not ship provider price data by default.
- Local NeMo Guardrails mode requires Python 3.11 or newer and
`nemoguardrails==0.22.0` in the runtime environment.
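The local-mode requirement above can be checked before enabling the backend. The following is a minimal sketch using only the Python standard library; `check_local_guardrails_env` is a hypothetical helper name, not part of NeMo Relay, and the pinned version is copied from the note above.

```python
import sys
from importlib import metadata

REQUIRED_PYTHON = (3, 11)
REQUIRED_NEMOGUARDRAILS = "0.22.0"  # pin quoted in the compatibility note


def check_local_guardrails_env(python_version=None, installed_version=None):
    """Return a list of problems; an empty list means the environment looks OK.

    Hypothetical helper for illustration. Both arguments are optional
    overrides so the check can be exercised without touching the real
    environment.
    """
    problems = []

    py = tuple(python_version or sys.version_info[:2])
    if py < REQUIRED_PYTHON:
        problems.append(f"Python {py[0]}.{py[1]} is older than 3.11")

    if installed_version is None:
        try:
            installed_version = metadata.version("nemoguardrails")
        except metadata.PackageNotFoundError:
            installed_version = ""

    if not installed_version:
        problems.append("nemoguardrails is not installed")
    elif installed_version != REQUIRED_NEMOGUARDRAILS:
        problems.append(
            f"nemoguardrails {installed_version} found, "
            f"expected {REQUIRED_NEMOGUARDRAILS}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_local_guardrails_env():
        print("problem:", problem)
```

Run the check with the same interpreter that the Guardrails runtime will use, because a different virtual environment can report a different installed version.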

### CLI And Plugin Configuration

- Exposed canonical ATOF event JSON across bindings so applications can consume
the same event shape from Rust, Python, and Node.js.
- Upgraded ATIF exporters to ATIF v1.7 with nested subagent support.
- Added first-class S3-compatible storage export for ATIF traces.
- Added streaming LLM chunk marks for more precise streaming trace inspection.
- Fixed LLM start event ordering so managed LLM start events are emitted before
execution intercepts.
- Fixed ATIF tool-observation correlation.
- Added `nemo-relay install`, `uninstall`, and `doctor --plugin` flows for
Claude Code and Codex host plugins.
- Added generated local marketplace support so Claude Code and Codex can load
NeMo Relay through native host plugin mechanisms.
- Layered code-driven plugin configuration over materialized global, project,
and user plugin files while preserving documented file precedence.
- Added pricing source management commands for validating catalogs, adding
file-backed sources, and resolving model pricing.

### Plugins And Adaptive Runtime
### Guardrails And Redaction

- Added a local Python-backed backend for the built-in `nemo_guardrails`
component.
- Expanded Guardrails configuration docs for local and remote backend behavior,
supported codecs, request defaults, tool boundaries, and streaming behavior.
- Added the first-party PII redaction plugin crate with deterministic local
backend support and Python, Node.js, and WebAssembly helper surfaces.

### Observability

- Enabled the adaptive plugin for the CLI and OpenClaw workflows.
- Added CLI editor support for adaptive plugin configuration.
- Added a Python context manager for plugin initialization and teardown.
- Added ATOF streaming endpoints for HTTP POST, WebSocket, and long-lived
NDJSON uploads.
- Added ATIF HTTP storage export alongside S3-compatible storage destinations.
- Added model-pricing lookup and cost layering for managed LLM responses.
- Propagated normalized LLM cost into ATIF step metrics, OpenInference
attributes, and OpenTelemetry attributes.
- Set OpenTelemetry span status from NeMo Relay execution results.
- Preserved sanitized LLM request payloads in annotations and start-event
output.
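The model-pricing lookup mentioned in this list can be pictured with a small sketch. It assumes the behavior described in the known-issues notes for this release, where unknown model pricing or missing token data leaves cost absent instead of defaulting to zero. The catalog shape, field names, prices, and `estimate_cost` function are all hypothetical and do not reflect the NeMo Relay pricing schema.

```python
from typing import Optional

# Hypothetical inline catalog: USD per one million tokens. Values are made up.
PRICING = {
    "example-model": {"input_per_mtok": 2.0, "output_per_mtok": 8.0},
}


def estimate_cost(
    model: str,
    input_tokens: Optional[int],
    output_tokens: Optional[int],
    catalog: dict = PRICING,
) -> Optional[float]:
    """Return an estimated cost in USD, or None when it cannot be computed."""
    entry = catalog.get(model)
    if entry is None or input_tokens is None or output_tokens is None:
        # Absent, not zero: a zero would look like a real, free request.
        return None
    total = (
        input_tokens * entry["input_per_mtok"]
        + output_tokens * entry["output_per_mtok"]
    )
    return total / 1_000_000


known = estimate_cost("example-model", 1_000_000, 500_000)
unknown = estimate_cost("unlisted-model", 1_000, 1_000)
```

Returning `None` keeps downstream aggregations honest: a dashboard can count requests with no cost data separately instead of silently summing zeros.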

### Integrations

- Improved LangChain serialization for wrapped requests and responses.
- Preserved OpenClaw tool call replay visibility with the upgrade to OpenClaw 2026.5.26.
- Updated the launch banner to use `NEMO RELAY`.
- Improved Hermes hook injection, routed-provider observability, wrapped ATIF
fidelity, subagent lineage, and error-path export consistency.
- Added Codex observability contract coverage and suppressed noisy Claude Code
lifecycle events.
- Added nested subagent session lineage and more consistent cost, provenance,
replay, placeholder, and hook-only fallback exports for OpenClaw.
- Improved LangChain input serialization and fixed plugin context-manager
deadlock behavior.
- Added flattened OpenInference LLM attributes for annotations and replay.
- Annotated Deep Agents model responses and refreshed LangGraph callback
coverage.

### Documentation And Tooling

- Switched the documentation site to Fern and consolidated Fern publishing.
- Added broken-link validation for Fern documentation.
- Added agent runtime primer, trace incident runbook, plugin-building, migration,
and adaptive tuning guidance.
- Added built-in guardrails plugin documentation.
- Added CI path filters, CLI draft release assets, dependency updates, and ATIF
S3 storage test coverage.
- Updated installation documentation for 0.4 packages and host plugin setup.
- Clarified Hermes integration paths and coding-agent plugin source manifests.
- Added NVSkills CI request workflow and regenerated NeMo Relay skill eval
datasets.
- Updated dependency attribution files and isolated Rust test behavior for CI
and SonarQube cleanup.

The complete changelog and release artifacts can be viewed on
[GitHub Releases](https://github.com/NVIDIA/NeMo-Relay/releases).
59 changes: 33 additions & 26 deletions docs/about-nemo-relay/release-notes/index.mdx
@@ -12,18 +12,21 @@ tag-specific notes.

## Current Release

NeMo Relay 0.3 focuses on the renamed NeMo Relay runtime, plugin-driven agent
observability, guardrails integration, adaptive behavior for coding agents, and
ATIF v1.7 trace export.
NeMo Relay 0.4 focuses on host plugin installation, first-party PII redaction,
local NeMo Guardrails execution, pricing-aware LLM response annotations,
streaming raw ATOF export, remote ATIF storage, and stronger coding-agent trace
fidelity.

The most important compatibility notes are:

- The project and package-facing documentation now use the NeMo Relay name.
- The core runtime registry surface has been narrowed. Code that registered
runtime behavior through older broad registry entry points should move to the
documented middleware, subscriber, and plugin APIs.
- Node.js 24 or newer is now the minimum supported Node.js version.
- Native subscriber delivery is now non-blocking. Code that depends on
- The Python-only NeMo Guardrails example plugin has been removed. Use the
built-in `nemo_guardrails` component for supported Guardrails integration.
- Pricing estimates require configured pricing sources. NeMo Relay does not
ship provider price data by default.
- Local NeMo Guardrails mode depends on Python 3.11 or newer and an available
`nemoguardrails` runtime.
- Node.js 24 or newer remains the minimum supported Node.js version.
- Native subscriber delivery remains asynchronous. Code that depends on
subscriber callback side effects, exporter output, or deterministic test
assertions must call the subscriber flush API.
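The flush requirement above follows from how queued delivery works in general. The sketch below is a toy dispatcher built on the Python standard library; `BackgroundDispatcher` and its methods are illustrative stand-ins and are not the NeMo Relay subscriber API, whose names and signatures are documented elsewhere.

```python
import queue
import threading


class BackgroundDispatcher:
    """Toy stand-in for a background dispatcher; not the NeMo Relay API."""

    def __init__(self):
        self._queue = queue.Queue()
        self._subscribers = []
        threading.Thread(target=self._run, daemon=True).start()

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def emit(self, event):
        # Queue the event with a snapshot of subscribers, then return at once.
        self._queue.put((event, list(self._subscribers)))

    def flush(self):
        # Block until every queued event has been delivered.
        self._queue.join()

    def _run(self):
        while True:
            event, subscribers = self._queue.get()
            try:
                for callback in subscribers:
                    callback(event)
            finally:
                self._queue.task_done()


dispatcher = BackgroundDispatcher()
captured = []
dispatcher.subscribe(captured.append)

for i in range(3):
    dispatcher.emit({"seq": i})

dispatcher.flush()  # without this, `captured` may still be incomplete
seqs = [event["seq"] for event in captured]
```

In a test, the equivalent of `flush()` belongs immediately before the first assertion on captured events, files, or exporter output.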

@@ -35,29 +38,33 @@ PR-by-PR changelog, release artifacts, and tag-specific history, use

This release includes:

- Built-in NeMo Guardrails plugin support, including a remote backend and CLI
editor support.
- Canonical ATOF event JSON exposure across Rust, Python, and Node.js bindings.
- ATIF v1.7 exporter updates for nested subagents, tool-observation
correlation, and first-class S3-compatible storage export.
- Adaptive plugin support for the CLI and OpenClaw integration, including
editor support for adaptive configuration.
- Streaming LLM chunk marks and improved guardrail scope emission.
- Python plugin context-manager ergonomics.
- Non-blocking native subscriber delivery with flush APIs across Rust, Python,
Node.js, FFI/Go, and WebAssembly parity.
- LangChain serialization fixes and OpenClaw trace replay visibility fixes.
- Fern documentation publishing, link validation, and updated guidance for
agent runtime concepts, trace incident response, plugin building, and adaptive
tuning.
- Built-in NeMo Guardrails plugin support, including remote and local
Python-backed backends.
- CLI installation, uninstallation, and doctor flows for Claude Code and Codex
host plugins.
- Code-driven plugin configuration layered over materialized global, project,
and user plugin files.
- First-party PII redaction plugin helpers across Rust, Python, Node.js, and
WebAssembly surfaces.
- Model-pricing catalogs and LLM cost annotations that propagate into ATIF,
OpenInference, and OpenTelemetry output.
- ATOF streaming endpoints for HTTP POST, WebSocket, and NDJSON collectors.
- ATIF remote storage support for HTTP endpoints in addition to S3-compatible
storage.
- Trace fidelity fixes for Hermes, OpenClaw, LangChain, LangGraph, and Deep
Agents integrations.

## Feature Documentation

For the major 0.3 additions, start with:
For the major 0.4 additions, start with:

- [Plugin Installation](/nemo-relay-cli/plugin-installation)
- [Observability Plugin](/observability-plugin/about)
- [Agent Trajectory Observability Format (ATOF)](/observability-plugin/atof)
- [Agent Trajectory Interchange Format (ATIF)](/observability-plugin/atif)
- [Provider Response Codecs And Pricing](/integrate-into-frameworks/provider-response-codecs)
- [NeMo Guardrails Plugin](/nemo-guardrails-plugin/about)
- [Adaptive Plugin](/adaptive-plugin/about)
- [PII Redaction API Reference](/reference/api/python-library-reference/pii-redaction)
- [OpenClaw Plugin Guide](/supported-integrations/openclaw-plugin)
- [Hermes CLI Guide](/nemo-relay-cli/hermes)
- [LangChain Integration Guide](/supported-integrations/langchain)
48 changes: 40 additions & 8 deletions docs/about-nemo-relay/release-notes/known-issues.mdx
@@ -9,42 +9,74 @@ SPDX-License-Identifier: Apache-2.0 */}
This page lists current limitations and support notes for the release
documentation set.

## NeMo Relay 0.3
## NeMo Relay 0.4

These notes apply to the NeMo Relay 0.3 release. The following known issues
and limitations apply to NeMo Relay 0.3:
The following known issues and limitations apply to the NeMo Relay 0.4
release:

- Go, WebAssembly, and the raw C FFI surface are experimental and source-first.
- Generated API pages cover Rust, Python, and Node.js. Experimental bindings do
not yet have the same generated documentation depth.
- The NeMo Relay CLI is experimental. Coding-agent observability support varies
with the capabilities of each host's hooks. Report any problems you encounter
as bugs.
- Host plugin mode depends on each coding-agent host's plugin, hook, and
provider-routing behavior. Hooks alone cannot produce complete LLM request
and response spans.
- Complete first-request capture in Codex plugin mode depends on Codex firing
an installed hook before the first provider request.
- Node.js 24 or newer is required for Node.js binding and package workflows.
- OpenClaw support uses public hook-backed telemetry with partial security and
optimization support. Security is limited to pre-tool conditional guardrails,
and optimization is limited to adaptive telemetry unless the integration owns
a managed execution path.
- The NeMo Guardrails plugin remote backend depends on the availability,
latency, and policy behavior of the configured remote service.
- The NeMo Guardrails plugin local backend starts a Python worker subprocess.
That worker environment must provide Python 3.11 or newer and
`nemoguardrails==0.22.0`; `SUPPORTED_NEMOGUARDRAILS_VERSION` in
`crates/core/src/plugins/nemo_guardrails/local_worker.py` is the
authoritative pin.
- The PII redaction plugin currently provides deterministic local backend
support. Local-model backend configuration is reserved for future expansion.
- Pricing estimates depend on configured pricing sources and the freshness of
the source catalog. Unknown model pricing and missing token data leave cost
absent instead of defaulting to zero.
- ATOF streaming endpoints depend on collector availability. Failed endpoints
are skipped or retried without blocking file output or other configured
endpoints.
- S3-compatible ATIF export requires valid storage credentials and endpoint
configuration in the runtime environment.
- Remote ATIF storage requires valid destination credentials and endpoint
configuration in the runtime environment.
- `LLMRequest` objects in the Python binding should be treated as immutable.
Request middleware that changes content should return a new request object.
- Native subscriber callbacks are delivered asynchronously. Flush subscribers
before relying on callback side effects, captured event lists, files, or
exporter output. Deregistering a subscriber affects future emissions, but
callbacks from already-queued event snapshots may still run.
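The `LLMRequest` note in the list above describes a copy-on-change pattern. A minimal sketch of that pattern follows, using a frozen dataclass as a hypothetical stand-in; `FakeLLMRequest`, its fields, and `redact_middleware` are assumptions for illustration, and the real `LLMRequest` type and middleware signature may differ.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeLLMRequest:
    """Hypothetical stand-in; the real LLMRequest fields may differ."""

    model: str
    prompt: str


def redact_middleware(request: FakeLLMRequest) -> FakeLLMRequest:
    # Build a new request instead of mutating the one passed in.
    cleaned = request.prompt.replace("secret", "[REDACTED]")
    return replace(request, prompt=cleaned)


original = FakeLLMRequest(model="example-model", prompt="the secret plan")
updated = redact_middleware(original)
```

Because the original object is never modified, other middleware and any queued event snapshots that hold a reference to it continue to see the request as it was first constructed.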

### Fixed in NeMo Relay 0.3
### Fixed in NeMo Relay 0.4

- Managed LLM start events are emitted before execution intercepts.
- Coding-agent trace scopes are aligned with NeMo Relay agent scope semantics.
- ATIF tool observations are correlated with their matching tool calls.
- OpenClaw tool call replay visibility is preserved.
- ATIF shutdown no longer deadlocks queued subscribers.
- Sanitized LLM requests are resolved from annotations for observability output.
- Structured ATIF tool results and Hermes tool-result observations are
preserved more reliably.
- Hermes routed-provider spans, wrapped ATIF fidelity, subagent lineage, and
error-path export consistency are covered and corrected.
- OpenClaw observability output is more consistent for nested subagents, model
timing diagnostics, hook-backed provenance, placeholder replay, and
hook-only fallback exports.
- LangChain serialization handles wrapped integration payloads more reliably.
- Plugin context-manager teardown avoids the previous deadlock path.
- Node.js `withScope` callbacks receive a real `ScopeHandle`.
- Deep Agents model responses are annotated for downstream observability.

### Fixed in Earlier Releases

- Managed LLM start events are emitted before execution intercepts.
- Coding-agent trace scopes are aligned with NeMo Relay agent scope semantics.
- ATIF tool observations are correlated with their matching tool calls.
- OpenClaw tool call replay visibility is preserved.
- Enabled TLS support for OTLP HTTP export.
- Preserved Go scope stacks across OS threads.