feat(pandas): include= and long-format sidecar dataframes by termoshtt · Pull Request #846 · Jij-Inc/ommx

termoshtt · 2026-04-28T05:30:41Z

Summary

Wave 1.5 of the v3 alpha pandas surface, scoped to include= filtering plus long-format / id-indexed sidecar DataFrames. The kind=Literal[...] consolidation of today's per-kind *_df methods stays out of scope for this PR.

include= parameter on every wide *_df accessor across Instance / ParametricInstance / Solution / SampleSet. Default ("metadata", "parameters") matches the v2-equivalent shape; include=[] drops both, include=["metadata"] / include=["parameters"] keep just the named family. Unknown values raise ValueError. Implementation post-filters per-row dicts in entries_to_dataframe based on a IncludeFlags derived from the argument — the ToPandasEntry impls themselves are unchanged.
Property → method conversion for every *_df accessor. v3-alpha breaking change: instance.constraints_df → instance.constraints_df(). The existing 11 in-tree call sites, the one doctest in solution.rs, and the Sphinx book (en + ja) are all updated. removed_reasons_df / *_removed_reasons_df also lose the property syntax for API consistency, even though they don't take include= (no metadata / parameters columns to filter).
Long-format / id-indexed sidecar DataFrames on Instance / ParametricInstance / Solution / SampleSet, reading directly from the SoA metadata stores:
- constraint_metadata_df(kind=...) — id-indexed wide; columns name / subscripts / description, index = {kind}_constraint_id.
- constraint_parameters_df(kind=...) — long format ({kind}_constraint_id, key, value). One row per (id, parameter_key) pair.
- constraint_provenance_df(kind=...) — long format ({kind}_constraint_id, step, source_kind, source_id). One row per (id, step) in the provenance chain.
- constraint_removed_reasons_df(kind=...) — long format ({kind}_constraint_id, reason, key, value). One row per (id, parameter_key) pair, plus one row with key / value = NA for ids whose reason has no parameters.
- variable_metadata_df() — id-indexed wide; index = variable_id.
- variable_parameters_df() — long format (variable_id, key, value).
kind accepts "regular" / "indicator" / "one_hot" / "sos1" (default "regular"); unknown values raise ValueError. Index column names are kind-qualified (regular_constraint_id ≠ indicator_constraint_id) so accidental cross-id-space joins surface in df.head() / df.info(). Long-format *_parameters_df / *_removed_reasons_df rows are sorted by (id, key), and empty long-format DataFrames keep their column schema instead of returning a column-less frame.
Macro refactor of the per-host kind dispatch. Each of the four hosts gets a constraint_kind_collection! macro that collapses the 16-arm ConstraintKind match (4 sidecars × 4 kinds) into 4 macro invocations. Net −354 lines across instance.rs / parametric_instance.rs / solution.rs / sample_set.rs.
set_parameter_columns sorts keys before emitting parameters.{key} columns, so wide-*_df parameter column order matches the long-format sort and stays process-stable even when the upstream Constraint.set_parameters is given a std::HashMap (whose iteration is randomized per process). The helper is generic over the hasher so it accepts both FnvHashMap (SoA stores) and std::HashMap (v1::Parameter).
Codex review pass caught three correctness bugs that are now fixed:
- Solution / SampleSet sidecar accessors no longer double-count removed constraints (EvaluatedCollection.inner() already includes them — chaining .removed_reasons().keys() produced duplicate rows).
- WithMetadata<EvaluatedDecisionVariable> / WithMetadata<WithSampleIds<SampledDecisionVariable>> now emit parameters.{key} columns, so Solution.decision_variables_df(include=["parameters"]) and the SampleSet equivalent return the right data.
- WithSampleIds<SampledNamedFunction> emits parameters.{key} columns instead of a single bare parameters string column, so apply_include_filter honours include=[] here too.
Book updates (docs/en/**/*.md + docs/ja/**/*.md): tutorial / user_guide / release-note pages flip to the method form, {attr} cross-references for *_df switch to {meth}, and a new "Unreleased" entry on release_note/ommx-3.0.md covers the property → method conversion, include=, and the six new sidecar accessors. 2.x release notes are added to nb_execution_excludepatterns (same treatment as 1.x — historical code samples reference removed APIs).
Proposal doc (METADATA_STORAGE_V3.md): reflects the narrow per-collection metadata accessors landed in feat(metadata-soa): move per-element metadata onto collection-level SoA stores #843, the explicit "no *_collection_mut()" policy, the actual insert_with(id, c, metadata) -> () shape, and now also the Wave 1.5 landing — include= and the sidecar accessors are marked landed (PR feat(pandas): include= and long-format sidecar dataframes #846), with the deferred kind=Literal[...] consolidation, Series accessors, and Attached wrappers spelled out as the remaining Wave 2 work.

Out of scope

constraints_df(kind=Literal["regular", "indicator", "one_hot", "sos1"]) consolidating today's per-kind *_constraints_df family — separate PR. This PR keeps the per-kind methods as-is.
pandas.Series[ID -> Object] collection accessors and the Standalone / Attached two-mode wrappers — Wave 2 of METADATA_STORAGE_V3.md.
NamedFunction SoA migration.

Test plan

task python:test passes locally with the full PR.
task python:book:test rebuilds en + ja docs cleanly with all code-cells executing.
Snapshot-based python/ommx-tests/tests/test_dataframe_include.py — 9 cases covering default vs. empty vs. metadata-only vs. parameters-only on Instance.decision_variables_df, Instance.constraints_df, Solution.decision_variables_df, SampleSet.decision_variables_df, plus an unknown-flag ValueError check. Snapshots written to __snapshots__/test_dataframe_include.ambr.
Snapshot-based python/ommx-tests/tests/test_dataframe_sidecars.py — 16 cases covering all 6 sidecar accessors (id-indexed shape, long-format shape, kind= dispatch on every constraint family, qualified index names, empty-provenance baseline, provenance after convert_one_hot_to_constraint, removed-reasons after relax_constraint, removed-reasons with parameters after one-hot conversion, Solution / SampleSet parity with Instance) plus 4 regression tests for the Codex fixes (Solution / SampleSet sidecar dedup with a relaxed constraint, Solution / SampleSet decision_variables_df(include=["parameters"]) actually emits the columns). Snapshots in __snapshots__/test_dataframe_sidecars.ambr.
5 consecutive pytest runs all pass — confirms determinism after the long-format (id, key) sort and the alphabetised wide parameters.{key} order.
Two Codex full-PR review passes; the second came back with one MUST-FIX (a JA / EN inconsistency in instance.md already in place before this PR but surfaced by the docs review) and two non-blocking notes that are scoped out (pd.DataFrame(columns=[...]) produces object dtypes for the empty-frame case; historical 2.x release-note prose references *_df without parens — kept as name reference, not invocation).

🤖 Generated with Claude Code

Reflect the narrow per-collection metadata accessors added on Instance / ParametricInstance during PR #843 review (`*_metadata()` / `*_metadata_mut()`), the explicit policy of not exposing `*_collection_mut()`, and the actual `insert_with(id, c, metadata) -> ()` shape on `ConstraintCollection<T>`. Drop the stale "Parse helpers exposed" claim — those `*_to_v1` helpers are crate-internal and the element-level `from_bytes`/`to_bytes` rationale was removed in #845. Add a Follow-ups section listing the still-`pub` `active_mut`/`removed_mut`/`insert_with`, the deferred `NamedFunction` SoA migration, and the Python Series / `include=` / sidecar wave 2 work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a Sequence[str] include= parameter to the wide DataFrame methods on Instance / ParametricInstance / Solution / SampleSet. The default ("metadata", "parameters") preserves the v2-equivalent column shape; include=[] yields core columns only; include=["metadata"] / include=["parameters"] keep just the named family. Unknown values raise ValueError. Implementation: each per-row dict is post-filtered in entries_to_dataframe based on the resolved IncludeFlags. The ToPandasEntry impls are unchanged — they still emit every column unconditionally and the helper drops the gated ones before the DataFrame is built. Mechanical breaking change: every `*_df` accessor turns from #[getter] (property) into a regular method, so call sites must add parentheses. Existing tests + the one doctest in solution.rs are updated. removed_reasons_df / *_removed_reasons_df also lose their property syntax for API consistency, even though they don't take include= (no metadata/parameters columns to filter). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Six new accessors on Instance / ParametricInstance / Solution / SampleSet that read directly from the SoA metadata stores: - constraint_metadata_df(kind=...) — id-indexed wide; columns name / subscripts / description, index = `{kind}_constraint_id`. - constraint_parameters_df(kind=...) — long format `{kind}_constraint_id, key, value`. One row per (id, parameter_key) pair; ids without parameters contribute zero rows. - constraint_provenance_df(kind=...) — long format `{kind}_constraint_id, step, source_kind, source_id`. One row per (id, step) in the provenance chain. Empty for directly-authored constraints. - constraint_removed_reasons_df(kind=...) — long format `{kind}_constraint_id, reason, key, value`. One row per (id, parameter_key) pair, plus one row with key/value=NA for ids whose reason has no parameters. - variable_metadata_df() — id-indexed wide for decision variables; index = `variable_id`. - variable_parameters_df() — long format `variable_id, key, value`. `kind` accepts "regular" / "indicator" / "one_hot" / "sos1" (default "regular"); unknown values raise ValueError. Index column names are kind-qualified (`regular_constraint_id` ≠ `indicator_constraint_id`) so accidental cross-id-space joins are visible in df.head() etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Update the proposal doc to mark `include=` and the long-format sidecar dataframes as landed, leaving only the Series accessors, the two-mode Attached wrappers, and the `kind=Literal[...]` consolidation as deferred to Wave 2. Specifically: - Status line: split out Wave 1.5 from the deferred bucket. - Python target shape: replace "*_df still v2-style; include=/sidecars deferred" with the actual landed shape (`*_df()` is a method, `include=Sequence[str]`, six sidecar accessors). - Resolved decisions: clarify that `kind=` is `str` today (not `Literal[...]`), and that `include=("removed_reasons",)` is not part of the landed include= surface. - Avoiding cross-ID-space joins: distinguish the qualified index names that already ship on sidecars from the wide-`*_df` rename bundled with the deferred consolidation. - ToPandasEntry restructuring: add the post-filter / sidecar-builder paths landed in PR #846. - Breaking changes: new "Landed in PR #846" subsection covering the property → method conversion plus include= and sidecar shapes; drop the include=/sidecar bullets from "Deferred to follow-up wave". - Follow-ups: add the `kind=Literal[...]` + `@overload` consolidation and the `include=("removed_reasons",)` pivot as explicit deferred items. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace explicit per-cell assertions on `*_df()` output with snapshot-based checks. The `.ambr` files are the authoritative description of each method's column / index schema under each configuration; update via `pytest --snapshot-update` after a deliberate API change. Behavioral assertions stay explicit where snapshotting doesn't help: - `test_unknown_include_flag_raises_value_error` / `test_unknown_kind_raises_value_error` (ValueError checks). - `test_solution_constraint_metadata_df_matches_instance` / `test_sample_set_variable_metadata_df_matches_instance` (parity via `pd.testing.assert_frame_equal`). Also sort long-format `*_parameters_df` and `*_removed_reasons_df` rows by `(id, key)` in the Rust builder. The upstream `Constraint.set_parameters` accepts `std::HashMap<String, String>`, whose iteration order is randomized per process — without an explicit sort the snapshot would flap between runs. Sorted output is also more useful for human readers of the DataFrame. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1. Solution / SampleSet sidecar accessors no longer double-count removed constraints. EvaluatedCollection.inner() and SampledCollection.inner() already include removed ids alongside active ones; chaining `.removed_reasons().keys()` was producing duplicate rows in `constraint_metadata_df` / `constraint_parameters_df` / `constraint_provenance_df` for every relaxed constraint. Drop the redundant chain — 24 sites across solution.rs / sample_set.rs. 2. `WithMetadata<EvaluatedDecisionVariable>` and `WithMetadata<WithSampleIds<SampledDecisionVariable>>` now emit `parameters.{key}` columns. Pre-existing gap that this PR surfaced via `include=["parameters"]`: the unevaluated `DecisionVariable` impl already called `set_parameter_columns`, but the evaluated / sampled variants silently dropped the parameter map. 3. `WithSampleIds<SampledNamedFunction>` now emits `parameters.{key}` columns instead of a single bare `parameters` column with `"k=v"`-formatted strings. Brings the shape in line with `NamedFunction` / `EvaluatedNamedFunction`, and lets `apply_include_filter` honour `include=[]` for this accessor. Add four regression tests in test_dataframe_sidecars.py: - Solution / SampleSet sidecar dedup with a relaxed constraint. - Solution / SampleSet decision_variables_df(include=["parameters"]) actually emits the parameter columns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Four small follow-ups surfaced by the self-review pass. 1. **`constraint_kind_collection!` macro** in instance.rs / parametric_instance.rs / solution.rs / sample_set.rs collapses the 16-arm `ConstraintKind` match (4 sidecars × 4 kinds) per host into 4 macro invocations. Net -354 lines across the four files; the body passed to the macro is the only thing that varies between sidecars, and `coll` is rebound to the right collection per kind. 2. **`set_parameter_columns` sorts keys** before emitting the `parameters.{key}` columns. Wide `*_df` parameter columns now match the lexicographic order used by the long-format `*_parameters_df` builders, and become process-stable even when the upstream `Constraint.set_parameters` is given a `std::HashMap` (whose iteration is randomized per process). The helper also turns generic over the hasher so it accepts both `FnvHashMap` (SoA stores) and `std::HashMap` (`v1::Parameter`). 3. **Empty long-format dfs keep the schema.** `long_format_dataframe` now takes an explicit `columns: &[&str]` and passes it to `pd.DataFrame(columns=...)` when `entries` is empty, so empty results are usable in `pd.concat` and column-checking code instead of returning a column-less DataFrame. 4. **`v1::Parameter` ToPandasEntry** uses `set_parameter_columns` instead of an inline `format!("parameters.{key}")` loop, picking up the sort and matching the other 5 sites. Snapshot updates: 3 frames pick up the new lex order on `parameters.role` / `parameters.shard` and the explicit empty schema on `constraint_provenance_df`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Tutorial / user_guide / release-note pages updated for the v3.0.0a3 DataFrame API: - All `*_df` accesses in MyST code-cells now call the method form (`solution.constraints_df()`), matching the property→method conversion that landed in PR #846. - `{attr}` cross-references for `*_df` accessors switch to `{meth}` in tables and prose: `capability_model.md`, `special_constraints.md`, `release_note/ommx-3.0.md` (en + ja). Non-`_df` properties (`removed_one_hot_constraints`, etc.) keep `{attr}`. - "property" → "method" wording fixes in `tutorial/solve_with_ommx_adapter.md` (en + ja) for the prose that describes what `decision_variables_df` / `constraints_df` return. - New "Unreleased" entry on `release_note/ommx-3.0.md` (en + ja) documenting the property→method conversion, the `include=` parameter on the wide `*_df` methods, and the six new long-format / id-indexed sidecar accessors with their `kind=` dispatch. `task python:book:test` builds cleanly with all code-cells executing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The JA instance guide showed `instance.decision_variables` (the list-returning property) where the surrounding prose promised a `pandas.DataFrame`, while the EN counterpart was already correct as `instance.decision_variables_df()`. Pre-existing inconsistency surfaced by the Codex re-review of PR #846. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Same treatment as 1.x — the 2.x release notes pin code samples to API shapes that no longer compile under v3 (e.g. property-style `*_df` accessors, the bytes-on-element-types methods removed in #845). Build the pages but don't try to execute them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Updates the Python (PyO3) pandas-facing surface to support include= filtering on wide *_df accessors, converts *_df properties into methods, and adds long-format / id-indexed “sidecar” DataFrame accessors backed directly by the SoA metadata stores.

Changes:

Add include= (metadata/parameters gating) across wide *_df methods and implement filtering in entries_to_dataframe.
Convert all *_df accessors from #[getter] properties to methods and update in-tree call sites/docs/tests.
Add sidecar DataFrame accessors for constraint/variable metadata, parameters, provenance, and removed reasons (with kind= dispatch for constraint families).

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
python/ommx/src/pandas.rs	Adds `IncludeFlags`, sidecar DataFrame builders, deterministic parameter-column ordering, and `entries_to_dataframe` include-filtering.
python/ommx/src/instance.rs	Converts wide `*_df` accessors to methods with `include=`, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/parametric_instance.rs	Converts `*_df` accessors to methods with `include=`, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/solution.rs	Converts `*_df` accessors to methods with `include=`, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/sample_set.rs	Converts `*_df` accessors to methods with `include=`, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/ommx/_ommx_rust/init.pyi	Updates stubs from properties to methods, adds `include=` and sidecar accessor signatures/docs.
python/ommx-tests/tests/test_dataframe_include.py	Adds tests covering `include=` behavior and validation.
python/ommx-tests/tests/test_dataframe_sidecars.py	Adds tests for all sidecar accessors, kind dispatch, schema stability, and regressions.
python/ommx-tests/tests/snapshots/test_dataframe_include.ambr	Snapshot baselines for `include=` shapes.
python/ommx-tests/tests/snapshots/test_dataframe_sidecars.ambr	Snapshot baselines for sidecar DataFrame shapes.
python/ommx-tests/tests/test_one_hot_sos1_constraints.py	Updates call sites from property access to method calls for `*_df`.
python/ommx-tests/tests/test_indicator_constraint.py	Updates call sites from property access to method calls for `*_df`.
docs/conf_base.py	Expands notebook execution exclude patterns for release notes.
docs/ja/user_guide/instance.md	Updates examples to call `*_df()` methods.
docs/ja/user_guide/parametric_instance.md	Updates examples to call `*_df()` methods.
docs/ja/user_guide/solution.md	Updates examples to call `*_df()` methods.
docs/ja/user_guide/sample_set.md	Updates examples to call `*_df()` methods.
docs/ja/user_guide/special_constraints.md	Updates refs/examples from `{attr}`/property style to `{meth}`/method style.
docs/ja/user_guide/capability_model.md	Updates refs from `{attr}` to `{meth}` for removed-constraints DataFrame accessors.
docs/ja/tutorial/tsp_sampling_with_openjij_adapter.md	Updates examples to call `*_df()` methods.
docs/ja/tutorial/switching_adapters.md	Updates examples to call `*_df()` methods.
docs/ja/tutorial/solve_with_ommx_adapter.md	Updates narrative/examples to call `*_df()` methods.
docs/ja/tutorial/share_in_ommx_artifact.md	Updates examples to call `*_df()` methods.
docs/ja/release_note/ommx-3.0.md	Documents breaking change (methods), `include=`, and new sidecar DataFrames.
docs/en/user_guide/instance.md	Updates examples to call `*_df()` methods.
docs/en/user_guide/parametric_instance.md	Updates examples to call `*_df()` methods.
docs/en/user_guide/solution.md	Updates examples to call `*_df()` methods.
docs/en/user_guide/sample_set.md	Updates examples to call `*_df()` methods.
docs/en/user_guide/special_constraints.md	Updates refs/examples from `{attr}`/property style to `{meth}`/method style.
docs/en/user_guide/capability_model.md	Updates refs from `{attr}` to `{meth}` for removed-constraints DataFrame accessors.
docs/en/tutorial/tsp_sampling_with_openjij_adapter.md	Updates examples to call `*_df()` methods.
docs/en/tutorial/switching_adapters.md	Updates examples to call `*_df()` methods.
docs/en/tutorial/solve_with_ommx_adapter.md	Updates narrative/examples to call `*_df()` methods.
docs/en/tutorial/share_in_ommx_artifact.md	Updates examples to call `*_df()` methods.
docs/en/release_note/ommx-3.0.md	Documents breaking change (methods), `include=`, and new sidecar DataFrames.
METADATA_STORAGE_V3.md	Updates proposal/status text to reflect landed `include=` + sidecar DataFrames and follow-ups.

- Centralize `constraint_kind_collection!` macro in `crate::pandas`. Each host (Instance / ParametricInstance / Solution / SampleSet) now passes its own per-kind accessor names via the macro, so the kind dispatch shape lives in one place. - Fix `{attr}` cross-references that point to `*_df` accessors. Those accessors became methods, so the cross-references should use `{meth}` (8 sites in solution.rs / sample_set.rs). - Update `SampleSet.named_functions_df` docstring to document `parameters.{key}` columns (the implementation already emits them via `set_parameter_columns` — only the docstring claimed a single `parameters` column). Stub regenerate picks up all three changes in `_ommx_rust/__init__.pyi`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…th alpha+PR Rust release note (v3_0_0_alpha_1): - Lead now mentions per-collection metadata stores alongside the collection-level serialization theme. - Removed the stale `metadata: ConstraintMetadata` field from the example `Constraint<S>` struct (matches the actual code). - New "Metadata storage" section covering ConstraintMetadataStore / VariableMetadataStore / NamedFunctionMetadataStore, the per-host accessors, the `pub(crate)` mutator tightening, and the Python-side AttachedX / *_df consequences. Lists the contributing PRs (#843, #846-#850, #852, #853) inline. Python migration guide: - Sections that describe 3.0.0a3 changes get a `(3.0.0aN, [#PR](url))` tag in the heading: §6.6 (method, not property), §9 (DataFrame API), §10 (long-format sidecars), §11 (AttachedX). Single v2 → v3 framing preserved; per-section attribution makes "where did this land?" obvious without splitting the doc by alpha. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

BLOCKING (Rust migration guide): - The "mirrored on Solution / SampleSet" claim for the `*_constraint_metadata()` accessors is wrong — Solution / SampleSet do not flatten constraint metadata to host-level methods. They expose variable / named-function stores at the host (`solution.variable_metadata()`, `sample_set.variable_metadata()`, etc.) but constraint metadata is reached through the evaluated / sampled collection getter and then `.metadata()` on the collection (e.g. `solution.evaluated_constraints().metadata()`, `sample_set.indicator_constraints().metadata()`). Section split into Instance / ParametricInstance vs. Solution / SampleSet examples. NICE-TO-FIX: - §6.6 tag was missing the primary PR. The `parameters` (list) / `parameters_df` (DataFrame) split landed in #774 (3.0.0a1) when ParametricInstance became a Rust re-export; #846 (3.0.0a3) only flipped `_df` from `#[getter]` to method. Tag now lists both alphas with the corresponding PRs and the prose disambiguates. - `VariableMetadataStore` / `NamedFunctionMetadataStore` shape note: not just "provenance omitted". `NamedFunctionMetadataStore` also lacks the `push_subscript` / `extend_subscripts` append helpers (callers use `set_subscripts(id, new_vec)` to extend). Note revised. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

BLOCKING: - Per-element to_bytes / from_bytes paragraph: rewritten. The previous text claimed Constraint / EvaluatedConstraint / SampledConstraint "still expose to_bytes/from_bytes that take an explicit ConstraintID" — these methods do not exist (rg "fn (to_bytes|from_bytes)" in rust/ommx/src returns only the host types and Function). The paragraph now says per-element to_bytes/from_bytes are no longer provided on any constraint kind, motivated by id + metadata living on the host. - "Unified error surface" paragraph: softened from "the public API now returns a single error type" to "the default error type is ommx::Result<T>". The signal types (BoundError, DecisionVariableError, DuplicatedSampleIDError, SubstitutionError, ...) are surfaced typed at specific public APIs (Bound::new, DecisionVariable::new / set_bound / substitute, Sampled::append, the Substitute trait family, ...) — they are not internal-only signals reachable solely via downcast. The list now spells out where each is returned typed. NICE-TO-FIX: - "PropagateOutcome records a Provenance chain" was wrong. The enum has Active(T) / Consumed(T) / Transformed { original, new } — no Provenance field. Provenance is appended by callers writing into ConstraintMetadataStore. Bullet rewritten to describe the actual variants and the caller-side provenance step. - Metadata-storage section over-attributed Python-only PRs as Rust evidence. Rust-side bundle trimmed to (#843, #848, #850, #853); #846, #847, #849, #852 moved into per-bullet citations on the Python-side paragraph. - Tracing observability paragraph softened: it now acknowledges that legacy `.context(...)` call sites in narrow-domain parsers and the artifact layer remain, while new internal fail sites use the crate fail-site macros. Other minor: - "Serialization moves to the host level" wording (was "collection level") and ParametricInstance bullet mentions the sibling VariableMetadataStore / NamedFunctionMetadataStore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three NICE-TO-FIX items from Codex's fourth review pass: - Python paragraph in §Metadata storage: split the single PR-list paragraph into two bullets so AttachedX (#849, #850, #852) and *_df (#846, #847) attributions stand on their own without the reader needing to cross-check the Python guide. - Capability model wording (§3.0 overview bullet + §Capability model body): "any OMMX Instance can be fed to any adapter" overstated reduce_capabilities' infallibility. It returns Result, e.g. the Big-M encoding requires finite bounds. Both the bullet and the body sentence now say "valid OMMX Instance" and call out the fallible path with Err(ommx::Error). - "Full signal-type list" -> "curated signal-type list" in the cross-reference to the error handling tutorial, matching the tutorial's own framing (it already uses "curated set"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

## Summary Documentation hygiene pass for v3: 1. **Drop `METADATA_STORAGE_V3.md`** — design doc whose waves all landed. 2. **Consolidate Rust release note** into one file (`rust/ommx/doc/release_note.md`) covering the 3.0 line. Drop the per-alpha submodule (`crate::doc::release_note::v3_0_0_alpha_1` removed; `crate::doc::release_note` stays). 3. **Reframe the release note as topic-oriented "what & why"** — the API enumerations live in the Rust migration guide and the error handling tutorial; the release note links to them. Net 387 → 288 lines. 4. **Fix stale `metadata: ConstraintMetadata` references** in the Rust migration guide (those pre-dated the SoA refactor). 5. **Tag every section** of the Python migration guide with `(3.0.0aN, [#PR](url))` annotations and add three new sections covering 3.0.0a3 changes (DataFrame API, sidecar dfs, `AttachedX`). Rust and Python SDKs are versioned independently, so Rust release-note PR refs are bare `[#XXX](url)` and Python-guide tags use Python alpha labels (`3.0.0a1` / `3.0.0a2` / `3.0.0a3`). ## Rust release note (`rust/ommx/doc/release_note.md`) Topic-oriented v3 summary. Detail-heavy parts now defer to the Rust migration guide / tutorial: - `## 3.0` overview bullets — first-class special constraints, Stage / `ConstraintType` collapse, host-level serialization, per-collection SoA metadata stores, capability model, default `ommx::Result<T>` error surface, `v1_ext` removal. - `## Stage parameter and the ConstraintType trait` — motivation only; struct / trait code blocks moved to migration guide §1 and §New Types. - `## Collections and serialization` — `ConstraintCollection<T>` / `EvaluatedCollection<T>` / `SampledCollection<T>` rationale + host-level `to_bytes` / `from_bytes`. Migration guide owns the method lists. - `## Metadata storage: SoA store on the enclosing collection` — three store families (`ConstraintMetadataStore<ID>`, `VariableMetadataStore`, `NamedFunctionMetadataStore`), the `pub(crate)` mutator tightening, and a Python paragraph linking out to `PYTHON_SDK_MIGRATION_GUIDE.md` §9–11 with PR attributions for `AttachedX` (#849, #850, #852) and `*_df` (#846, #847). Rust-side bundle is (#843, #848, #850, #853). - `## Capability model` — `AdditionalCapability` / `Capabilities` / `required_capabilities()` / `reduce_capabilities(&supported)`. - `## Unified error surface` — default `ommx::Result<T>`, fail-site macros (`bail!` / `error!` / `ensure!`), narrow-domain parser exceptions, signal-type representative list. Full catalog deferred to the error handling tutorial. - `## Tracing-first observability` — acknowledges legacy `.context(...)` call sites still exist in `qplib/parser` and `artifact`. - `## Domain types replace v1_ext`, `## Other notable changes` — unchanged. ## Rust migration guide (`rust/ommx/doc/migration_guide.md`) - Remove stale `metadata: ConstraintMetadata` field references on `Constraint<S>` and the struct-construction examples. - New `### Metadata stores` section: per-host accessors on `Instance` / `ParametricInstance` (`constraint_metadata()` / `_mut()`, indicator / one-hot / sos1 variants, `variable_metadata()`, `named_function_metadata()`); how `Solution` / `SampleSet` reach constraint metadata via `evaluated_constraints().metadata()` etc.; the `ConstraintMetadataStore<ID>` API (per-field reads, `collect_for(id)`, write-through setters, bulk `insert` / `remove`); plus the `VariableMetadataStore` / `NamedFunctionMetadataStore` shape differences (no provenance; `NamedFunctionMetadataStore` also lacks `push_subscript` / `extend_subscripts`). - Add `metadata` field to `ConstraintCollection` / `EvaluatedCollection` / `SampledCollection` shape blocks. - Migration-checklist entry for `constraint.name` updated to point at the SoA accessors. ## Python migration guide (`PYTHON_SDK_MIGRATION_GUIDE.md`) - §6.6 wording: every `*_df` is a method. - New §9 — DataFrame API: methods, `kind=` / `include=` / `removed=` consolidating the per-kind / active-vs-removed family. `include=` defaults to `("metadata", "parameters")`; `"removed_reason"` opt-in (auto-enabled with `removed=True` on Instance / ParametricInstance). - New §10 — Long-format sidecar DataFrames (`constraint_metadata_df`, `constraint_provenance_df`, `variable_parameters_df`, …). - New §11 — `instance.constraints[id]` etc. now return `AttachedX` write-through handles. Three-column affected-types table (v2.5.1 / 3.0.0a2 / v3 final) so each hop is visible. `add_*` entry points and `attached.detach()`. - Convenience additions renumbered §9 → §12. - Per-section `(3.0.0aN, [#PR](url))` tag on every section in §1–§11; Overview themes (5, 6) and migration checklist extended. ## Test plan - [ ] `cargo doc -p ommx --no-deps` renders the release note and migration guide cleanly (verified locally). - [ ] Spot-check rendered `ommx::doc::release_note` and `ommx::doc::migration_guide` on the local `target/doc` build for cross-reference resolution and section ordering. - [ ] Spot-check rendered Python migration guide on GitHub: section numbering 1–12, all alpha+PR tags resolve. - [ ] No other doc / source file references `METADATA_STORAGE_V3.md` or `release_note::v3_0_0_alpha_1` (verified locally). 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

termoshtt and others added 10 commits April 28, 2026 14:28

termoshtt marked this pull request as ready for review April 28, 2026 08:36

Copilot AI review requested due to automatic review settings April 28, 2026 08:36

Copilot started reviewing on behalf of termoshtt April 28, 2026 08:37 View session

Copilot AI reviewed Apr 28, 2026

View reviewed changes

termoshtt merged commit 227ff31 into main Apr 28, 2026
30 checks passed

termoshtt deleted the feat/dataframe-include-and-sidecars branch April 28, 2026 09:02

termoshtt added the python Changes in Python SDK label Apr 28, 2026

termoshtt mentioned this pull request May 1, 2026

docs: fold METADATA_STORAGE_V3.md into the migration guides #854

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(pandas): include= and long-format sidecar dataframes#846

feat(pandas): include= and long-format sidecar dataframes#846
termoshtt merged 11 commits intomainfrom
feat/dataframe-include-and-sidecars

termoshtt commented Apr 28, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

termoshtt commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Out of scope

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

termoshtt commented Apr 28, 2026 •

edited

Loading