feat(pandas): include= and long-format sidecar dataframes#846
Merged
feat(pandas): include= and long-format sidecar dataframes#846
Conversation
Reflect the narrow per-collection metadata accessors added on Instance / ParametricInstance during PR #843 review (`*_metadata()` / `*_metadata_mut()`), the explicit policy of not exposing `*_collection_mut()`, and the actual `insert_with(id, c, metadata) -> ()` shape on `ConstraintCollection<T>`. Drop the stale "Parse helpers exposed" claim — those `*_to_v1` helpers are crate-internal and the element-level `from_bytes`/`to_bytes` rationale was removed in #845. Add a Follow-ups section listing the still-`pub` `active_mut`/`removed_mut`/`insert_with`, the deferred `NamedFunction` SoA migration, and the Python Series / `include=` / sidecar wave 2 work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a Sequence[str] include= parameter to the wide DataFrame methods on
Instance / ParametricInstance / Solution / SampleSet. The default
("metadata", "parameters") preserves the v2-equivalent column shape;
include=[] yields core columns only; include=["metadata"] /
include=["parameters"] keep just the named family. Unknown values raise
ValueError.
Implementation: each per-row dict is post-filtered in
entries_to_dataframe based on the resolved IncludeFlags. The
ToPandasEntry impls are unchanged — they still emit every column
unconditionally and the helper drops the gated ones before the
DataFrame is built.
Mechanical breaking change: every `*_df` accessor turns from
#[getter] (property) into a regular method, so call sites must add
parentheses. Existing tests + the one doctest in solution.rs are
updated. removed_reasons_df / *_removed_reasons_df also lose their
property syntax for API consistency, even though they don't take
include= (no metadata/parameters columns to filter).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six new accessors on Instance / ParametricInstance / Solution / SampleSet
that read directly from the SoA metadata stores:
- constraint_metadata_df(kind=...) — id-indexed wide; columns name /
subscripts / description, index = `{kind}_constraint_id`.
- constraint_parameters_df(kind=...) — long format `{kind}_constraint_id,
key, value`. One row per (id, parameter_key) pair; ids without
parameters contribute zero rows.
- constraint_provenance_df(kind=...) — long format
`{kind}_constraint_id, step, source_kind, source_id`. One row per
(id, step) in the provenance chain. Empty for directly-authored
constraints.
- constraint_removed_reasons_df(kind=...) — long format
`{kind}_constraint_id, reason, key, value`. One row per
(id, parameter_key) pair, plus one row with key/value=NA for ids
whose reason has no parameters.
- variable_metadata_df() — id-indexed wide for decision variables;
index = `variable_id`.
- variable_parameters_df() — long format `variable_id, key, value`.
`kind` accepts "regular" / "indicator" / "one_hot" / "sos1" (default
"regular"); unknown values raise ValueError. Index column names are
kind-qualified (`regular_constraint_id` ≠ `indicator_constraint_id`)
so accidental cross-id-space joins are visible in df.head() etc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the proposal doc to mark `include=` and the long-format sidecar
dataframes as landed, leaving only the Series accessors, the two-mode
Attached wrappers, and the `kind=Literal[...]` consolidation as
deferred to Wave 2.
Specifically:
- Status line: split out Wave 1.5 from the deferred bucket.
- Python target shape: replace "*_df still v2-style; include=/sidecars
deferred" with the actual landed shape (`*_df()` is a method,
`include=Sequence[str]`, six sidecar accessors).
- Resolved decisions: clarify that `kind=` is `str` today (not
`Literal[...]`), and that `include=("removed_reasons",)` is not
part of the landed include= surface.
- Avoiding cross-ID-space joins: distinguish the qualified index
names that already ship on sidecars from the wide-`*_df` rename
bundled with the deferred consolidation.
- ToPandasEntry restructuring: add the post-filter / sidecar-builder
paths landed in PR #846.
- Breaking changes: new "Landed in PR #846" subsection covering the
property → method conversion plus include= and sidecar shapes;
drop the include=/sidecar bullets from "Deferred to follow-up
wave".
- Follow-ups: add the `kind=Literal[...]` + `@overload`
consolidation and the `include=("removed_reasons",)` pivot as
explicit deferred items.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace explicit per-cell assertions on `*_df()` output with snapshot-based checks. The `.ambr` files are the authoritative description of each method's column / index schema under each configuration; update via `pytest --snapshot-update` after a deliberate API change. Behavioral assertions stay explicit where snapshotting doesn't help: - `test_unknown_include_flag_raises_value_error` / `test_unknown_kind_raises_value_error` (ValueError checks). - `test_solution_constraint_metadata_df_matches_instance` / `test_sample_set_variable_metadata_df_matches_instance` (parity via `pd.testing.assert_frame_equal`). Also sort long-format `*_parameters_df` and `*_removed_reasons_df` rows by `(id, key)` in the Rust builder. The upstream `Constraint.set_parameters` accepts `std::HashMap<String, String>`, whose iteration order is randomized per process — without an explicit sort the snapshot would flap between runs. Sorted output is also more useful for human readers of the DataFrame. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. Solution / SampleSet sidecar accessors no longer double-count
removed constraints. EvaluatedCollection.inner() and
SampledCollection.inner() already include removed ids alongside
active ones; chaining `.removed_reasons().keys()` was producing
duplicate rows in `constraint_metadata_df` /
`constraint_parameters_df` / `constraint_provenance_df` for every
relaxed constraint. Drop the redundant chain — 24 sites across
solution.rs / sample_set.rs.
2. `WithMetadata<EvaluatedDecisionVariable>` and
`WithMetadata<WithSampleIds<SampledDecisionVariable>>` now emit
`parameters.{key}` columns. Pre-existing gap that this PR
surfaced via `include=["parameters"]`: the unevaluated
`DecisionVariable` impl already called `set_parameter_columns`,
but the evaluated / sampled variants silently dropped the
parameter map.
3. `WithSampleIds<SampledNamedFunction>` now emits
`parameters.{key}` columns instead of a single bare
`parameters` column with `"k=v"`-formatted strings. Brings the
shape in line with `NamedFunction` / `EvaluatedNamedFunction`,
and lets `apply_include_filter` honour `include=[]` for this
accessor.
Add four regression tests in test_dataframe_sidecars.py:
- Solution / SampleSet sidecar dedup with a relaxed constraint.
- Solution / SampleSet decision_variables_df(include=["parameters"])
actually emits the parameter columns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small follow-ups surfaced by the self-review pass.
1. **`constraint_kind_collection!` macro** in instance.rs /
parametric_instance.rs / solution.rs / sample_set.rs collapses the
16-arm `ConstraintKind` match (4 sidecars × 4 kinds) per host into 4
macro invocations. Net -354 lines across the four files; the body
passed to the macro is the only thing that varies between sidecars,
and `coll` is rebound to the right collection per kind.
2. **`set_parameter_columns` sorts keys** before emitting the
`parameters.{key}` columns. Wide `*_df` parameter columns now match
the lexicographic order used by the long-format `*_parameters_df`
builders, and become process-stable even when the upstream
`Constraint.set_parameters` is given a `std::HashMap` (whose
iteration is randomized per process). The helper also turns generic
over the hasher so it accepts both `FnvHashMap` (SoA stores) and
`std::HashMap` (`v1::Parameter`).
3. **Empty long-format dfs keep the schema.** `long_format_dataframe`
now takes an explicit `columns: &[&str]` and passes it to
`pd.DataFrame(columns=...)` when `entries` is empty, so empty
results are usable in `pd.concat` and column-checking code instead
of returning a column-less DataFrame.
4. **`v1::Parameter` ToPandasEntry** uses `set_parameter_columns`
instead of an inline `format!("parameters.{key}")` loop, picking up
the sort and matching the other 5 sites.
Snapshot updates: 3 frames pick up the new lex order on
`parameters.role` / `parameters.shard` and the explicit empty schema
on `constraint_provenance_df`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tutorial / user_guide / release-note pages updated for the v3.0.0a3 DataFrame API: - All `*_df` accesses in MyST code-cells now call the method form (`solution.constraints_df()`), matching the property→method conversion that landed in PR #846. - `{attr}` cross-references for `*_df` accessors switch to `{meth}` in tables and prose: `capability_model.md`, `special_constraints.md`, `release_note/ommx-3.0.md` (en + ja). Non-`_df` properties (`removed_one_hot_constraints`, etc.) keep `{attr}`. - "property" → "method" wording fixes in `tutorial/solve_with_ommx_adapter.md` (en + ja) for the prose that describes what `decision_variables_df` / `constraints_df` return. - New "Unreleased" entry on `release_note/ommx-3.0.md` (en + ja) documenting the property→method conversion, the `include=` parameter on the wide `*_df` methods, and the six new long-format / id-indexed sidecar accessors with their `kind=` dispatch. `task python:book:test` builds cleanly with all code-cells executing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The JA instance guide showed `instance.decision_variables` (the list-returning property) where the surrounding prose promised a `pandas.DataFrame`, while the EN counterpart was already correct as `instance.decision_variables_df()`. Pre-existing inconsistency surfaced by the Codex re-review of PR #846. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same treatment as 1.x — the 2.x release notes pin code samples to API shapes that no longer compile under v3 (e.g. property-style `*_df` accessors, the bytes-on-element-types methods removed in #845). Build the pages but don't try to execute them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Updates the Python (PyO3) pandas-facing surface to support include= filtering on wide *_df accessors, converts *_df properties into methods, and adds long-format / id-indexed “sidecar” DataFrame accessors backed directly by the SoA metadata stores.
Changes:
- Add
include=(metadata/parameters gating) across wide*_dfmethods and implement filtering inentries_to_dataframe. - Convert all
*_dfaccessors from#[getter]properties to methods and update in-tree call sites/docs/tests. - Add sidecar DataFrame accessors for constraint/variable metadata, parameters, provenance, and removed reasons (with
kind=dispatch for constraint families).
Reviewed changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| python/ommx/src/pandas.rs | Adds IncludeFlags, sidecar DataFrame builders, deterministic parameter-column ordering, and entries_to_dataframe include-filtering. |
| python/ommx/src/instance.rs | Converts wide *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors. |
| python/ommx/src/parametric_instance.rs | Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors. |
| python/ommx/src/solution.rs | Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors. |
| python/ommx/src/sample_set.rs | Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors. |
| python/ommx/ommx/_ommx_rust/init.pyi | Updates stubs from properties to methods, adds include= and sidecar accessor signatures/docs. |
| python/ommx-tests/tests/test_dataframe_include.py | Adds tests covering include= behavior and validation. |
| python/ommx-tests/tests/test_dataframe_sidecars.py | Adds tests for all sidecar accessors, kind dispatch, schema stability, and regressions. |
| python/ommx-tests/tests/snapshots/test_dataframe_include.ambr | Snapshot baselines for include= shapes. |
| python/ommx-tests/tests/snapshots/test_dataframe_sidecars.ambr | Snapshot baselines for sidecar DataFrame shapes. |
| python/ommx-tests/tests/test_one_hot_sos1_constraints.py | Updates call sites from property access to method calls for *_df. |
| python/ommx-tests/tests/test_indicator_constraint.py | Updates call sites from property access to method calls for *_df. |
| docs/conf_base.py | Expands notebook execution exclude patterns for release notes. |
| docs/ja/user_guide/instance.md | Updates examples to call *_df() methods. |
| docs/ja/user_guide/parametric_instance.md | Updates examples to call *_df() methods. |
| docs/ja/user_guide/solution.md | Updates examples to call *_df() methods. |
| docs/ja/user_guide/sample_set.md | Updates examples to call *_df() methods. |
| docs/ja/user_guide/special_constraints.md | Updates refs/examples from {attr}/property style to {meth}/method style. |
| docs/ja/user_guide/capability_model.md | Updates refs from {attr} to {meth} for removed-constraints DataFrame accessors. |
| docs/ja/tutorial/tsp_sampling_with_openjij_adapter.md | Updates examples to call *_df() methods. |
| docs/ja/tutorial/switching_adapters.md | Updates examples to call *_df() methods. |
| docs/ja/tutorial/solve_with_ommx_adapter.md | Updates narrative/examples to call *_df() methods. |
| docs/ja/tutorial/share_in_ommx_artifact.md | Updates examples to call *_df() methods. |
| docs/ja/release_note/ommx-3.0.md | Documents breaking change (methods), include=, and new sidecar DataFrames. |
| docs/en/user_guide/instance.md | Updates examples to call *_df() methods. |
| docs/en/user_guide/parametric_instance.md | Updates examples to call *_df() methods. |
| docs/en/user_guide/solution.md | Updates examples to call *_df() methods. |
| docs/en/user_guide/sample_set.md | Updates examples to call *_df() methods. |
| docs/en/user_guide/special_constraints.md | Updates refs/examples from {attr}/property style to {meth}/method style. |
| docs/en/user_guide/capability_model.md | Updates refs from {attr} to {meth} for removed-constraints DataFrame accessors. |
| docs/en/tutorial/tsp_sampling_with_openjij_adapter.md | Updates examples to call *_df() methods. |
| docs/en/tutorial/switching_adapters.md | Updates examples to call *_df() methods. |
| docs/en/tutorial/solve_with_ommx_adapter.md | Updates narrative/examples to call *_df() methods. |
| docs/en/tutorial/share_in_ommx_artifact.md | Updates examples to call *_df() methods. |
| docs/en/release_note/ommx-3.0.md | Documents breaking change (methods), include=, and new sidecar DataFrames. |
| METADATA_STORAGE_V3.md | Updates proposal/status text to reflect landed include= + sidecar DataFrames and follow-ups. |
- Centralize `constraint_kind_collection!` macro in `crate::pandas`. Each
host (Instance / ParametricInstance / Solution / SampleSet) now passes
its own per-kind accessor names via the macro, so the kind dispatch
shape lives in one place.
- Fix `{attr}` cross-references that point to `*_df` accessors. Those
accessors became methods, so the cross-references should use `{meth}`
(8 sites in solution.rs / sample_set.rs).
- Update `SampleSet.named_functions_df` docstring to document
`parameters.{key}` columns (the implementation already emits them via
`set_parameter_columns` — only the docstring claimed a single
`parameters` column).
Stub regenerate picks up all three changes in `_ommx_rust/__init__.pyi`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt
added a commit
that referenced
this pull request
May 1, 2026
…th alpha+PR Rust release note (v3_0_0_alpha_1): - Lead now mentions per-collection metadata stores alongside the collection-level serialization theme. - Removed the stale `metadata: ConstraintMetadata` field from the example `Constraint<S>` struct (matches the actual code). - New "Metadata storage" section covering ConstraintMetadataStore / VariableMetadataStore / NamedFunctionMetadataStore, the per-host accessors, the `pub(crate)` mutator tightening, and the Python-side AttachedX / *_df consequences. Lists the contributing PRs (#843, #846-#850, #852, #853) inline. Python migration guide: - Sections that describe 3.0.0a3 changes get a `(3.0.0aN, [#PR](url))` tag in the heading: §6.6 (method, not property), §9 (DataFrame API), §10 (long-format sidecars), §11 (AttachedX). Single v2 → v3 framing preserved; per-section attribution makes "where did this land?" obvious without splitting the doc by alpha. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
termoshtt
added a commit
that referenced
this pull request
May 1, 2026
BLOCKING (Rust migration guide): - The "mirrored on Solution / SampleSet" claim for the `*_constraint_metadata()` accessors is wrong — Solution / SampleSet do not flatten constraint metadata to host-level methods. They expose variable / named-function stores at the host (`solution.variable_metadata()`, `sample_set.variable_metadata()`, etc.) but constraint metadata is reached through the evaluated / sampled collection getter and then `.metadata()` on the collection (e.g. `solution.evaluated_constraints().metadata()`, `sample_set.indicator_constraints().metadata()`). Section split into Instance / ParametricInstance vs. Solution / SampleSet examples. NICE-TO-FIX: - §6.6 tag was missing the primary PR. The `parameters` (list) / `parameters_df` (DataFrame) split landed in #774 (3.0.0a1) when ParametricInstance became a Rust re-export; #846 (3.0.0a3) only flipped `_df` from `#[getter]` to method. Tag now lists both alphas with the corresponding PRs and the prose disambiguates. - `VariableMetadataStore` / `NamedFunctionMetadataStore` shape note: not just "provenance omitted". `NamedFunctionMetadataStore` also lacks the `push_subscript` / `extend_subscripts` append helpers (callers use `set_subscripts(id, new_vec)` to extend). Note revised. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt
added a commit
that referenced
this pull request
May 1, 2026
BLOCKING:
- Per-element to_bytes / from_bytes paragraph: rewritten. The previous
text claimed Constraint / EvaluatedConstraint / SampledConstraint
"still expose to_bytes/from_bytes that take an explicit ConstraintID"
— these methods do not exist (rg "fn (to_bytes|from_bytes)" in
rust/ommx/src returns only the host types and Function). The
paragraph now says per-element to_bytes/from_bytes are no longer
provided on any constraint kind, motivated by id + metadata living
on the host.
- "Unified error surface" paragraph: softened from "the public API now
returns a single error type" to "the default error type is
ommx::Result<T>". The signal types (BoundError,
DecisionVariableError, DuplicatedSampleIDError, SubstitutionError,
...) are surfaced typed at specific public APIs (Bound::new,
DecisionVariable::new / set_bound / substitute, Sampled::append, the
Substitute trait family, ...) — they are not internal-only signals
reachable solely via downcast. The list now spells out where each is
returned typed.
NICE-TO-FIX:
- "PropagateOutcome records a Provenance chain" was wrong. The enum
has Active(T) / Consumed(T) / Transformed { original, new } — no
Provenance field. Provenance is appended by callers writing into
ConstraintMetadataStore. Bullet rewritten to describe the actual
variants and the caller-side provenance step.
- Metadata-storage section over-attributed Python-only PRs as Rust
evidence. Rust-side bundle trimmed to (#843, #848, #850, #853);
#846, #847, #849, #852 moved into per-bullet citations on the
Python-side paragraph.
- Tracing observability paragraph softened: it now acknowledges that
legacy `.context(...)` call sites in narrow-domain parsers and the
artifact layer remain, while new internal fail sites use the crate
fail-site macros.
Other minor:
- "Serialization moves to the host level" wording (was "collection
level") and ParametricInstance bullet mentions the sibling
VariableMetadataStore / NamedFunctionMetadataStore.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt
added a commit
that referenced
this pull request
May 1, 2026
Three NICE-TO-FIX items from Codex's fourth review pass: - Python paragraph in §Metadata storage: split the single PR-list paragraph into two bullets so AttachedX (#849, #850, #852) and *_df (#846, #847) attributions stand on their own without the reader needing to cross-check the Python guide. - Capability model wording (§3.0 overview bullet + §Capability model body): "any OMMX Instance can be fed to any adapter" overstated reduce_capabilities' infallibility. It returns Result, e.g. the Big-M encoding requires finite bounds. Both the bullet and the body sentence now say "valid OMMX Instance" and call out the fallible path with Err(ommx::Error). - "Full signal-type list" -> "curated signal-type list" in the cross-reference to the error handling tutorial, matching the tutorial's own framing (it already uses "curated set"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt
added a commit
that referenced
this pull request
May 1, 2026
## Summary Documentation hygiene pass for v3: 1. **Drop `METADATA_STORAGE_V3.md`** — design doc whose waves all landed. 2. **Consolidate Rust release note** into one file (`rust/ommx/doc/release_note.md`) covering the 3.0 line. Drop the per-alpha submodule (`crate::doc::release_note::v3_0_0_alpha_1` removed; `crate::doc::release_note` stays). 3. **Reframe the release note as topic-oriented "what & why"** — the API enumerations live in the Rust migration guide and the error handling tutorial; the release note links to them. Net 387 → 288 lines. 4. **Fix stale `metadata: ConstraintMetadata` references** in the Rust migration guide (those pre-dated the SoA refactor). 5. **Tag every section** of the Python migration guide with `(3.0.0aN, [#PR](url))` annotations and add three new sections covering 3.0.0a3 changes (DataFrame API, sidecar dfs, `AttachedX`). Rust and Python SDKs are versioned independently, so Rust release-note PR refs are bare `[#XXX](url)` and Python-guide tags use Python alpha labels (`3.0.0a1` / `3.0.0a2` / `3.0.0a3`). ## Rust release note (`rust/ommx/doc/release_note.md`) Topic-oriented v3 summary. Detail-heavy parts now defer to the Rust migration guide / tutorial: - `## 3.0` overview bullets — first-class special constraints, Stage / `ConstraintType` collapse, host-level serialization, per-collection SoA metadata stores, capability model, default `ommx::Result<T>` error surface, `v1_ext` removal. - `## Stage parameter and the ConstraintType trait` — motivation only; struct / trait code blocks moved to migration guide §1 and §New Types. - `## Collections and serialization` — `ConstraintCollection<T>` / `EvaluatedCollection<T>` / `SampledCollection<T>` rationale + host-level `to_bytes` / `from_bytes`. Migration guide owns the method lists. - `## Metadata storage: SoA store on the enclosing collection` — three store families (`ConstraintMetadataStore<ID>`, `VariableMetadataStore`, `NamedFunctionMetadataStore`), the `pub(crate)` mutator tightening, and a Python paragraph linking out to `PYTHON_SDK_MIGRATION_GUIDE.md` §9–11 with PR attributions for `AttachedX` (#849, #850, #852) and `*_df` (#846, #847). Rust-side bundle is (#843, #848, #850, #853). - `## Capability model` — `AdditionalCapability` / `Capabilities` / `required_capabilities()` / `reduce_capabilities(&supported)`. - `## Unified error surface` — default `ommx::Result<T>`, fail-site macros (`bail!` / `error!` / `ensure!`), narrow-domain parser exceptions, signal-type representative list. Full catalog deferred to the error handling tutorial. - `## Tracing-first observability` — acknowledges legacy `.context(...)` call sites still exist in `qplib/parser` and `artifact`. - `## Domain types replace v1_ext`, `## Other notable changes` — unchanged. ## Rust migration guide (`rust/ommx/doc/migration_guide.md`) - Remove stale `metadata: ConstraintMetadata` field references on `Constraint<S>` and the struct-construction examples. - New `### Metadata stores` section: per-host accessors on `Instance` / `ParametricInstance` (`constraint_metadata()` / `_mut()`, indicator / one-hot / sos1 variants, `variable_metadata()`, `named_function_metadata()`); how `Solution` / `SampleSet` reach constraint metadata via `evaluated_constraints().metadata()` etc.; the `ConstraintMetadataStore<ID>` API (per-field reads, `collect_for(id)`, write-through setters, bulk `insert` / `remove`); plus the `VariableMetadataStore` / `NamedFunctionMetadataStore` shape differences (no provenance; `NamedFunctionMetadataStore` also lacks `push_subscript` / `extend_subscripts`). - Add `metadata` field to `ConstraintCollection` / `EvaluatedCollection` / `SampledCollection` shape blocks. - Migration-checklist entry for `constraint.name` updated to point at the SoA accessors. ## Python migration guide (`PYTHON_SDK_MIGRATION_GUIDE.md`) - §6.6 wording: every `*_df` is a method. - New §9 — DataFrame API: methods, `kind=` / `include=` / `removed=` consolidating the per-kind / active-vs-removed family. `include=` defaults to `("metadata", "parameters")`; `"removed_reason"` opt-in (auto-enabled with `removed=True` on Instance / ParametricInstance). - New §10 — Long-format sidecar DataFrames (`constraint_metadata_df`, `constraint_provenance_df`, `variable_parameters_df`, …). - New §11 — `instance.constraints[id]` etc. now return `AttachedX` write-through handles. Three-column affected-types table (v2.5.1 / 3.0.0a2 / v3 final) so each hop is visible. `add_*` entry points and `attached.detach()`. - Convenience additions renumbered §9 → §12. - Per-section `(3.0.0aN, [#PR](url))` tag on every section in §1–§11; Overview themes (5, 6) and migration checklist extended. ## Test plan - [ ] `cargo doc -p ommx --no-deps` renders the release note and migration guide cleanly (verified locally). - [ ] Spot-check rendered `ommx::doc::release_note` and `ommx::doc::migration_guide` on the local `target/doc` build for cross-reference resolution and section ordering. - [ ] Spot-check rendered Python migration guide on GitHub: section numbering 1–12, all alpha+PR tags resolve. - [ ] No other doc / source file references `METADATA_STORAGE_V3.md` or `release_note::v3_0_0_alpha_1` (verified locally). 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wave 1.5 of the v3 alpha pandas surface, scoped to
include=filtering plus long-format / id-indexed sidecar DataFrames. Thekind=Literal[...]consolidation of today's per-kind*_dfmethods stays out of scope for this PR.include=parameter on every wide*_dfaccessor acrossInstance/ParametricInstance/Solution/SampleSet. Default("metadata", "parameters")matches the v2-equivalent shape;include=[]drops both,include=["metadata"]/include=["parameters"]keep just the named family. Unknown values raiseValueError. Implementation post-filters per-row dicts inentries_to_dataframebased on aIncludeFlagsderived from the argument — theToPandasEntryimpls themselves are unchanged.Property → method conversion for every
*_dfaccessor. v3-alpha breaking change:instance.constraints_df→instance.constraints_df(). The existing 11 in-tree call sites, the one doctest insolution.rs, and the Sphinx book (en + ja) are all updated.removed_reasons_df/*_removed_reasons_dfalso lose the property syntax for API consistency, even though they don't takeinclude=(no metadata / parameters columns to filter).Long-format / id-indexed sidecar DataFrames on Instance / ParametricInstance / Solution / SampleSet, reading directly from the SoA metadata stores:
constraint_metadata_df(kind=...)— id-indexed wide; columnsname/subscripts/description, index ={kind}_constraint_id.constraint_parameters_df(kind=...)— long format ({kind}_constraint_id,key,value). One row per (id, parameter_key) pair.constraint_provenance_df(kind=...)— long format ({kind}_constraint_id,step,source_kind,source_id). One row per (id, step) in the provenance chain.constraint_removed_reasons_df(kind=...)— long format ({kind}_constraint_id,reason,key,value). One row per (id, parameter_key) pair, plus one row withkey/value= NA for ids whose reason has no parameters.variable_metadata_df()— id-indexed wide; index =variable_id.variable_parameters_df()— long format (variable_id,key,value).kindaccepts"regular"/"indicator"/"one_hot"/"sos1"(default"regular"); unknown values raiseValueError. Index column names are kind-qualified (regular_constraint_id≠indicator_constraint_id) so accidental cross-id-space joins surface indf.head()/df.info(). Long-format*_parameters_df/*_removed_reasons_dfrows are sorted by(id, key), and empty long-format DataFrames keep their column schema instead of returning a column-less frame.Macro refactor of the per-host kind dispatch. Each of the four hosts gets a
constraint_kind_collection!macro that collapses the 16-armConstraintKindmatch (4 sidecars × 4 kinds) into 4 macro invocations. Net −354 lines across instance.rs / parametric_instance.rs / solution.rs / sample_set.rs.set_parameter_columnssorts keys before emittingparameters.{key}columns, so wide-*_dfparameter column order matches the long-format sort and stays process-stable even when the upstreamConstraint.set_parametersis given astd::HashMap(whose iteration is randomized per process). The helper is generic over the hasher so it accepts bothFnvHashMap(SoA stores) andstd::HashMap(v1::Parameter).Codex review pass caught three correctness bugs that are now fixed:
EvaluatedCollection.inner()already includes them — chaining.removed_reasons().keys()produced duplicate rows).WithMetadata<EvaluatedDecisionVariable>/WithMetadata<WithSampleIds<SampledDecisionVariable>>now emitparameters.{key}columns, soSolution.decision_variables_df(include=["parameters"])and the SampleSet equivalent return the right data.WithSampleIds<SampledNamedFunction>emitsparameters.{key}columns instead of a single bareparametersstring column, soapply_include_filterhonoursinclude=[]here too.Book updates (
docs/en/**/*.md+docs/ja/**/*.md): tutorial / user_guide / release-note pages flip to the method form,{attr}cross-references for*_dfswitch to{meth}, and a new "Unreleased" entry onrelease_note/ommx-3.0.mdcovers the property → method conversion,include=, and the six new sidecar accessors. 2.x release notes are added tonb_execution_excludepatterns(same treatment as 1.x — historical code samples reference removed APIs).Proposal doc (
METADATA_STORAGE_V3.md): reflects the narrow per-collection metadata accessors landed in feat(metadata-soa): move per-element metadata onto collection-level SoA stores #843, the explicit "no*_collection_mut()" policy, the actualinsert_with(id, c, metadata) -> ()shape, and now also the Wave 1.5 landing —include=and the sidecar accessors are marked landed (PR feat(pandas): include= and long-format sidecar dataframes #846), with the deferredkind=Literal[...]consolidation, Series accessors, and Attached wrappers spelled out as the remaining Wave 2 work.Out of scope
constraints_df(kind=Literal["regular", "indicator", "one_hot", "sos1"])consolidating today's per-kind*_constraints_dffamily — separate PR. This PR keeps the per-kind methods as-is.pandas.Series[ID -> Object]collection accessors and the Standalone / Attached two-mode wrappers — Wave 2 ofMETADATA_STORAGE_V3.md.NamedFunctionSoA migration.Test plan
task python:testpasses locally with the full PR.task python:book:testrebuilds en + ja docs cleanly with all code-cells executing.python/ommx-tests/tests/test_dataframe_include.py— 9 cases covering default vs. empty vs. metadata-only vs. parameters-only onInstance.decision_variables_df,Instance.constraints_df,Solution.decision_variables_df,SampleSet.decision_variables_df, plus an unknown-flagValueErrorcheck. Snapshots written to__snapshots__/test_dataframe_include.ambr.python/ommx-tests/tests/test_dataframe_sidecars.py— 16 cases covering all 6 sidecar accessors (id-indexed shape, long-format shape,kind=dispatch on every constraint family, qualified index names, empty-provenance baseline, provenance afterconvert_one_hot_to_constraint, removed-reasons afterrelax_constraint, removed-reasons with parameters after one-hot conversion, Solution / SampleSet parity with Instance) plus 4 regression tests for the Codex fixes (Solution / SampleSet sidecar dedup with a relaxed constraint, Solution / SampleSetdecision_variables_df(include=["parameters"])actually emits the columns). Snapshots in__snapshots__/test_dataframe_sidecars.ambr.pytestruns all pass — confirms determinism after the long-format(id, key)sort and the alphabetised wideparameters.{key}order.instance.mdalready in place before this PR but surfaced by the docs review) and two non-blocking notes that are scoped out (pd.DataFrame(columns=[...])producesobjectdtypes for the empty-frame case; historical 2.x release-note prose references*_dfwithout parens — kept as name reference, not invocation).🤖 Generated with Claude Code