Skip to content

feat(pandas): include= and long-format sidecar dataframes#846

Merged
termoshtt merged 11 commits intomainfrom
feat/dataframe-include-and-sidecars
Apr 28, 2026
Merged

feat(pandas): include= and long-format sidecar dataframes#846
termoshtt merged 11 commits intomainfrom
feat/dataframe-include-and-sidecars

Conversation

@termoshtt
Copy link
Copy Markdown
Member

@termoshtt termoshtt commented Apr 28, 2026

Summary

Wave 1.5 of the v3 alpha pandas surface, scoped to include= filtering plus long-format / id-indexed sidecar DataFrames. The kind=Literal[...] consolidation of today's per-kind *_df methods stays out of scope for this PR.

  • include= parameter on every wide *_df accessor across Instance / ParametricInstance / Solution / SampleSet. Default ("metadata", "parameters") matches the v2-equivalent shape; include=[] drops both, include=["metadata"] / include=["parameters"] keep just the named family. Unknown values raise ValueError. Implementation post-filters per-row dicts in entries_to_dataframe based on a IncludeFlags derived from the argument — the ToPandasEntry impls themselves are unchanged.

  • Property → method conversion for every *_df accessor. v3-alpha breaking change: instance.constraints_dfinstance.constraints_df(). The existing 11 in-tree call sites, the one doctest in solution.rs, and the Sphinx book (en + ja) are all updated. removed_reasons_df / *_removed_reasons_df also lose the property syntax for API consistency, even though they don't take include= (no metadata / parameters columns to filter).

  • Long-format / id-indexed sidecar DataFrames on Instance / ParametricInstance / Solution / SampleSet, reading directly from the SoA metadata stores:

    • constraint_metadata_df(kind=...) — id-indexed wide; columns name / subscripts / description, index = {kind}_constraint_id.
    • constraint_parameters_df(kind=...) — long format ({kind}_constraint_id, key, value). One row per (id, parameter_key) pair.
    • constraint_provenance_df(kind=...) — long format ({kind}_constraint_id, step, source_kind, source_id). One row per (id, step) in the provenance chain.
    • constraint_removed_reasons_df(kind=...) — long format ({kind}_constraint_id, reason, key, value). One row per (id, parameter_key) pair, plus one row with key / value = NA for ids whose reason has no parameters.
    • variable_metadata_df() — id-indexed wide; index = variable_id.
    • variable_parameters_df() — long format (variable_id, key, value).

    kind accepts "regular" / "indicator" / "one_hot" / "sos1" (default "regular"); unknown values raise ValueError. Index column names are kind-qualified (regular_constraint_idindicator_constraint_id) so accidental cross-id-space joins surface in df.head() / df.info(). Long-format *_parameters_df / *_removed_reasons_df rows are sorted by (id, key), and empty long-format DataFrames keep their column schema instead of returning a column-less frame.

  • Macro refactor of the per-host kind dispatch. Each of the four hosts gets a constraint_kind_collection! macro that collapses the 16-arm ConstraintKind match (4 sidecars × 4 kinds) into 4 macro invocations. Net −354 lines across instance.rs / parametric_instance.rs / solution.rs / sample_set.rs.

  • set_parameter_columns sorts keys before emitting parameters.{key} columns, so wide-*_df parameter column order matches the long-format sort and stays process-stable even when the upstream Constraint.set_parameters is given a std::HashMap (whose iteration is randomized per process). The helper is generic over the hasher so it accepts both FnvHashMap (SoA stores) and std::HashMap (v1::Parameter).

  • Codex review pass caught three correctness bugs that are now fixed:

    • Solution / SampleSet sidecar accessors no longer double-count removed constraints (EvaluatedCollection.inner() already includes them — chaining .removed_reasons().keys() produced duplicate rows).
    • WithMetadata<EvaluatedDecisionVariable> / WithMetadata<WithSampleIds<SampledDecisionVariable>> now emit parameters.{key} columns, so Solution.decision_variables_df(include=["parameters"]) and the SampleSet equivalent return the right data.
    • WithSampleIds<SampledNamedFunction> emits parameters.{key} columns instead of a single bare parameters string column, so apply_include_filter honours include=[] here too.
  • Book updates (docs/en/**/*.md + docs/ja/**/*.md): tutorial / user_guide / release-note pages flip to the method form, {attr} cross-references for *_df switch to {meth}, and a new "Unreleased" entry on release_note/ommx-3.0.md covers the property → method conversion, include=, and the six new sidecar accessors. 2.x release notes are added to nb_execution_excludepatterns (same treatment as 1.x — historical code samples reference removed APIs).

  • Proposal doc (METADATA_STORAGE_V3.md): reflects the narrow per-collection metadata accessors landed in feat(metadata-soa): move per-element metadata onto collection-level SoA stores #843, the explicit "no *_collection_mut()" policy, the actual insert_with(id, c, metadata) -> () shape, and now also the Wave 1.5 landing — include= and the sidecar accessors are marked landed (PR feat(pandas): include= and long-format sidecar dataframes #846), with the deferred kind=Literal[...] consolidation, Series accessors, and Attached wrappers spelled out as the remaining Wave 2 work.

Out of scope

  • constraints_df(kind=Literal["regular", "indicator", "one_hot", "sos1"]) consolidating today's per-kind *_constraints_df family — separate PR. This PR keeps the per-kind methods as-is.
  • pandas.Series[ID -> Object] collection accessors and the Standalone / Attached two-mode wrappers — Wave 2 of METADATA_STORAGE_V3.md.
  • NamedFunction SoA migration.

Test plan

  • task python:test passes locally with the full PR.
  • task python:book:test rebuilds en + ja docs cleanly with all code-cells executing.
  • Snapshot-based python/ommx-tests/tests/test_dataframe_include.py — 9 cases covering default vs. empty vs. metadata-only vs. parameters-only on Instance.decision_variables_df, Instance.constraints_df, Solution.decision_variables_df, SampleSet.decision_variables_df, plus an unknown-flag ValueError check. Snapshots written to __snapshots__/test_dataframe_include.ambr.
  • Snapshot-based python/ommx-tests/tests/test_dataframe_sidecars.py — 16 cases covering all 6 sidecar accessors (id-indexed shape, long-format shape, kind= dispatch on every constraint family, qualified index names, empty-provenance baseline, provenance after convert_one_hot_to_constraint, removed-reasons after relax_constraint, removed-reasons with parameters after one-hot conversion, Solution / SampleSet parity with Instance) plus 4 regression tests for the Codex fixes (Solution / SampleSet sidecar dedup with a relaxed constraint, Solution / SampleSet decision_variables_df(include=["parameters"]) actually emits the columns). Snapshots in __snapshots__/test_dataframe_sidecars.ambr.
  • 5 consecutive pytest runs all pass — confirms determinism after the long-format (id, key) sort and the alphabetised wide parameters.{key} order.
  • Two Codex full-PR review passes; the second came back with one MUST-FIX (a JA / EN inconsistency in instance.md already in place before this PR but surfaced by the docs review) and two non-blocking notes that are scoped out (pd.DataFrame(columns=[...]) produces object dtypes for the empty-frame case; historical 2.x release-note prose references *_df without parens — kept as name reference, not invocation).

🤖 Generated with Claude Code

termoshtt and others added 10 commits April 28, 2026 14:28
Reflect the narrow per-collection metadata accessors added on Instance /
ParametricInstance during PR #843 review (`*_metadata()` / `*_metadata_mut()`),
the explicit policy of not exposing `*_collection_mut()`, and the actual
`insert_with(id, c, metadata) -> ()` shape on `ConstraintCollection<T>`.
Drop the stale "Parse helpers exposed" claim — those `*_to_v1` helpers are
crate-internal and the element-level `from_bytes`/`to_bytes` rationale was
removed in #845. Add a Follow-ups section listing the still-`pub`
`active_mut`/`removed_mut`/`insert_with`, the deferred `NamedFunction` SoA
migration, and the Python Series / `include=` / sidecar wave 2 work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a Sequence[str] include= parameter to the wide DataFrame methods on
Instance / ParametricInstance / Solution / SampleSet. The default
("metadata", "parameters") preserves the v2-equivalent column shape;
include=[] yields core columns only; include=["metadata"] /
include=["parameters"] keep just the named family. Unknown values raise
ValueError.

Implementation: each per-row dict is post-filtered in
entries_to_dataframe based on the resolved IncludeFlags. The
ToPandasEntry impls are unchanged — they still emit every column
unconditionally and the helper drops the gated ones before the
DataFrame is built.

Mechanical breaking change: every `*_df` accessor turns from
#[getter] (property) into a regular method, so call sites must add
parentheses. Existing tests + the one doctest in solution.rs are
updated. removed_reasons_df / *_removed_reasons_df also lose their
property syntax for API consistency, even though they don't take
include= (no metadata/parameters columns to filter).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six new accessors on Instance / ParametricInstance / Solution / SampleSet
that read directly from the SoA metadata stores:

- constraint_metadata_df(kind=...) — id-indexed wide; columns name /
  subscripts / description, index = `{kind}_constraint_id`.
- constraint_parameters_df(kind=...) — long format `{kind}_constraint_id,
  key, value`. One row per (id, parameter_key) pair; ids without
  parameters contribute zero rows.
- constraint_provenance_df(kind=...) — long format
  `{kind}_constraint_id, step, source_kind, source_id`. One row per
  (id, step) in the provenance chain. Empty for directly-authored
  constraints.
- constraint_removed_reasons_df(kind=...) — long format
  `{kind}_constraint_id, reason, key, value`. One row per
  (id, parameter_key) pair, plus one row with key/value=NA for ids
  whose reason has no parameters.
- variable_metadata_df() — id-indexed wide for decision variables;
  index = `variable_id`.
- variable_parameters_df() — long format `variable_id, key, value`.

`kind` accepts "regular" / "indicator" / "one_hot" / "sos1" (default
"regular"); unknown values raise ValueError. Index column names are
kind-qualified (`regular_constraint_id` ≠ `indicator_constraint_id`)
so accidental cross-id-space joins are visible in df.head() etc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the proposal doc to mark `include=` and the long-format sidecar
dataframes as landed, leaving only the Series accessors, the two-mode
Attached wrappers, and the `kind=Literal[...]` consolidation as
deferred to Wave 2.

Specifically:
- Status line: split out Wave 1.5 from the deferred bucket.
- Python target shape: replace "*_df still v2-style; include=/sidecars
  deferred" with the actual landed shape (`*_df()` is a method,
  `include=Sequence[str]`, six sidecar accessors).
- Resolved decisions: clarify that `kind=` is `str` today (not
  `Literal[...]`), and that `include=("removed_reasons",)` is not
  part of the landed include= surface.
- Avoiding cross-ID-space joins: distinguish the qualified index
  names that already ship on sidecars from the wide-`*_df` rename
  bundled with the deferred consolidation.
- ToPandasEntry restructuring: add the post-filter / sidecar-builder
  paths landed in PR #846.
- Breaking changes: new "Landed in PR #846" subsection covering the
  property → method conversion plus include= and sidecar shapes;
  drop the include=/sidecar bullets from "Deferred to follow-up
  wave".
- Follow-ups: add the `kind=Literal[...]` + `@overload`
  consolidation and the `include=("removed_reasons",)` pivot as
  explicit deferred items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace explicit per-cell assertions on `*_df()` output with
snapshot-based checks. The `.ambr` files are the authoritative
description of each method's column / index schema under each
configuration; update via `pytest --snapshot-update` after a
deliberate API change.

Behavioral assertions stay explicit where snapshotting doesn't help:
- `test_unknown_include_flag_raises_value_error` /
  `test_unknown_kind_raises_value_error` (ValueError checks).
- `test_solution_constraint_metadata_df_matches_instance` /
  `test_sample_set_variable_metadata_df_matches_instance` (parity
  via `pd.testing.assert_frame_equal`).

Also sort long-format `*_parameters_df` and
`*_removed_reasons_df` rows by `(id, key)` in the Rust builder.
The upstream `Constraint.set_parameters` accepts
`std::HashMap<String, String>`, whose iteration order is
randomized per process — without an explicit sort the snapshot
would flap between runs. Sorted output is also more useful for
human readers of the DataFrame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. Solution / SampleSet sidecar accessors no longer double-count
   removed constraints. EvaluatedCollection.inner() and
   SampledCollection.inner() already include removed ids alongside
   active ones; chaining `.removed_reasons().keys()` was producing
   duplicate rows in `constraint_metadata_df` /
   `constraint_parameters_df` / `constraint_provenance_df` for every
   relaxed constraint. Drop the redundant chain — 24 sites across
   solution.rs / sample_set.rs.

2. `WithMetadata<EvaluatedDecisionVariable>` and
   `WithMetadata<WithSampleIds<SampledDecisionVariable>>` now emit
   `parameters.{key}` columns. Pre-existing gap that this PR
   surfaced via `include=["parameters"]`: the unevaluated
   `DecisionVariable` impl already called `set_parameter_columns`,
   but the evaluated / sampled variants silently dropped the
   parameter map.

3. `WithSampleIds<SampledNamedFunction>` now emits
   `parameters.{key}` columns instead of a single bare
   `parameters` column with `"k=v"`-formatted strings. Brings the
   shape in line with `NamedFunction` / `EvaluatedNamedFunction`,
   and lets `apply_include_filter` honour `include=[]` for this
   accessor.

Add four regression tests in test_dataframe_sidecars.py:
- Solution / SampleSet sidecar dedup with a relaxed constraint.
- Solution / SampleSet decision_variables_df(include=["parameters"])
  actually emits the parameter columns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small follow-ups surfaced by the self-review pass.

1. **`constraint_kind_collection!` macro** in instance.rs /
   parametric_instance.rs / solution.rs / sample_set.rs collapses the
   16-arm `ConstraintKind` match (4 sidecars × 4 kinds) per host into 4
   macro invocations. Net -354 lines across the four files; the body
   passed to the macro is the only thing that varies between sidecars,
   and `coll` is rebound to the right collection per kind.

2. **`set_parameter_columns` sorts keys** before emitting the
   `parameters.{key}` columns. Wide `*_df` parameter columns now match
   the lexicographic order used by the long-format `*_parameters_df`
   builders, and become process-stable even when the upstream
   `Constraint.set_parameters` is given a `std::HashMap` (whose
   iteration is randomized per process). The helper also turns generic
   over the hasher so it accepts both `FnvHashMap` (SoA stores) and
   `std::HashMap` (`v1::Parameter`).

3. **Empty long-format dfs keep the schema.** `long_format_dataframe`
   now takes an explicit `columns: &[&str]` and passes it to
   `pd.DataFrame(columns=...)` when `entries` is empty, so empty
   results are usable in `pd.concat` and column-checking code instead
   of returning a column-less DataFrame.

4. **`v1::Parameter` ToPandasEntry** uses `set_parameter_columns`
   instead of an inline `format!("parameters.{key}")` loop, picking up
   the sort and matching the other 5 sites.

Snapshot updates: 3 frames pick up the new lex order on
`parameters.role` / `parameters.shard` and the explicit empty schema
on `constraint_provenance_df`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tutorial / user_guide / release-note pages updated for the v3.0.0a3
DataFrame API:

- All `*_df` accesses in MyST code-cells now call the method form
  (`solution.constraints_df()`), matching the property→method
  conversion that landed in PR #846.
- `{attr}` cross-references for `*_df` accessors switch to `{meth}`
  in tables and prose: `capability_model.md`, `special_constraints.md`,
  `release_note/ommx-3.0.md` (en + ja). Non-`_df` properties
  (`removed_one_hot_constraints`, etc.) keep `{attr}`.
- "property" → "method" wording fixes in
  `tutorial/solve_with_ommx_adapter.md` (en + ja) for the prose that
  describes what `decision_variables_df` / `constraints_df` return.
- New "Unreleased" entry on `release_note/ommx-3.0.md` (en + ja)
  documenting the property→method conversion, the `include=`
  parameter on the wide `*_df` methods, and the six new long-format
  / id-indexed sidecar accessors with their `kind=` dispatch.

`task python:book:test` builds cleanly with all code-cells executing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The JA instance guide showed `instance.decision_variables` (the
list-returning property) where the surrounding prose promised a
`pandas.DataFrame`, while the EN counterpart was already correct as
`instance.decision_variables_df()`. Pre-existing inconsistency surfaced
by the Codex re-review of PR #846.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same treatment as 1.x — the 2.x release notes pin code samples to API
shapes that no longer compile under v3 (e.g. property-style `*_df`
accessors, the bytes-on-element-types methods removed in #845). Build
the pages but don't try to execute them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@termoshtt termoshtt marked this pull request as ready for review April 28, 2026 08:36
Copilot AI review requested due to automatic review settings April 28, 2026 08:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Python (PyO3) pandas-facing surface to support include= filtering on wide *_df accessors, converts *_df properties into methods, and adds long-format / id-indexed “sidecar” DataFrame accessors backed directly by the SoA metadata stores.

Changes:

  • Add include= (metadata/parameters gating) across wide *_df methods and implement filtering in entries_to_dataframe.
  • Convert all *_df accessors from #[getter] properties to methods and update in-tree call sites/docs/tests.
  • Add sidecar DataFrame accessors for constraint/variable metadata, parameters, provenance, and removed reasons (with kind= dispatch for constraint families).

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
python/ommx/src/pandas.rs Adds IncludeFlags, sidecar DataFrame builders, deterministic parameter-column ordering, and entries_to_dataframe include-filtering.
python/ommx/src/instance.rs Converts wide *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/parametric_instance.rs Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/solution.rs Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/src/sample_set.rs Converts *_df accessors to methods with include=, and adds sidecar constraint/variable DataFrame accessors.
python/ommx/ommx/_ommx_rust/init.pyi Updates stubs from properties to methods, adds include= and sidecar accessor signatures/docs.
python/ommx-tests/tests/test_dataframe_include.py Adds tests covering include= behavior and validation.
python/ommx-tests/tests/test_dataframe_sidecars.py Adds tests for all sidecar accessors, kind dispatch, schema stability, and regressions.
python/ommx-tests/tests/snapshots/test_dataframe_include.ambr Snapshot baselines for include= shapes.
python/ommx-tests/tests/snapshots/test_dataframe_sidecars.ambr Snapshot baselines for sidecar DataFrame shapes.
python/ommx-tests/tests/test_one_hot_sos1_constraints.py Updates call sites from property access to method calls for *_df.
python/ommx-tests/tests/test_indicator_constraint.py Updates call sites from property access to method calls for *_df.
docs/conf_base.py Expands notebook execution exclude patterns for release notes.
docs/ja/user_guide/instance.md Updates examples to call *_df() methods.
docs/ja/user_guide/parametric_instance.md Updates examples to call *_df() methods.
docs/ja/user_guide/solution.md Updates examples to call *_df() methods.
docs/ja/user_guide/sample_set.md Updates examples to call *_df() methods.
docs/ja/user_guide/special_constraints.md Updates refs/examples from {attr}/property style to {meth}/method style.
docs/ja/user_guide/capability_model.md Updates refs from {attr} to {meth} for removed-constraints DataFrame accessors.
docs/ja/tutorial/tsp_sampling_with_openjij_adapter.md Updates examples to call *_df() methods.
docs/ja/tutorial/switching_adapters.md Updates examples to call *_df() methods.
docs/ja/tutorial/solve_with_ommx_adapter.md Updates narrative/examples to call *_df() methods.
docs/ja/tutorial/share_in_ommx_artifact.md Updates examples to call *_df() methods.
docs/ja/release_note/ommx-3.0.md Documents breaking change (methods), include=, and new sidecar DataFrames.
docs/en/user_guide/instance.md Updates examples to call *_df() methods.
docs/en/user_guide/parametric_instance.md Updates examples to call *_df() methods.
docs/en/user_guide/solution.md Updates examples to call *_df() methods.
docs/en/user_guide/sample_set.md Updates examples to call *_df() methods.
docs/en/user_guide/special_constraints.md Updates refs/examples from {attr}/property style to {meth}/method style.
docs/en/user_guide/capability_model.md Updates refs from {attr} to {meth} for removed-constraints DataFrame accessors.
docs/en/tutorial/tsp_sampling_with_openjij_adapter.md Updates examples to call *_df() methods.
docs/en/tutorial/switching_adapters.md Updates examples to call *_df() methods.
docs/en/tutorial/solve_with_ommx_adapter.md Updates narrative/examples to call *_df() methods.
docs/en/tutorial/share_in_ommx_artifact.md Updates examples to call *_df() methods.
docs/en/release_note/ommx-3.0.md Documents breaking change (methods), include=, and new sidecar DataFrames.
METADATA_STORAGE_V3.md Updates proposal/status text to reflect landed include= + sidecar DataFrames and follow-ups.

Comment thread python/ommx/src/instance.rs Outdated
Comment thread python/ommx/src/solution.rs Outdated
Comment thread python/ommx/src/sample_set.rs Outdated
Comment thread python/ommx/src/sample_set.rs Outdated
Comment thread python/ommx/src/sample_set.rs
Comment thread python/ommx/ommx/_ommx_rust/__init__.pyi
Comment thread python/ommx/src/parametric_instance.rs Outdated
Comment thread python/ommx/ommx/_ommx_rust/__init__.pyi
Comment thread python/ommx/ommx/_ommx_rust/__init__.pyi
- Centralize `constraint_kind_collection!` macro in `crate::pandas`. Each
  host (Instance / ParametricInstance / Solution / SampleSet) now passes
  its own per-kind accessor names via the macro, so the kind dispatch
  shape lives in one place.
- Fix `{attr}` cross-references that point to `*_df` accessors. Those
  accessors became methods, so the cross-references should use `{meth}`
  (8 sites in solution.rs / sample_set.rs).
- Update `SampleSet.named_functions_df` docstring to document
  `parameters.{key}` columns (the implementation already emits them via
  `set_parameter_columns` — only the docstring claimed a single
  `parameters` column).

Stub regenerate picks up all three changes in `_ommx_rust/__init__.pyi`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@termoshtt termoshtt merged commit 227ff31 into main Apr 28, 2026
30 checks passed
@termoshtt termoshtt deleted the feat/dataframe-include-and-sidecars branch April 28, 2026 09:02
@termoshtt termoshtt added the python Changes in Python SDK label Apr 28, 2026
termoshtt added a commit that referenced this pull request May 1, 2026
…th alpha+PR

Rust release note (v3_0_0_alpha_1):
- Lead now mentions per-collection metadata stores alongside the
  collection-level serialization theme.
- Removed the stale `metadata: ConstraintMetadata` field from the
  example `Constraint<S>` struct (matches the actual code).
- New "Metadata storage" section covering ConstraintMetadataStore /
  VariableMetadataStore / NamedFunctionMetadataStore, the per-host
  accessors, the `pub(crate)` mutator tightening, and the Python-side
  AttachedX / *_df consequences. Lists the contributing PRs (#843,
  #846-#850, #852, #853) inline.

Python migration guide:
- Sections that describe 3.0.0a3 changes get a `(3.0.0aN, [#PR](url))`
  tag in the heading: §6.6 (method, not property), §9 (DataFrame API),
  §10 (long-format sidecars), §11 (AttachedX). Single v2 → v3 framing
  preserved; per-section attribution makes "where did this land?"
  obvious without splitting the doc by alpha.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt added a commit that referenced this pull request May 1, 2026
BLOCKING (Rust migration guide):
- The "mirrored on Solution / SampleSet" claim for the
  `*_constraint_metadata()` accessors is wrong — Solution / SampleSet
  do not flatten constraint metadata to host-level methods. They expose
  variable / named-function stores at the host (`solution.variable_metadata()`,
  `sample_set.variable_metadata()`, etc.) but constraint metadata is
  reached through the evaluated / sampled collection getter and then
  `.metadata()` on the collection (e.g.
  `solution.evaluated_constraints().metadata()`,
  `sample_set.indicator_constraints().metadata()`). Section split into
  Instance / ParametricInstance vs. Solution / SampleSet examples.

NICE-TO-FIX:
- §6.6 tag was missing the primary PR. The `parameters` (list) /
  `parameters_df` (DataFrame) split landed in #774 (3.0.0a1) when
  ParametricInstance became a Rust re-export; #846 (3.0.0a3) only
  flipped `_df` from `#[getter]` to method. Tag now lists both alphas
  with the corresponding PRs and the prose disambiguates.
- `VariableMetadataStore` / `NamedFunctionMetadataStore` shape note:
  not just "provenance omitted". `NamedFunctionMetadataStore` also
  lacks the `push_subscript` / `extend_subscripts` append helpers
  (callers use `set_subscripts(id, new_vec)` to extend). Note revised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt added a commit that referenced this pull request May 1, 2026
BLOCKING:
- Per-element to_bytes / from_bytes paragraph: rewritten. The previous
  text claimed Constraint / EvaluatedConstraint / SampledConstraint
  "still expose to_bytes/from_bytes that take an explicit ConstraintID"
  — these methods do not exist (rg "fn (to_bytes|from_bytes)" in
  rust/ommx/src returns only the host types and Function). The
  paragraph now says per-element to_bytes/from_bytes are no longer
  provided on any constraint kind, motivated by id + metadata living
  on the host.
- "Unified error surface" paragraph: softened from "the public API now
  returns a single error type" to "the default error type is
  ommx::Result<T>". The signal types (BoundError,
  DecisionVariableError, DuplicatedSampleIDError, SubstitutionError,
  ...) are surfaced typed at specific public APIs (Bound::new,
  DecisionVariable::new / set_bound / substitute, Sampled::append, the
  Substitute trait family, ...) — they are not internal-only signals
  reachable solely via downcast. The list now spells out where each is
  returned typed.

NICE-TO-FIX:
- "PropagateOutcome records a Provenance chain" was wrong. The enum
  has Active(T) / Consumed(T) / Transformed { original, new } — no
  Provenance field. Provenance is appended by callers writing into
  ConstraintMetadataStore. Bullet rewritten to describe the actual
  variants and the caller-side provenance step.
- Metadata-storage section over-attributed Python-only PRs as Rust
  evidence. Rust-side bundle trimmed to (#843, #848, #850, #853);
  #846, #847, #849, #852 moved into per-bullet citations on the
  Python-side paragraph.
- Tracing observability paragraph softened: it now acknowledges that
  legacy `.context(...)` call sites in narrow-domain parsers and the
  artifact layer remain, while new internal fail sites use the crate
  fail-site macros.

Other minor:
- "Serialization moves to the host level" wording (was "collection
  level") and ParametricInstance bullet mentions the sibling
  VariableMetadataStore / NamedFunctionMetadataStore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt added a commit that referenced this pull request May 1, 2026
Three NICE-TO-FIX items from Codex's fourth review pass:

- Python paragraph in §Metadata storage: split the single PR-list
  paragraph into two bullets so AttachedX (#849, #850, #852) and
  *_df (#846, #847) attributions stand on their own without the
  reader needing to cross-check the Python guide.
- Capability model wording (§3.0 overview bullet + §Capability model
  body): "any OMMX Instance can be fed to any adapter" overstated
  reduce_capabilities' infallibility. It returns Result, e.g.
  the Big-M encoding requires finite bounds. Both the bullet and the
  body sentence now say "valid OMMX Instance" and call out the
  fallible path with Err(ommx::Error).
- "Full signal-type list" -> "curated signal-type list" in the
  cross-reference to the error handling tutorial, matching the
  tutorial's own framing (it already uses "curated set").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
termoshtt added a commit that referenced this pull request May 1, 2026
## Summary

Documentation hygiene pass for v3:

1. **Drop `METADATA_STORAGE_V3.md`** — design doc whose waves all
landed.
2. **Consolidate Rust release note** into one file
(`rust/ommx/doc/release_note.md`) covering the 3.0 line. Drop the
per-alpha submodule (`crate::doc::release_note::v3_0_0_alpha_1` removed;
`crate::doc::release_note` stays).
3. **Reframe the release note as topic-oriented "what & why"** — the API
enumerations live in the Rust migration guide and the error handling
tutorial; the release note links to them. Net 387 → 288 lines.
4. **Fix stale `metadata: ConstraintMetadata` references** in the Rust
migration guide (those pre-dated the SoA refactor).
5. **Tag every section** of the Python migration guide with `(3.0.0aN,
[#PR](url))` annotations and add three new sections covering 3.0.0a3
changes (DataFrame API, sidecar dfs, `AttachedX`).

Rust and Python SDKs are versioned independently, so Rust release-note
PR refs are bare `[#XXX](url)` and Python-guide tags use Python alpha
labels (`3.0.0a1` / `3.0.0a2` / `3.0.0a3`).

## Rust release note (`rust/ommx/doc/release_note.md`)

Topic-oriented v3 summary. Detail-heavy parts now defer to the Rust
migration guide / tutorial:

- `## 3.0` overview bullets — first-class special constraints, Stage /
`ConstraintType` collapse, host-level serialization, per-collection SoA
metadata stores, capability model, default `ommx::Result<T>` error
surface, `v1_ext` removal.
- `## Stage parameter and the ConstraintType trait` — motivation only;
struct / trait code blocks moved to migration guide §1 and §New Types.
- `## Collections and serialization` — `ConstraintCollection<T>` /
`EvaluatedCollection<T>` / `SampledCollection<T>` rationale + host-level
`to_bytes` / `from_bytes`. Migration guide owns the method lists.
- `## Metadata storage: SoA store on the enclosing collection` — three
store families (`ConstraintMetadataStore<ID>`, `VariableMetadataStore`,
`NamedFunctionMetadataStore`), the `pub(crate)` mutator tightening, and
a Python paragraph linking out to `PYTHON_SDK_MIGRATION_GUIDE.md` §9–11
with PR attributions for `AttachedX` (#849, #850, #852) and `*_df`
(#846, #847). Rust-side bundle is (#843, #848, #850, #853).
- `## Capability model` — `AdditionalCapability` / `Capabilities` /
`required_capabilities()` / `reduce_capabilities(&supported)`.
- `## Unified error surface` — default `ommx::Result<T>`, fail-site
macros (`bail!` / `error!` / `ensure!`), narrow-domain parser
exceptions, signal-type representative list. Full catalog deferred to
the error handling tutorial.
- `## Tracing-first observability` — acknowledges legacy `.context(...)`
call sites still exist in `qplib/parser` and `artifact`.
- `## Domain types replace v1_ext`, `## Other notable changes` —
unchanged.

## Rust migration guide (`rust/ommx/doc/migration_guide.md`)

- Remove stale `metadata: ConstraintMetadata` field references on
`Constraint<S>` and the struct-construction examples.
- New `### Metadata stores` section: per-host accessors on `Instance` /
`ParametricInstance` (`constraint_metadata()` / `_mut()`, indicator /
one-hot / sos1 variants, `variable_metadata()`,
`named_function_metadata()`); how `Solution` / `SampleSet` reach
constraint metadata via `evaluated_constraints().metadata()` etc.; the
`ConstraintMetadataStore<ID>` API (per-field reads, `collect_for(id)`,
write-through setters, bulk `insert` / `remove`); plus the
`VariableMetadataStore` / `NamedFunctionMetadataStore` shape differences
(no provenance; `NamedFunctionMetadataStore` also lacks `push_subscript`
/ `extend_subscripts`).
- Add `metadata` field to `ConstraintCollection` / `EvaluatedCollection`
/ `SampledCollection` shape blocks.
- Migration-checklist entry for `constraint.name` updated to point at
the SoA accessors.

## Python migration guide (`PYTHON_SDK_MIGRATION_GUIDE.md`)

- §6.6 wording: every `*_df` is a method.
- New §9 — DataFrame API: methods, `kind=` / `include=` / `removed=`
consolidating the per-kind / active-vs-removed family. `include=`
defaults to `("metadata", "parameters")`; `"removed_reason"` opt-in
(auto-enabled with `removed=True` on Instance / ParametricInstance).
- New §10 — Long-format sidecar DataFrames (`constraint_metadata_df`,
`constraint_provenance_df`, `variable_parameters_df`, …).
- New §11 — `instance.constraints[id]` etc. now return `AttachedX`
write-through handles. Three-column affected-types table (v2.5.1 /
3.0.0a2 / v3 final) so each hop is visible. `add_*` entry points and
`attached.detach()`.
- Convenience additions renumbered §9 → §12.
- Per-section `(3.0.0aN, [#PR](url))` tag on every section in §1–§11;
Overview themes (5, 6) and migration checklist extended.

## Test plan

- [ ] `cargo doc -p ommx --no-deps` renders the release note and
migration guide cleanly (verified locally).
- [ ] Spot-check rendered `ommx::doc::release_note` and
`ommx::doc::migration_guide` on the local `target/doc` build for
cross-reference resolution and section ordering.
- [ ] Spot-check rendered Python migration guide on GitHub: section
numbering 1–12, all alpha+PR tags resolve.
- [ ] No other doc / source file references `METADATA_STORAGE_V3.md` or
`release_note::v3_0_0_alpha_1` (verified locally).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Changes in Python SDK

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants