diff --git a/openspec/changes/ooxml-paragraph-index-semantics/.openspec.yaml b/openspec/changes/ooxml-paragraph-index-semantics/.openspec.yaml
new file mode 100644
index 0000000..2e10e34
--- /dev/null
+++ b/openspec/changes/ooxml-paragraph-index-semantics/.openspec.yaml
@@ -0,0 +1,4 @@
+schema: spec-driven
+created: 2026-05-02
+created_by: che cheng
+created_with: claude
diff --git a/openspec/changes/ooxml-paragraph-index-semantics/design.md b/openspec/changes/ooxml-paragraph-index-semantics/design.md
new file mode 100644
index 0000000..9f49796
--- /dev/null
+++ b/openspec/changes/ooxml-paragraph-index-semantics/design.md
@@ -0,0 +1,77 @@
+## Context
+
+`WordDocument` currently mixes at least three index spaces under similar integer labels:
+
+- `insertParagraph(_:at:)` uses a raw top-level `body.children` index.
+- `updateParagraph(at:)`, `deleteParagraph(at:)`, formatting APIs, and list APIs translate through top-level paragraphs only.
+- `getParagraphs()` returns a recursive paragraph list that includes nested containers and is not compatible with either top-level index space.
+
+This ambiguity affects downstream automation because agents often pass the count or position returned by one API into another. `che-word-mcp` already compensates at selected call sites, but the library contract remains unclear.
+
+## Goals / Non-Goals
+
+**Goals:**
+
+- Make every paragraph/body mutation API name reveal its index basis.
+- Preserve source compatibility during one minor release while producing deprecation warnings for ambiguous legacy entry points.
+- Avoid a same-signature semantic swap that would silently change existing callers.
+- Centralize index validation so body-child and paragraph-only semantics are implemented consistently.
+- Give `che-word-mcp` a clear migration target.
+
+**Non-Goals:**
+
+- Introduce compile-time index wrapper types in this change.
+- Change recursive read ordering from `getParagraphs()`.
+- Implement downstream `che-word-mcp` migration in the Spectra proposal PR.
+
+## Decisions
+
+### Use explicit labels instead of changing same-signature semantics immediately
+
+Adopt Option A's underlying semantic direction: body insertion and structural mutation APIs use top-level body-child positions, while paragraph-only operations use explicitly named paragraph positions. However, do not immediately change the behavior of existing same-signature methods such as `updateParagraph(at:)` in a minor release.
+
+Rationale: Swift cannot expose both `updateParagraph(at:)` with legacy paragraph-index behavior and `updateParagraph(at:)` with body-child behavior at the same time. Reusing the same signature with new behavior would silently break existing callers and violate the library's human-like operation principles. A staged rename/deprecation path gives callers compiler-visible migration guidance.
+
+Alternatives considered:
+
+- Option A immediate semantic swap: rejected for minor release because existing callers would compile and behave differently.
+- Option B paragraph-index unification: rejected because body-level insertion must be able to target positions around tables, content controls, and body-level markers.
+- Option C index newtypes: deferred because it is the most correct long-term surface but too invasive for the immediate cleanup.
+
+### Introduce body-child APIs and preserve ambiguous methods during one minor release
+
+Add explicit body-child APIs for mutation points that operate on top-level `body.children`, for example `updateParagraph(atBodyChildIndex:text:)` and `deleteParagraph(atBodyChildIndex:)`. Existing ambiguous APIs stay available for one minor release with deprecation warnings and unchanged behavior.
+
+Rationale: callers get a mechanical migration path without losing the ability to compile. New code can immediately select the correct index space.
+
+### Introduce paragraph-index APIs for paragraph-only operations
+
+Add explicitly named paragraph-index APIs for operations that intentionally target the nth top-level paragraph, for example `updateTopLevelParagraph(atParagraphIndex:text:)`, `deleteTopLevelParagraph(atParagraphIndex:)`, `formatTopLevelParagraph(atParagraphIndex:with:)`, and list-formatting equivalents.
+
+Rationale: paragraph-only operations remain useful, but their names must not imply compatibility with raw body-child positions or recursive paragraph reads.
+
+### Document recursive paragraph reads as non-index-compatible
+
+Document `getParagraphs()` as a read convenience that returns recursive paragraph content and does not produce indexes suitable for body mutation APIs.
+
+Rationale: this is the most common agent mistake: use a recursive read result, then pass that ordinal into a top-level mutator. Documentation and named APIs must prevent that confusion.
+
+## Risks / Trade-offs
+
+- Existing callers may ignore deprecation warnings -> migration guide and `che-word-mcp` follow-up PR use the explicit labels immediately.
+- More API names increase surface area -> each name carries its index basis, reducing hidden semantics.
+- Body-child APIs need clear refusal when the target is not a paragraph -> throw `WordError.invalidIndex` or a more specific existing-compatible error rather than mutating a different paragraph.
+- Future index newtypes remain deferred -> leave names compatible with a later typed-index overload if the team chooses Option C.
+
+## Migration Plan
+
+1. Add shared internal helpers for `bodyChildIndex` and `topLevelParagraphIndex` validation.
+2. Add explicit body-child and paragraph-index public APIs.
+3. Deprecate ambiguous legacy APIs without changing runtime behavior in the same minor release.
+4. Migrate `che-word-mcp` callers to the explicit APIs in a separate implementation PR.
+5. Remove or repurpose ambiguous legacy APIs only in the next major release, after callers have compiler-visible warning time.
+
+## Open Questions
+
+- Whether the eventual major release removes ambiguous `at:` APIs entirely or reintroduces them only for body-child semantics.
+- Whether Option C typed indexes become mandatory after the explicit-label migration proves stable.
diff --git a/openspec/changes/ooxml-paragraph-index-semantics/proposal.md b/openspec/changes/ooxml-paragraph-index-semantics/proposal.md
new file mode 100644
index 0000000..c812eb2
--- /dev/null
+++ b/openspec/changes/ooxml-paragraph-index-semantics/proposal.md
@@ -0,0 +1,36 @@
+## Why
+
+`ooxml-swift` currently exposes paragraph mutator APIs whose `at: Int` parameters mean different things depending on the method: some use raw `body.children` positions while others translate through top-level paragraph-only indexes. This creates silent caller confusion and already forced downstream `che-word-mcp` to carry call-site workarounds.
+
+## What Changes
+
+- Standardize document body insertion and mutation APIs around explicitly named index bases, with body-position APIs using top-level `body.children` indexes.
+- Add explicitly named paragraph-only APIs for callers that need nth top-level paragraph semantics.
+- Add shared internal helpers for body-child and top-level-paragraph index validation instead of repeating ad-hoc `paragraphIndices` translation.
+- Deprecate legacy paragraph-only overloads whose names do not state their index basis, with a one-minor-release warning period before any major removal.
+- Document that recursive read APIs such as `getParagraphs()` are not index-compatible with top-level body mutation APIs.
+
+## Non-Goals
+
+- Do not introduce compile-time index newtypes in this change. They remain a future option if callers continue mixing index spaces after the naming cleanup.
+- Do not change recursive `getParagraphs()` ordering or scope.
+- Do not migrate downstream `che-word-mcp` behavior in this proposal; that belongs in a coordinated follow-up implementation PR.
+
+## Capabilities
+
+### New Capabilities
+
+(none)
+
+### Modified Capabilities
+
+- `ooxml-document-part-mutations`: define consistent paragraph/body index semantics for `WordDocument` mutator APIs.
+
+## Impact
+
+- Affected specs: `ooxml-document-part-mutations`
+- Affected code:
+  - Modified: packages/ooxml-swift/Sources/OOXMLSwift/Models/Document.swift
+  - Modified: packages/ooxml-swift/Tests/OOXMLSwiftTests
+  - Modified: mcp/che-word-mcp/Sources
+- Related issue: PsychQuant/ooxml-swift#10
diff --git a/openspec/changes/ooxml-paragraph-index-semantics/specs/ooxml-document-part-mutations/spec.md b/openspec/changes/ooxml-paragraph-index-semantics/specs/ooxml-document-part-mutations/spec.md
new file mode 100644
index 0000000..24804fa
--- /dev/null
+++ b/openspec/changes/ooxml-paragraph-index-semantics/specs/ooxml-document-part-mutations/spec.md
@@ -0,0 +1,57 @@
+## ADDED Requirements
+
+### Requirement: Body-child mutation APIs expose body-child index semantics
+
+`WordDocument` mutation APIs that target top-level document body positions SHALL expose parameter labels that include `bodyChildIndex` when the integer refers to `body.children`. Such APIs SHALL validate against the top-level body-child collection and SHALL NOT translate through a paragraph-only list.
+
+#### Scenario: Body-child insert targets the table position
+
+- **WHEN** a document body contains `[paragraph("A"), table, paragraph("B")]` and the caller inserts `paragraph("X")` at body child index `1`
+- **THEN** the document body order SHALL become `[paragraph("A"), paragraph("X"), table, paragraph("B")]`
+
+#### Scenario: Body-child update refuses non-paragraph targets
+
+- **WHEN** a document body contains `[paragraph("A"), table, paragraph("B")]` and the caller updates a paragraph at body child index `1`
+- **THEN** the operation SHALL throw an index or target-type error
+- **AND** the operation SHALL NOT update `paragraph("B")`
+
+### Requirement: Top-level paragraph mutation APIs expose paragraph-index semantics
+
+`WordDocument` mutation APIs that intentionally target the nth top-level paragraph SHALL expose parameter labels that include `topLevelParagraphIndex` or `atParagraphIndex`. Such APIs SHALL translate only among top-level body paragraphs and SHALL NOT include table-cell paragraphs, header/footer paragraphs, footnote paragraphs, endnote paragraphs, or block-level content-control descendants.
+
+#### Scenario: Top-level paragraph update skips tables but keeps index basis explicit
+
+- **WHEN** a document body contains `[paragraph("A"), table, paragraph("B")]` and the caller updates top-level paragraph index `1` to `"C"`
+- **THEN** the document body order SHALL become `[paragraph("A"), table, paragraph("C")]`
+
+#### Scenario: Nested paragraph is not addressable by top-level paragraph index
+
+- **WHEN** a document body contains `[paragraph("A"), table(cell paragraph "T"), paragraph("B")]`
+- **THEN** top-level paragraph index `1` SHALL refer to `paragraph("B")`
+- **AND** it SHALL NOT refer to the table-cell paragraph `"T"`
+
+### Requirement: Ambiguous legacy index APIs preserve behavior during deprecation
+
+Existing `WordDocument` APIs whose integer labels do not name the index basis SHALL NOT change runtime semantics in a minor release. They SHALL be marked deprecated with messages that name the replacement body-child or paragraph-index API.
+
+#### Scenario: Deprecated paragraph-only update keeps existing behavior
+
+- **WHEN** existing caller code invokes a deprecated paragraph-only API on `[paragraph("A"), table, paragraph("B")]` with index `1`
+- **THEN** the minor-release runtime behavior SHALL keep targeting `paragraph("B")`
+- **AND** recompilation SHALL emit a deprecation warning that names the explicit paragraph-index replacement
+
+#### Scenario: Deprecated body-child insert names explicit replacement
+
+- **WHEN** existing caller code invokes an ambiguous body-child insertion API with `at: 1`
+- **THEN** recompilation SHALL emit a deprecation warning that names the explicit body-child replacement
+- **AND** the minor-release runtime behavior SHALL keep inserting at body child index `1`
+
+### Requirement: Recursive paragraph reads are not mutation indexes
+
+Recursive paragraph read APIs such as `getParagraphs()` SHALL document that their returned order is a read-only content order and SHALL NOT be used as an index source for top-level body mutation APIs.
+
+#### Scenario: Recursive read ordinal differs from body child index
+
+- **WHEN** `getParagraphs()` returns `[paragraph("A"), table-cell paragraph("T"), paragraph("B")]` for a body containing `[paragraph("A"), table(cell paragraph "T"), paragraph("B")]`
+- **THEN** recursive read ordinal `2` SHALL NOT be documented as the mutation index for `paragraph("B")`
+- **AND** callers that need to mutate `paragraph("B")` SHALL use an explicit top-level paragraph-index API or an explicit body-child-index API
diff --git a/openspec/changes/ooxml-paragraph-index-semantics/tasks.md b/openspec/changes/ooxml-paragraph-index-semantics/tasks.md
new file mode 100644
index 0000000..259d310
--- /dev/null
+++ b/openspec/changes/ooxml-paragraph-index-semantics/tasks.md
@@ -0,0 +1,23 @@
+## 1. API Surface
+
+- [ ] 1.1 Audit `WordDocument` integer mutators and list every body-child, top-level paragraph, and recursive paragraph index entry point.
+- [ ] 1.2 Implement shared validation helpers for body child indexes and top-level paragraph indexes in `Document.swift`.
+- [ ] 1.3 Implement explicit body-child APIs so Body-child mutation APIs expose body-child index semantics.
+- [ ] 1.4 Implement explicit paragraph-index APIs so Top-level paragraph mutation APIs expose paragraph-index semantics.
+
+## 2. Deprecation and Documentation
+
+- [ ] 2.1 Apply the design decision Use explicit labels instead of changing same-signature semantics immediately by preserving existing ambiguous method behavior.
+- [ ] 2.2 Apply the design decision Introduce body-child APIs and preserve ambiguous methods during one minor release by adding deprecation messages that name body-child replacements.
+- [ ] 2.3 Apply the design decision Introduce paragraph-index APIs for paragraph-only operations by adding deprecation messages that name paragraph-index replacements.
+- [ ] 2.4 Ensure Ambiguous legacy index APIs preserve behavior during deprecation with source-level warning tests.
+- [ ] 2.5 Apply the design decision Document recursive paragraph reads as non-index-compatible by updating `getParagraphs()` documentation.
+- [ ] 2.6 Ensure Recursive paragraph reads are not mutation indexes in public documentation and migration notes.
+
+## 3. Tests and Downstream Migration
+
+- [ ] 3.1 Add tests for body-child insert/update/delete behavior around tables and content controls.
+- [ ] 3.2 Add tests for top-level paragraph mutation behavior around tables and nested paragraphs.
+- [ ] 3.3 Add tests proving deprecated ambiguous APIs keep minor-release runtime behavior.
+- [ ] 3.4 Update `che-word-mcp` callers to use explicit body-child or paragraph-index APIs.
+- [ ] 3.5 Run full `swift test` in `packages/ooxml-swift` and the relevant `che-word-mcp` test suite.
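
Reviewer sketch (not part of the patch): the three index spaces described in design.md can be illustrated with a small stand-alone model. Everything here is hypothetical and chosen for illustration only: `BodyChild`, `Body`, `IndexError`, `recursiveParagraphs()`, and `bodyChildIndex(forTopLevelParagraphIndex:)` are not the real `ooxml-swift` types or helpers, and the two mutator names merely mirror the labels proposed in the design. It is a minimal sketch under those assumptions, not the intended implementation.

```swift
// Hypothetical stand-in for a document body; NOT the real ooxml-swift model.
enum BodyChild: Equatable {
    case paragraph(String)
    case table([String])   // texts of the paragraphs nested in table cells
}

// Hypothetical error type; the design itself mentions `WordError.invalidIndex`.
enum IndexError: Error, Equatable {
    case outOfRange(Int)
    case notAParagraph(Int)
}

struct Body {
    var children: [BodyChild]

    /// Recursive read order, analogous to what the design attributes to `getParagraphs()`.
    func recursiveParagraphs() -> [String] {
        children.flatMap { child -> [String] in
            switch child {
            case .paragraph(let text): return [text]
            case .table(let cellTexts): return cellTexts
            }
        }
    }

    /// Shared helper: maps the nth top-level paragraph to its position in `children`.
    func bodyChildIndex(forTopLevelParagraphIndex n: Int) throws -> Int {
        let paragraphPositions = children.indices.filter { index in
            if case .paragraph = children[index] { return true }
            return false
        }
        guard paragraphPositions.indices.contains(n) else {
            throw IndexError.outOfRange(n)
        }
        return paragraphPositions[n]
    }

    /// Body-child semantics: refuses to touch a non-paragraph target.
    mutating func updateParagraph(atBodyChildIndex i: Int, text: String) throws {
        guard children.indices.contains(i) else { throw IndexError.outOfRange(i) }
        guard case .paragraph = children[i] else { throw IndexError.notAParagraph(i) }
        children[i] = .paragraph(text)
    }

    /// Paragraph-index semantics: counts top-level paragraphs only.
    mutating func updateTopLevelParagraph(atParagraphIndex n: Int, text: String) throws {
        let i = try bodyChildIndex(forTopLevelParagraphIndex: n)
        children[i] = .paragraph(text)
    }
}

// "B" sits at recursive ordinal 3, body child index 2, and top-level paragraph index 1.
var body = Body(children: [.paragraph("A"), .table(["T1", "T2"]), .paragraph("B")])
print(body.recursiveParagraphs())
do {
    try body.updateParagraph(atBodyChildIndex: 1, text: "X")   // index 1 is the table
} catch {
    print("refused: \(error)")
}
```

With a two-cell table, the same paragraph has three different integers depending on the index space, which is the confusion the explicit labels are meant to surface at the call site.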