[Stack 21/27] Fix K-means k divergence: preserve vote-encounter row order#2453

Closed
jucor wants to merge 5 commits into jc/clj-parity-d15-moderation-handling-zeros-vs-removes from jc/clj-parity-kmeans-k-divergence

Conversation

@jucor
Collaborator

@jucor jucor commented Mar 17, 2026

Summary

Stacked on #2452 (Fix D15: match Clojure moderation handling (zero out columns, don't remove)). Please review and merge #2452 first.

  • Fix K-means k divergence between Python and Clojure by preserving vote-encounter order for participant rows in the rating matrix
  • Python used natsorted() (PID-numeric order) while Clojure's NamedMatrix preserves insertion order; the different row ordering cascades into different first-k-distinct initialization seeds for group-level k-means
  • On vw, Python picked k=4 (wrong) where Clojure picks k=2; with this fix both pick k=2 with identical cluster memberships

Investigation findings

The divergence chain: rating_mat row order → PCA projection order → base-cluster ID assignment → group k-means first-k-distinct init → different local optima → different silhouette landscape → different k.

PCA components are identical (cosine similarity = 1.0), silhouette implementation matches, k-means algorithm matches — only the data ORDER feeding first-k-distinct differed.
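
To make the last link of that chain concrete, here is a minimal sketch of how a first-k-distinct seeding rule depends on row order. The function body, the participant IDs, and the 2-D coordinates are illustrative assumptions, not the project's actual code or data; only the idea (take the first k distinct rows in matrix order as initial centers) comes from the description above.

```python
def first_k_distinct(rows, k):
    """Return the first k distinct rows, scanning in matrix order.

    Hypothetical stand-in for the seeding step described above.
    """
    seeds = []
    for row in rows:
        if row not in seeds:
            seeds.append(row)
        if len(seeds) == k:
            break
    return seeds


# Made-up 2-D projections keyed by participant ID (illustration only).
projections = {1: (0.0, 0.0), 2: (0.1, 0.0), 3: (5.0, 5.0), 4: (5.1, 5.0)}

encounter_order = [3, 1, 4, 2]          # order in which participants first voted
pid_sorted_order = sorted(projections)  # [1, 2, 3, 4], what a PID sort yields

seeds_encounter = first_k_distinct([projections[p] for p in encounter_order], 2)
seeds_sorted = first_k_distinct([projections[p] for p in pid_sorted_order], 2)

print(seeds_encounter)  # [(5.0, 5.0), (0.0, 0.0)]
print(seeds_sorted)     # [(0.0, 0.0), (0.1, 0.0)]
```

The same points in a different order give different starting centers, so k-means can settle in a different local optimum even though the data and the algorithm are unchanged.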

Changes

  • conversation.py: update_votes() preserves vote-encounter order for participant rows instead of natsorted()
  • conversation.py: _apply_moderation() preserves row order with list comprehension
  • Column (comment ID) ordering remains natsorted — doesn't affect clustering
  • Re-recorded vw cold-start blob and golden snapshots
  • Updated ordering tests, removed test_group_clustering xfail
  • Added scripts/investigate_k_divergence.py diagnostic tool
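
The two row-order changes listed above can be sketched in isolation as follows. The helper names are hypothetical, and the meaning of the second and third tuple fields (comment ID, vote value) is an assumption; the PR only shows that vote updates are 3-tuples whose first element is the participant ID.

```python
def merge_rows_in_encounter_order(existing_rows, vote_updates):
    """Append unseen participant IDs in the order they first appear in the votes."""
    seen = set(existing_rows)
    ordered = list(existing_rows)
    for pid, _comment_id, _vote in vote_updates:
        if pid not in seen:
            seen.add(pid)
            ordered.append(pid)
    return ordered


def drop_moderated_rows(rows, moderated_pids):
    """Filter out moderated participants without reordering the survivors."""
    moderated = set(moderated_pids)
    return [pid for pid in rows if pid not in moderated]


# Hypothetical vote stream: (participant ID, comment ID, vote).
votes = [(7, 0, 1), (2, 0, -1), (7, 1, 1), (10, 0, 1), (2, 1, 0)]

rows = merge_rows_in_encounter_order([], votes)
print(rows)                            # [7, 2, 10]
print(sorted(rows))                    # [2, 7, 10], what a PID-numeric sort gives
print(drop_moderated_rows(rows, [2]))  # [7, 10]
```

A list comprehension over the existing row list keeps whatever order the rows already had, which is why it is a natural fit for the moderation filter.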

Cold-start blob results

| Dataset | Clj k | Py k | Match |
|---|---|---|---|
| vw | 2 | 2 | exact (sizes [50,17]) |
| biodiversity | 2 | 2 | exact (sizes [81,19]) |
| bg2018 | 2 | 2 | close ([51,49] vs [52,48]) |
| FLI | 2 | 3 | inherent PCA divergence (94.5% NaN, sil gap 0.001) |
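
For readers unfamiliar with the "sil" figures, the sketch below computes the standard mean silhouette coefficient in plain Python. It follows the textbook definition and is not taken from the project's implementation; the points and labels are made up. When two candidate values of k score within about 0.001 of each other, as reported for FLI, presumably any small numerical difference upstream can flip which k scores highest.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette coefficient (textbook definition, needs >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Two well-separated pairs of points (illustrative data only).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # clusters match the geometry
bad = mean_silhouette(points, [0, 1, 0, 1])   # clusters cut across it
print(good > bad)  # True
```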

Test plan

  • All 297 tests pass (0 failures, 58 xfailed)
  • vw cold-start: k=2 exact match with Clojure blob
  • biodiversity cold-start: k=2 exact match
  • Ordering tests updated to expect encounter order
  • Re-record private dataset golden snapshots after stack rebase

🤖 Generated with Claude Code

@jucor jucor changed the title from "Fix K-means k divergence: preserve vote-encounter row order" to "[Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order" Mar 17, 2026
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from d4e2154 to decac1a Compare March 17, 2026 20:51
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 5438582 to afcda6d Compare March 18, 2026 18:23
jucor added a commit that referenced this pull request Mar 18, 2026
- Plan: replaced investigation section with resolved findings, updated
  checklist (K-inv DONE), added PR #2453 to cross-reference table
- Journal: added session 12 entry with investigation methodology,
  root cause (natsorted row ordering), fix, and cold-start blob results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from decac1a to 7611c85 Compare March 18, 2026 19:02
jucor added a commit that referenced this pull request Mar 18, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 59ea651 to 084551e Compare March 18, 2026 19:02
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 7611c85 to f51d33f Compare March 18, 2026 19:13
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch 2 times, most recently from 49e8745 to 4def564 Compare March 19, 2026 10:23
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from f51d33f to 19a64ef Compare March 19, 2026 10:23
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 4def564 to 9a34efe Compare March 19, 2026 10:49
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 19a64ef to 29283da Compare March 19, 2026 10:49
@jucor jucor changed the title from "[Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order" to "[Stack 18/24] Fix K-means k divergence: preserve vote-encounter row order" Mar 19, 2026
@jucor jucor marked this pull request as ready for review March 19, 2026 12:08
@jucor jucor requested a review from Copilot March 19, 2026 12:11
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes cold-start K-means k divergence between Python and Clojure by aligning the participant (row) ordering of the rating matrix with Clojure’s insertion/vote-encounter order, ensuring downstream clustering initialization matches.

Changes:

  • Preserve participant row encounter order in Conversation.update_votes() and when filtering moderated participants in _apply_moderation().
  • Update ordering-related unit tests to expect encounter-ordered participant rows while keeping comment columns natsorted.
  • Update Clojure-regression clustering test behavior (remove broad xfail; xfail incremental blobs only) and re-record the vw cold-start blob/goldens; add investigation documentation.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| delphi/polismath/conversation/conversation.py | Preserves participant row order based on vote stream; keeps columns natsorted; moderation row filtering now order-preserving. |
| delphi/tests/test_conversation.py | Adjusts ordering expectations/tests for participant encounter order (columns still natsorted). |
| delphi/tests/test_legacy_clojure_regression.py | Removes previous xfail and conditionally xfails clustering comparison for incremental blobs. |
| delphi/real_data/r6vbnhffkxbd7ifmfbdrd-vw/r6vbnhffkxbd7ifmfbdrd_math_blob_cold_start.json | Re-recorded vw cold-start blob/golden outputs (including ordering-sensitive downstream values). |
| delphi/docs/PLAN_DISCREPANCY_FIXES.md | Marks K-divergence row-order fix as done and summarizes results. |
| delphi/docs/HANDOFF_K_DIVERGENCE_INVESTIGATION.md | Adds investigation write-up documenting root cause and resolution. |
| delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md | Journals the investigation and fix details/results. |

Comment on lines +230 to +248
# Row order: preserve first-appearance order from votes.
#
# Clojure builds the rating matrix incrementally — each new participant
# gets a row appended in the order they first appear in the vote stream
# (conversation.clj, named_matrix.clj: NamedMatrix preserves insertion
# order via IndexHash backed by java.util.Vector). The base-cluster IDs
# are assigned by map-indexed on this row order, so the order directly
# determines group-level k-means initialization via first-k-distinct.
#
# Using natsort (PID-numeric order) instead would change the k-means
# seed points and produce different silhouette scores / different k.
# See delphi/docs/HANDOFF_K_DIVERGENCE_INVESTIGATION.md for the full
# analysis showing this is the root cause of k divergence on vw.
new_rows_ordered = []
for pid, _, _ in vote_updates:
    if pid in new_rows and pid not in existing_rows_set:
        existing_rows_set.add(pid)
        new_rows_ordered.append(pid)
all_rows = list(existing_rows) + new_rows_ordered

# Incremental blobs are progressive snapshots — in-conv sets differ
# from single-shot computation, so clustering comparison is not valid.
if blob_type == 'incremental':
    pytest.xfail("Incremental blobs have different in-conv from single-shot")

], ids=lambda test_desc, *args: test_desc if isinstance(test_desc, str) else str(test_desc))
def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_sorted, expected_comment_types, expected_comments_sorted):
"""Test natural sorting with homogeneous ID types (all same type).
def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 29283da to 9149555 Compare March 19, 2026 12:33
jucor added a commit that referenced this pull request Mar 19, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from e4d490a to d372197 Compare March 19, 2026 12:33
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 9149555 to 2b015fd Compare March 19, 2026 14:52
jucor added a commit that referenced this pull request Mar 19, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from d372197 to e10d972 Compare March 19, 2026 14:52
@jucor jucor changed the title from "[Stack 18/24] Fix K-means k divergence: preserve vote-encounter row order" to "[Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order" Mar 19, 2026
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 2b015fd to 176da37 Compare March 23, 2026 15:34
jucor added a commit that referenced this pull request Mar 23, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from e10d972 to 94b9ebd Compare March 23, 2026 15:35
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 176da37 to 13fedd6 Compare March 23, 2026 15:41
jucor added a commit that referenced this pull request Mar 23, 2026
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from e07bd52 to baeccff Compare March 24, 2026 09:04
jucor added a commit that referenced this pull request Mar 24, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 41a7896 to 7218d03 Compare March 24, 2026 09:04
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 2 times, most recently from 42db748 to c4d3382 Compare March 24, 2026 11:20
jucor added a commit that referenced this pull request Mar 24, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 7218d03 to 7bbf0f2 Compare March 24, 2026 11:20
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 3 times, most recently from 4e68f68 to 7313bc0 Compare March 24, 2026 11:35
jucor added a commit that referenced this pull request Mar 24, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 7bbf0f2 to 90368ee Compare March 24, 2026 11:38
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 7313bc0 to d8d3f24 Compare March 24, 2026 11:46
jucor added a commit that referenced this pull request Mar 24, 2026
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch 2 times, most recently from 60965ae to 57b6c56 Compare March 26, 2026 21:24
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 2 times, most recently from 2ce0b36 to 19e36d8 Compare March 27, 2026 01:15
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 57b6c56 to 4932faa Compare March 27, 2026 01:15
@jucor jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 19e36d8 to 42a795a Compare March 27, 2026 01:53
@jucor jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 4932faa to 1a66f28 Compare March 27, 2026 01:53
@github-actions

Delphi Coverage Report

| File | Stmts | Miss | Cover |
|---|---|---|---|
| __init__.py | 2 | 0 | 100% |
| benchmarks/bench_pca.py | 76 | 76 | 0% |
| benchmarks/bench_repness.py | 81 | 81 | 0% |
| benchmarks/bench_update_votes.py | 38 | 38 | 0% |
| benchmarks/benchmark_utils.py | 34 | 34 | 0% |
| components/__init__.py | 1 | 0 | 100% |
| components/config.py | 165 | 133 | 19% |
| conversation/__init__.py | 2 | 0 | 100% |
| conversation/conversation.py | 1114 | 320 | 71% |
| conversation/manager.py | 131 | 42 | 68% |
| database/__init__.py | 1 | 0 | 100% |
| database/dynamodb.py | 387 | 234 | 40% |
| database/postgres.py | 305 | 205 | 33% |
| pca_kmeans_rep/__init__.py | 5 | 0 | 100% |
| pca_kmeans_rep/clusters.py | 257 | 22 | 91% |
| pca_kmeans_rep/corr.py | 98 | 17 | 83% |
| pca_kmeans_rep/pca.py | 52 | 16 | 69% |
| pca_kmeans_rep/repness.py | 312 | 34 | 89% |
| regression/__init__.py | 4 | 0 | 100% |
| regression/clojure_comparer.py | 188 | 20 | 89% |
| regression/comparer.py | 887 | 720 | 19% |
| regression/datasets.py | 135 | 27 | 80% |
| regression/recorder.py | 36 | 27 | 25% |
| regression/utils.py | 138 | 87 | 37% |
| run_math_pipeline.py | 260 | 114 | 56% |
| umap_narrative/500_generate_embedding_umap_cluster.py | 210 | 109 | 48% |
| umap_narrative/501_calculate_comment_extremity.py | 112 | 53 | 53% |
| umap_narrative/502_calculate_priorities.py | 135 | 135 | 0% |
| umap_narrative/700_datamapplot_for_layer.py | 502 | 502 | 0% |
| umap_narrative/701_static_datamapplot_for_layer.py | 310 | 310 | 0% |
| umap_narrative/702_consensus_divisive_datamapplot.py | 432 | 432 | 0% |
| umap_narrative/801_narrative_report_batch.py | 785 | 785 | 0% |
| umap_narrative/802_process_batch_results.py | 265 | 265 | 0% |
| umap_narrative/803_check_batch_status.py | 175 | 175 | 0% |
| umap_narrative/llm_factory_constructor/__init__.py | 2 | 2 | 0% |
| umap_narrative/llm_factory_constructor/model_provider.py | 157 | 157 | 0% |
| umap_narrative/polismath_commentgraph/__init__.py | 1 | 0 | 100% |
| umap_narrative/polismath_commentgraph/cli.py | 270 | 270 | 0% |
| umap_narrative/polismath_commentgraph/core/__init__.py | 3 | 3 | 0% |
| umap_narrative/polismath_commentgraph/core/clustering.py | 108 | 108 | 0% |
| umap_narrative/polismath_commentgraph/core/embedding.py | 104 | 104 | 0% |
| umap_narrative/polismath_commentgraph/lambda_handler.py | 219 | 219 | 0% |
| umap_narrative/polismath_commentgraph/schemas/__init__.py | 2 | 0 | 100% |
| umap_narrative/polismath_commentgraph/schemas/dynamo_models.py | 160 | 9 | 94% |
| umap_narrative/polismath_commentgraph/tests/conftest.py | 17 | 17 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_clustering.py | 74 | 74 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_embedding.py | 55 | 55 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_storage.py | 87 | 87 | 0% |
| umap_narrative/polismath_commentgraph/utils/__init__.py | 3 | 0 | 100% |
| umap_narrative/polismath_commentgraph/utils/converter.py | 283 | 237 | 16% |
| umap_narrative/polismath_commentgraph/utils/group_data.py | 354 | 336 | 5% |
| umap_narrative/polismath_commentgraph/utils/storage.py | 584 | 518 | 11% |
| umap_narrative/reset_conversation.py | 159 | 50 | 69% |
| umap_narrative/run_pipeline.py | 453 | 312 | 31% |
| utils/general.py | 62 | 41 | 34% |
| Total | 10792 | 7612 | 29% |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 1 comment.


], ids=lambda test_desc, *args: test_desc if isinstance(test_desc, str) else str(test_desc))
def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_sorted, expected_comment_types, expected_comments_sorted):
"""Test natural sorting with homogeneous ID types (all same type).
def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):
Copilot AI Mar 30, 2026

This test now validates encounter-order preservation rather than “natural sorting”, but the function name still refers to natural sorting. Renaming the test to reflect the new behavior would make failures easier to interpret.

Suggested change
- def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):
+ def test_homogeneous_types_encounter_order_and_natural_sorting(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):

jucor and others added 5 commits March 30, 2026 18:04
…nt rows

Root cause: Python's natsorted() sorted rating matrix rows by PID, while
Clojure's NamedMatrix preserves vote-encounter order (insertion order via
java.util.Vector). Different row ordering cascades through base-cluster ID
assignment into group-level k-means first-k-distinct initialization,
producing different local optima and different silhouette landscapes.

On vw: PID-sorted order → k=4 (sil=0.508), encounter order → k=2 (sil=0.487).
Clojure blob has k=2. After fix, Python also picks k=2.

Changes:
- update_votes(): track first-appearance order from vote_updates, append new
  PIDs in encounter order instead of natsorted
- _apply_moderation(): preserve raw_rating_mat row order with list
  comprehension instead of natsorted
- Column (comment ID) ordering remains natsorted — column permutation doesn't
  affect PCA eigenvalues/vectors
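
The column-permutation claim can be checked with a short derivation. This is a sketch under stated assumptions: $X$ is the $n \times m$ column-centered rating matrix (centering each column commutes with reordering columns), $P$ is an $m \times m$ permutation matrix, and the eigendecomposition is exact. An iterative solver may still differ numerically, and eigenvector signs remain arbitrary.

```latex
\begin{align}
X' &= X P, \qquad P^\top P = I \\
C  &= \tfrac{1}{n-1}\, X^\top X, \qquad
C' = \tfrac{1}{n-1}\, X'^\top X' = P^\top C P \\
C v = \lambda v
   &\;\Longrightarrow\;
   C' (P^\top v) = P^\top C P P^\top v = P^\top C v = \lambda\, (P^\top v) \\
X' (P^\top v) &= X P P^\top v = X v
\end{align}
```

So the eigenvalues are unchanged, each eigenvector only has its entries reordered to match the new column order, and every participant's projection $Xv$ is identical. Permuting columns therefore cannot change which points seed k-means, whereas permuting rows can.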

Results on cold-start blobs:
- vw: k=2 exact match (was k=4), sizes [50,17] exact
- biodiversity: k=2 exact match, sizes [81,19] exact
- bg2018: k=2 match, sizes close ([52,48] vs [51,49])
- FLI: k=3 vs k=2 — inherent PCA divergence (94.5% NaN sparsity,
  silhouette gap 0.001), not fixable without replicating power iteration

Also: re-recorded vw cold-start blob, golden snapshots, updated ordering
tests, removed group_clustering xfail, added investigation script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Plan: replaced investigation section with resolved findings, updated
  checklist (K-inv DONE), added PR #2453 to cross-reference table
- Journal: added session 12 entry with investigation methodology,
  root cause (natsorted row ordering), fix, and cold-start blob results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jucor jucor mentioned this pull request Mar 30, 2026
4 tasks
@jucor
Collaborator Author

jucor commented Mar 30, 2026

Superseded by spr-managed PR stack. See the new stack starting at #2508.
