[Stack 21/27] Fix K-means k divergence: preserve vote-encounter row order by jucor · Pull Request #2453 · compdemocracy/polis

jucor · 2026-03-17T20:38:15Z

Summary

Stacked on #2452 (Fix D15: match Clojure moderation handling (zero out columns, don't remove)). Please review and merge #2452 first.

Fix K-means k divergence between Python and Clojure by preserving vote-encounter order for participant rows in the rating matrix
Python was using natsorted() (PID-numeric order) while Clojure's NamedMatrix preserves insertion order — different row ordering cascades into different first-k-distinct initialization seeds for group-level k-means
On vw: Python picked k=4 (wrong), Clojure picks k=2 — now both pick k=2 with identical cluster memberships

Investigation findings

The divergence chain: rating_mat row order → PCA projection order → base-cluster ID assignment → group k-means first-k-distinct init → different local optima → different silhouette landscape → different k.
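The last links of this chain can be illustrated with a minimal sketch of first-k-distinct seeding. The helper `first_k_distinct`, the PIDs, and the 2D projections below are hypothetical stand-ins, not the Delphi implementation; the point is only that the same set of points yields different seeds when iterated in a different order.

```python
def first_k_distinct(points, k):
    """Return the first k distinct points, scanning in iteration order."""
    seeds = []
    for point in points:
        if point not in seeds:
            seeds.append(point)
        if len(seeds) == k:
            break
    return seeds


# Hypothetical PID -> projected coordinates (illustrative values only).
projection = {
    7: (0.9, 0.1),
    2: (-0.8, 0.2),
    5: (0.9, 0.1),   # duplicate of PID 7's point, skipped by the seeding
    1: (0.1, -0.7),
}

encounter_order = [7, 2, 5, 1]          # order of first appearance in the votes
pid_order = sorted(encounter_order)     # stand-in for PID-numeric ordering

seeds_encounter = first_k_distinct([projection[p] for p in encounter_order], 2)
seeds_pid = first_k_distinct([projection[p] for p in pid_order], 2)

print(seeds_encounter)  # [(0.9, 0.1), (-0.8, 0.2)]
print(seeds_pid)        # [(0.1, -0.7), (-0.8, 0.2)]
```

Because k-means only converges to a local optimum, different seeds can end in different clusterings, which is how a pure ordering difference can surface as a different silhouette-selected k.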

PCA components are identical (cosine similarity = 1.0), silhouette implementation matches, k-means algorithm matches — only the data ORDER feeding first-k-distinct differed.
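One way to see why the components can agree while the ordering differs: the column-centered scatter matrix that PCA diagonalizes is a sum over rows, so permuting rows leaves it unchanged. The sketch below assumes a small dense matrix with no missing votes (the real pipeline handles NaN entries) and uses a hypothetical helper `scatter_matrix`; exact `Fraction` arithmetic avoids floating-point summation-order noise.

```python
from fractions import Fraction


def scatter_matrix(rows):
    """Column-centered scatter matrix X_c^T X_c of a list-of-rows matrix."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    means = [Fraction(sum(r[j] for r in rows), n_rows) for j in range(n_cols)]
    centered = [[r[j] - means[j] for j in range(n_cols)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) for j in range(n_cols)]
        for i in range(n_cols)
    ]


# Illustrative votes: one row per participant, one column per comment.
votes_encounter = [[1, -1, 0], [-1, 1, 1], [1, 1, -1], [0, -1, 1]]
# The same rows in a different (e.g. PID-sorted) order.
votes_reordered = [votes_encounter[i] for i in (2, 0, 3, 1)]

same_scatter = scatter_matrix(votes_encounter) == scatter_matrix(votes_reordered)
print(same_scatter)  # True
```

The eigenvectors are therefore the same; what changes is the order in which the projected participants are handed to the clustering step.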

Changes

conversation.py: update_votes() preserves vote-encounter order for participant rows instead of natsorted()
conversation.py: _apply_moderation() preserves row order with list comprehension
Column (comment ID) ordering remains natsorted — doesn't affect clustering
Re-recorded vw cold-start blob and golden snapshots
Updated ordering tests, removed test_group_clustering xfail
Added scripts/investigate_k_divergence.py diagnostic tool
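As a rough illustration of the order-preserving moderation filter mentioned above, a list comprehension over the existing row order keeps surviving participants in place. The function name and arguments are hypothetical and do not reflect the actual `_apply_moderation()` signature.

```python
def filter_rows_preserving_order(row_ids, moderated_out):
    """Drop moderated participant IDs without re-sorting the remainder."""
    moderated = set(moderated_out)  # O(1) membership checks
    return [pid for pid in row_ids if pid not in moderated]


rows_in_encounter_order = [7, 2, 5, 1]
kept = filter_rows_preserving_order(rows_in_encounter_order, {2})
print(kept)  # [7, 5, 1]
```

Sorting the result would give `[1, 5, 7]` instead, which is the kind of reordering this PR removes for participant rows.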

Cold-start blob results

Dataset	Clojure k	Python k	Match
vw	2	2	exact (sizes [50,17])
biodiversity	2	2	exact (sizes [81,19])
bg2018	2	2	close ([51,49] vs [52,48])
FLI	2	3	inherent PCA divergence (94.5% NaN, sil gap 0.001)

Test plan

All 297 tests pass (0 failures, 58 xfailed)
vw cold-start: k=2 exact match with Clojure blob
biodiversity cold-start: k=2 exact match
Ordering tests updated to expect encounter order
Re-record private dataset golden snapshots after stack rebase

🤖 Generated with Claude Code

- Plan: replaced investigation section with resolved findings, updated checklist (K-inv DONE), added PR #2453 to cross-reference table
- Journal: added session 12 entry with investigation methodology, root cause (natsorted row ordering), fix, and cold-start blob results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR fixes cold-start K-means k divergence between Python and Clojure by aligning the participant (row) ordering of the rating matrix with Clojure’s insertion/vote-encounter order, ensuring downstream clustering initialization matches.

Changes:

Preserve participant row encounter order in Conversation.update_votes() and when filtering moderated participants in _apply_moderation().
Update ordering-related unit tests to expect encounter-ordered participant rows while keeping comment columns natsorted.
Update Clojure-regression clustering test behavior (remove broad xfail; xfail incremental blobs only) and re-record the vw cold-start blob/goldens; add investigation documentation.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.

File	Description
`delphi/polismath/conversation/conversation.py`	Preserves participant row order based on vote stream; keeps columns `natsorted`; moderation row filtering now order-preserving.
`delphi/tests/test_conversation.py`	Adjusts ordering expectations/tests for participant encounter order (columns still `natsorted`).
`delphi/tests/test_legacy_clojure_regression.py`	Removes previous xfail and conditionally xfails clustering comparison for incremental blobs.
`delphi/real_data/r6vbnhffkxbd7ifmfbdrd-vw/r6vbnhffkxbd7ifmfbdrd_math_blob_cold_start.json`	Re-recorded vw cold-start blob/golden outputs (including ordering-sensitive downstream values).
`delphi/docs/PLAN_DISCREPANCY_FIXES.md`	Marks K-divergence row-order fix as done and summarizes results.
`delphi/docs/HANDOFF_K_DIVERGENCE_INVESTIGATION.md`	Adds investigation write-up documenting root cause and resolution.
`delphi/docs/CLJ-PARITY-FIXES-JOURNAL.md`	Journals the investigation and fix details/results.

delphi/polismath/conversation/conversation.py

+        # Row order: preserve first-appearance order from votes.
+        #
+        # Clojure builds the rating matrix incrementally — each new participant
+        # gets a row appended in the order they first appear in the vote stream
+        # (conversation.clj, named_matrix.clj: NamedMatrix preserves insertion
+        # order via IndexHash backed by java.util.Vector). The base-cluster IDs
+        # are assigned by map-indexed on this row order, so the order directly
+        # determines group-level k-means initialization via first-k-distinct.
+        #
+        # Using natsort (PID-numeric order) instead would change the k-means
+        # seed points and produce different silhouette scores / different k.
+        # See delphi/docs/HANDOFF_K_DIVERGENCE_INVESTIGATION.md for the full
+        # analysis showing this is the root cause of k divergence on vw.
+        new_rows_ordered = []
+        for pid, _, _ in vote_updates:
+            if pid in new_rows and pid not in existing_rows_set:
+                existing_rows_set.add(pid)
+                new_rows_ordered.append(pid)
+        all_rows = list(existing_rows) + new_rows_ordered
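The hunk above depends on variables defined elsewhere in `update_votes()`. A self-contained sketch of the same idea follows; the function name is hypothetical, and the `(pid, comment_id, vote)` meaning of each tuple is an assumption based on the three-element unpacking in the diff.

```python
def append_new_rows_in_encounter_order(existing_rows, vote_updates):
    """Return existing rows followed by new PIDs in first-appearance order.

    vote_updates is assumed to be an iterable of (pid, comment_id, vote).
    """
    seen = set(existing_rows)
    ordered = list(existing_rows)
    for pid, _comment_id, _vote in vote_updates:
        if pid not in seen:
            seen.add(pid)
            ordered.append(pid)
    return ordered


updates = [(10, "c1", 1), (2, "c1", -1), (10, "c2", 1), (3, "c2", 0)]
all_rows = append_new_rows_in_encounter_order([3], updates)
print(all_rows)  # [3, 10, 2]
```

Existing rows keep their positions, repeated votes by the same participant do not move them, and PID 10 stays ahead of PID 2 because it voted first.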


delphi/tests/test_legacy_clojure_regression.py

+        # Incremental blobs are progressive snapshots — in-conv sets differ
+        # from single-shot computation, so clustering comparison is not valid.
+        if blob_type == 'incremental':
+            pytest.xfail("Incremental blobs have different in-conv from single-shot")


delphi/tests/test_conversation.py

    ], ids=lambda test_desc, *args: test_desc if isinstance(test_desc, str) else str(test_desc))
-    def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_sorted, expected_comment_types, expected_comments_sorted):
-        """Test natural sorting with homogeneous ID types (all same type).
+    def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):


github-actions · 2026-03-30T14:01:21Z

Delphi Coverage Report

File	Stmts	Miss	Cover
__init__.py	2	0	100%
benchmarks/bench_pca.py	76	76	0%
benchmarks/bench_repness.py	81	81	0%
benchmarks/bench_update_votes.py	38	38	0%
benchmarks/benchmark_utils.py	34	34	0%
components/__init__.py	1	0	100%
components/config.py	165	133	19%
conversation/__init__.py	2	0	100%
conversation/conversation.py	1114	320	71%
conversation/manager.py	131	42	68%
database/__init__.py	1	0	100%
database/dynamodb.py	387	234	40%
database/postgres.py	305	205	33%
pca_kmeans_rep/__init__.py	5	0	100%
pca_kmeans_rep/clusters.py	257	22	91%
pca_kmeans_rep/corr.py	98	17	83%
pca_kmeans_rep/pca.py	52	16	69%
pca_kmeans_rep/repness.py	312	34	89%
regression/__init__.py	4	0	100%
regression/clojure_comparer.py	188	20	89%
regression/comparer.py	887	720	19%
regression/datasets.py	135	27	80%
regression/recorder.py	36	27	25%
regression/utils.py	138	87	37%
run_math_pipeline.py	260	114	56%
umap_narrative/500_generate_embedding_umap_cluster.py	210	109	48%
umap_narrative/501_calculate_comment_extremity.py	112	53	53%
umap_narrative/502_calculate_priorities.py	135	135	0%
umap_narrative/700_datamapplot_for_layer.py	502	502	0%
umap_narrative/701_static_datamapplot_for_layer.py	310	310	0%
umap_narrative/702_consensus_divisive_datamapplot.py	432	432	0%
umap_narrative/801_narrative_report_batch.py	785	785	0%
umap_narrative/802_process_batch_results.py	265	265	0%
umap_narrative/803_check_batch_status.py	175	175	0%
umap_narrative/llm_factory_constructor/__init__.py	2	2	0%
umap_narrative/llm_factory_constructor/model_provider.py	157	157	0%
umap_narrative/polismath_commentgraph/__init__.py	1	0	100%
umap_narrative/polismath_commentgraph/cli.py	270	270	0%
umap_narrative/polismath_commentgraph/core/__init__.py	3	3	0%
umap_narrative/polismath_commentgraph/core/clustering.py	108	108	0%
umap_narrative/polismath_commentgraph/core/embedding.py	104	104	0%
umap_narrative/polismath_commentgraph/lambda_handler.py	219	219	0%
umap_narrative/polismath_commentgraph/schemas/__init__.py	2	0	100%
umap_narrative/polismath_commentgraph/schemas/dynamo_models.py	160	9	94%
umap_narrative/polismath_commentgraph/tests/conftest.py	17	17	0%
umap_narrative/polismath_commentgraph/tests/test_clustering.py	74	74	0%
umap_narrative/polismath_commentgraph/tests/test_embedding.py	55	55	0%
umap_narrative/polismath_commentgraph/tests/test_storage.py	87	87	0%
umap_narrative/polismath_commentgraph/utils/__init__.py	3	0	100%
umap_narrative/polismath_commentgraph/utils/converter.py	283	237	16%
umap_narrative/polismath_commentgraph/utils/group_data.py	354	336	5%
umap_narrative/polismath_commentgraph/utils/storage.py	584	518	11%
umap_narrative/reset_conversation.py	159	50	69%
umap_narrative/run_pipeline.py	453	312	31%
utils/general.py	62	41	34%
Total	10792	7612	29%

Copilot

Pull request overview

Copilot reviewed 7 out of 9 changed files in this pull request and generated 1 comment.

Copilot · 2026-03-30T16:29:39Z

delphi/tests/test_conversation.py

    ], ids=lambda test_desc, *args: test_desc if isinstance(test_desc, str) else str(test_desc))
-    def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_sorted, expected_comment_types, expected_comments_sorted):
-        """Test natural sorting with homogeneous ID types (all same type).
+    def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):


This test now validates encounter-order preservation rather than “natural sorting”, but the function name still refers to natural sorting. Renaming the test to reflect the new behavior would make failures easier to interpret.

Suggested change

-    def test_natural_sorting_homogeneous_types(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):
+    def test_homogeneous_types_encounter_order_and_natural_sorting(self, test_desc, ptpt_ids, comment_ids, expected_ptpt_types, expected_ptpts_ordered, expected_comment_types, expected_comments_sorted):

…nt rows

Root cause: Python's natsorted() sorted rating matrix rows by PID, while Clojure's NamedMatrix preserves vote-encounter order (insertion order via java.util.Vector). Different row ordering cascades through base-cluster ID assignment into group-level k-means first-k-distinct initialization, producing different local optima and different silhouette landscapes.

On vw: PID-sorted order → k=4 (sil=0.508), encounter order → k=2 (sil=0.487). Clojure blob has k=2. After fix, Python also picks k=2.

Changes:
- update_votes(): track first-appearance order from vote_updates, append new PIDs in encounter order instead of natsorted
- _apply_moderation(): preserve raw_rating_mat row order with list comprehension instead of natsorted
- Column (comment ID) ordering remains natsorted; column permutation doesn't affect PCA eigenvalues/vectors

Results on cold-start blobs:
- vw: k=2 exact match (was k=4), sizes [50,17] exact
- biodiversity: k=2 exact match, sizes [81,19] exact
- bg2018: k=2 match, sizes close ([52,48] vs [51,49])
- FLI: k=3 vs k=2; inherent PCA divergence (94.5% NaN sparsity, silhouette gap 0.001), not fixable without replicating power iteration

Also: re-recorded vw cold-start blob, golden snapshots, updated ordering tests, removed group_clustering xfail, added investigation script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jucor · 2026-03-30T22:54:50Z

Superseded by spr-managed PR stack. See the new stack starting at #2508.

jucor changed the title ~~Fix K-means k divergence: preserve vote-encounter row order~~ [Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order Mar 17, 2026

jucor mentioned this pull request Mar 17, 2026

[Stack 20/27] Fix D15: match Clojure moderation handling (zero out columns, don't remove) #2452

Closed

4 tasks

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from d4e2154 to decac1a Compare March 17, 2026 20:51

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 5438582 to afcda6d Compare March 18, 2026 18:23

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from decac1a to 7611c85 Compare March 18, 2026 19:02

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 59ea651 to 084551e Compare March 18, 2026 19:02

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 7611c85 to f51d33f Compare March 18, 2026 19:13

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch 2 times, most recently from 49e8745 to 4def564 Compare March 19, 2026 10:23

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from f51d33f to 19a64ef Compare March 19, 2026 10:23

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 4def564 to 9a34efe Compare March 19, 2026 10:49

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 19a64ef to 29283da Compare March 19, 2026 10:49

jucor changed the title ~~[Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order~~ [Stack 18/24] Fix K-means k divergence: preserve vote-encounter row order Mar 19, 2026

jucor marked this pull request as ready for review March 19, 2026 12:08

jucor requested a review from Copilot March 19, 2026 12:11

Copilot AI reviewed Mar 19, 2026

View reviewed changes

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 29283da to 9149555 Compare March 19, 2026 12:33

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from e4d490a to d372197 Compare March 19, 2026 12:33

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 9149555 to 2b015fd Compare March 19, 2026 14:52

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from d372197 to e10d972 Compare March 19, 2026 14:52

jucor changed the title ~~[Stack 18/24] Fix K-means k divergence: preserve vote-encounter row order~~ [Stack 19/25] Fix K-means k divergence: preserve vote-encounter row order Mar 19, 2026

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 2b015fd to 176da37 Compare March 23, 2026 15:34

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from e10d972 to 94b9ebd Compare March 23, 2026 15:35

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 176da37 to 13fedd6 Compare March 23, 2026 15:41

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from e07bd52 to baeccff Compare March 24, 2026 09:04

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 41a7896 to 7218d03 Compare March 24, 2026 09:04

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 2 times, most recently from 42db748 to c4d3382 Compare March 24, 2026 11:20

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 7218d03 to 7bbf0f2 Compare March 24, 2026 11:20

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 3 times, most recently from 4e68f68 to 7313bc0 Compare March 24, 2026 11:35

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 7bbf0f2 to 90368ee Compare March 24, 2026 11:38

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 7313bc0 to d8d3f24 Compare March 24, 2026 11:46

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch 2 times, most recently from 60965ae to 57b6c56 Compare March 26, 2026 21:24

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch 2 times, most recently from 2ce0b36 to 19e36d8 Compare March 27, 2026 01:15

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 57b6c56 to 4932faa Compare March 27, 2026 01:15

jucor force-pushed the jc/clj-parity-d15-moderation-handling-zeros-vs-removes branch from 19e36d8 to 42a795a Compare March 27, 2026 01:53

jucor force-pushed the jc/clj-parity-kmeans-k-divergence branch from 4932faa to 1a66f28 Compare March 27, 2026 01:53

Copilot AI reviewed Mar 30, 2026

View reviewed changes

jucor and others added 5 commits March 30, 2026 18:04

Remove investigation script (one-off diagnostic, not production code)

bac250d

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rename k-divergence doc: investigation record, not a handoff

f391ff0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Update references to renamed investigation doc

acb36f0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jucor mentioned this pull request Mar 30, 2026

IGNORE -- crash from spr #2505

Closed

4 tasks

jucor wants to merge 5 commits into jc/clj-parity-d15-moderation-handling-zeros-vs-removes from jc/clj-parity-kmeans-k-divergence
