[Stack 12/17] Fix D5: match Clojure prop_test formula (Wilson-score-like with +1 pseudocount) by jucor · Pull Request #2519 · compdemocracy/polis

jucor · 2026-03-30T22:25:18Z

Summary

Replace Python's standard one-proportion z-test prop_test(p, n, p0) with
Clojure's Wilson-score-like formula prop_test(succ, n) from stats.clj:10-15:

2 * sqrt(n+1) * ((succ+1)/(n+1) - 0.5)

The Clojure formula has a built-in +1 pseudocount (Laplace smoothing / Beta(1,1)
prior) that regularizes extreme values for small Polis groups. This is separate
from the PSEUDO_COUNT=2.0 used for pa/pd estimation (Beta(2,2) prior):

pa = (na + 1) / (ns + 2) — Beta(2,2) prior for probability estimation
pat = 2 * sqrt(ns+1) * ((na+1)/(ns+1) - 0.5) — Beta(1,1) prior for significance testing
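
A minimal sketch of the two formulas side by side, written directly from the expressions above rather than copied from repness.py. The helper name `estimate_pa` is hypothetical; `prop_test(succ, n)` follows the signature named in this PR, but the body here is only an illustration.

```python
import math


def estimate_pa(na: int, ns: int) -> float:
    """Smoothed agree probability: (na + 1) / (ns + 2). Hypothetical helper name."""
    return (na + 1) / (ns + 2)


def prop_test(succ: int, n: int) -> float:
    """Clojure-style proportion test: 2 * sqrt(n+1) * ((succ+1)/(n+1) - 0.5)."""
    return 2 * math.sqrt(n + 1) * ((succ + 1) / (n + 1) - 0.5)


# A group of 3 voters who all agreed (na = 3, ns = 3).
na, ns = 3, 3
pa = estimate_pa(na, ns)   # (3 + 1) / (3 + 2)
pat = prop_test(na, ns)    # 2 * sqrt(4) * (4/4 - 0.5)
print(pa, pat)
```

The two pseudocounts are applied independently: `pa` is computed from the raw counts, and `pat` is computed from the same raw counts, not from `pa`.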

What changed in the output: the pat and pdt values (proportion-test z-scores)
and the downstream agree_metric / disagree_metric values. The z-scores are
now slightly different because the new formula scales by sqrt(n+1) instead of
sqrt(n) and tests the proportion (succ+1)/(n+1) instead of the smoothed
estimate pa = (na+1)/(ns+2).
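
To make the size of the shift concrete, the sketch below compares the two calls for one set of counts. It assumes the previous `prop_test(p, n, p0)` was the textbook one-proportion z-test, (p - p0) / sqrt(p0 * (1 - p0) / n); the names `prop_test_old` and `prop_test_new` are hypothetical and exist only for this comparison.

```python
import math


def prop_test_old(p: float, n: int, p0: float = 0.5) -> float:
    """Textbook one-proportion z-test (assumed form of the previous code)."""
    return (p - p0) / math.sqrt(p0 * (1 - p0) / n)


def prop_test_new(succ: int, n: int) -> float:
    """Clojure formula from this PR: 2 * sqrt(n+1) * ((succ+1)/(n+1) - 0.5)."""
    return 2 * math.sqrt(n + 1) * ((succ + 1) / (n + 1) - 0.5)


# 30 agrees out of 40 votes in a group.
na, ns = 30, 40
pa = (na + 1) / (ns + 2)            # smoothed estimate passed to the old call

old_z = prop_test_old(pa, ns, 0.5)  # old call site: (pa, ns, 0.5)
new_z = prop_test_new(na, ns)       # new call site: raw counts (na, ns)

print(f"old z = {old_z:.4f}, new z = {new_z:.4f}")
```

With p0 = 0.5 the old form reduces to 2 * sqrt(n) * (p - 0.5), which is why the two expressions differ only in the sqrt argument and in the proportion being tested.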

Changes

repness.py: prop_test(p, n, p0) → prop_test(succ, n) with Clojure formula
repness.py: prop_test_vectorized(p, n, p0) → prop_test_vectorized(succ, n)
repness.py: Callers updated to pass raw counts (na, ns) instead of (pa, ns, 0.5)
test_discrepancy_fixes.py: Removed xfail from D5 formula test, added 8 test cases + edge case
test_repness_unit.py, test_old_format_repness.py: Updated for new signature
Golden snapshots re-recorded for all datasets

Test plan

D5 formula tests pass (8 input pairs + edge cases)
D5 Clojure blob consistency check passes (all datasets)
Full test suite passes (public + private, 19/19 regression tests)
Only pre-existing failure: pakistan-incremental D2 (unrelated)
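
The kind of check the D5 formula tests perform can be sketched as follows. The input pairs below are illustrative values worked out by hand from the formula; they are not the 8 pairs used in test_discrepancy_fixes.py, which this description does not list.

```python
import math


def prop_test(succ: int, n: int) -> float:
    """Clojure formula from this PR: 2 * sqrt(n+1) * ((succ+1)/(n+1) - 0.5)."""
    return 2 * math.sqrt(n + 1) * ((succ + 1) / (n + 1) - 0.5)


# (succ, n, expected) triples computed by hand from the formula.
ILLUSTRATIVE_CASES = [
    (0, 0, 1.0),    # edge case: no votes, the +1 pseudocount avoids division by zero
    (1, 3, 0.0),    # (1+1)/(3+1) = 0.5, so the score is exactly zero
    (3, 3, 2.0),    # everyone agrees
    (0, 3, -1.0),   # nobody agrees
]

for succ, n, expected in ILLUSTRATIVE_CASES:
    assert math.isclose(prop_test(succ, n), expected, abs_tol=1e-12)
```

Note that the score is not symmetric around zero: 3 of 3 gives 2.0 while 0 of 3 gives -1.0, because the pseudocount is added to the successes only.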

🤖 Generated with Claude Code

Squashed commits

RED: add D5 blob injection test (prop_test vs Clojure p-test values)
Fix D5: match Clojure prop_test formula (Wilson-score-like with +1 pseudocount)
Update plan and journal: mark D5 as done
Plan: add D5 PR number and stack position to cross-reference

commit-id:48b77ba3

Stack:

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.


github-actions · 2026-03-31T01:00:44Z

Delphi Coverage Report

File	Stmts	Miss	Cover
__init__.py	2	0	100%
benchmarks/bench_pca.py	76	76	0%
benchmarks/bench_repness.py	81	81	0%
benchmarks/bench_update_votes.py	38	38	0%
benchmarks/benchmark_utils.py	34	34	0%
components/__init__.py	1	0	100%
components/config.py	165	133	19%
conversation/__init__.py	2	0	100%
conversation/conversation.py	1107	320	71%
conversation/manager.py	131	42	68%
database/__init__.py	1	0	100%
database/dynamodb.py	387	234	40%
database/postgres.py	305	205	33%
pca_kmeans_rep/__init__.py	5	0	100%
pca_kmeans_rep/clusters.py	257	22	91%
pca_kmeans_rep/corr.py	98	17	83%
pca_kmeans_rep/pca.py	52	16	69%
pca_kmeans_rep/repness.py	297	38	87%
regression/__init__.py	4	0	100%
regression/clojure_comparer.py	188	17	91%
regression/comparer.py	887	720	19%
regression/datasets.py	135	27	80%
regression/recorder.py	36	27	25%
regression/utils.py	138	94	32%
run_math_pipeline.py	260	114	56%
umap_narrative/500_generate_embedding_umap_cluster.py	210	109	48%
umap_narrative/501_calculate_comment_extremity.py	112	53	53%
umap_narrative/502_calculate_priorities.py	135	135	0%
umap_narrative/700_datamapplot_for_layer.py	502	502	0%
umap_narrative/701_static_datamapplot_for_layer.py	310	310	0%
umap_narrative/702_consensus_divisive_datamapplot.py	432	432	0%
umap_narrative/801_narrative_report_batch.py	785	785	0%
umap_narrative/802_process_batch_results.py	265	265	0%
umap_narrative/803_check_batch_status.py	175	175	0%
umap_narrative/llm_factory_constructor/__init__.py	2	2	0%
umap_narrative/llm_factory_constructor/model_provider.py	157	157	0%
umap_narrative/polismath_commentgraph/__init__.py	1	0	100%
umap_narrative/polismath_commentgraph/cli.py	270	270	0%
umap_narrative/polismath_commentgraph/core/__init__.py	3	3	0%
umap_narrative/polismath_commentgraph/core/clustering.py	108	108	0%
umap_narrative/polismath_commentgraph/core/embedding.py	104	104	0%
umap_narrative/polismath_commentgraph/lambda_handler.py	219	219	0%
umap_narrative/polismath_commentgraph/schemas/__init__.py	2	0	100%
umap_narrative/polismath_commentgraph/schemas/dynamo_models.py	160	9	94%
umap_narrative/polismath_commentgraph/tests/conftest.py	17	17	0%
umap_narrative/polismath_commentgraph/tests/test_clustering.py	74	74	0%
umap_narrative/polismath_commentgraph/tests/test_embedding.py	55	55	0%
umap_narrative/polismath_commentgraph/tests/test_storage.py	87	87	0%
umap_narrative/polismath_commentgraph/utils/__init__.py	3	0	100%
umap_narrative/polismath_commentgraph/utils/converter.py	283	237	16%
umap_narrative/polismath_commentgraph/utils/group_data.py	354	336	5%
umap_narrative/polismath_commentgraph/utils/storage.py	584	518	11%
umap_narrative/reset_conversation.py	159	50	69%
umap_narrative/run_pipeline.py	453	312	31%
utils/general.py	62	41	34%
Total	10770	7620	29%

jucor changed the title ~~Fix D5: match Clojure prop_test formula (Wilson-score-like with +1 pseudocount)~~ [Stack 12/17] Fix D5: match Clojure prop_test formula (Wilson-score-like with +1 pseudocount) Mar 30, 2026

jucor force-pushed the spr/edge/48b77ba3 branch 2 times, most recently from cd39374 to a387b9e · March 30, 2026 22:47

jucor force-pushed the spr/edge/0194003d branch from 24de40d to add1343 · March 31, 2026 00:35

jucor force-pushed the spr/edge/48b77ba3 branch from a387b9e to 956e3a8 · March 31, 2026 00:35

jucor wants to merge 1 commit into spr/edge/0194003d from spr/edge/48b77ba3
