[Stack 7/17] Fix D4: pseudocount formula by jucor · Pull Request #2514 · compdemocracy/polis

jucor · 2026-03-30T22:25:12Z

Summary

Change PSEUDO_COUNT from 1.5 to 2.0, matching Clojure's Beta(2,2) prior
This changes probability smoothing from pa = (na + 0.75)/(ns + 1.5) to pa = (na + 1)/(ns + 2)
All pa/pd values now match Clojure's p-success exactly (verified on all datasets with Clojure blobs)
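
As a rough illustration of the two formulas above, the sketch below assumes the smoothing adds `PSEUDO_COUNT / 2` to the numerator and `PSEUDO_COUNT` to the denominator, which is what both expressions imply. The helper name `smoothed_probability` and the sample counts are hypothetical and are not taken from `repness.py`; `na` is presumably the number of agree votes and `ns` the number of votes seen. Note that `(na + 1)/(ns + 2)` coincides with the posterior mode under a Beta(2,2) prior (and with the posterior mean under a uniform Beta(1,1) prior).

```python
from fractions import Fraction

# Assumption: the smoothing adds PSEUDO_COUNT / 2 pseudo-votes to the numerator
# and PSEUDO_COUNT to the denominator, as the two formulas in the summary imply.
OLD_PSEUDO_COUNT = Fraction(3, 2)  # value before this PR (1.5)
NEW_PSEUDO_COUNT = Fraction(2)     # value after this PR (2.0)


def smoothed_probability(n_success, n_seen, pseudo_count):
    """Hypothetical helper: (n_success + pc/2) / (n_seen + pc), computed exactly."""
    pc = Fraction(pseudo_count)
    return (n_success + pc / 2) / (n_seen + pc)


# Example counts (made up): 3 agree votes out of 4 votes seen.
old_pa = smoothed_probability(3, 4, OLD_PSEUDO_COUNT)  # (3 + 0.75) / (4 + 1.5)
new_pa = smoothed_probability(3, 4, NEW_PSEUDO_COUNT)  # (3 + 1) / (4 + 2)

print("old pa:", old_pa, float(old_pa))
print("new pa:", new_pa, float(new_pa))
```

With no votes at all, both constants give 1/2; the two formulas only diverge once votes are observed, and the gap shrinks as `ns` grows.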

Changes

repness.py: PSEUDO_COUNT = 2.0 with updated comment
test_discrepancy_fixes.py: remove xfail from 3 D4 tests (constant check, pa values per dataset, synthetic)
test_repness_unit.py, test_old_format_repness.py: import PSEUDO_COUNT instead of hardcoding 1.5
simplified_repness_test.py: update hardcoded constant
Golden snapshots re-recorded for public datasets (vw, biodiversity)

Test plan

TDD red: 6 D4 tests fail before fix
TDD green: all 6 D4 tests pass after fix
Full public suite: 258 passed, 0 failures
Private datasets (--include-local): 60 passed, 0 failures (discrepancy tests)
Regression tests pass on public + FLI + bg2018

🤖 Generated with Claude Code

Squashed commits

Fix D4: PSEUDO_COUNT 1.5 → 2.0 to match Clojure's Beta(2,2) prior
Journal: add session 6 (D4 fix), update plan marking D4 done

commit-id:6ae3ee43

Stack:

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.


github-actions · 2026-03-31T00:43:01Z

Delphi Coverage Report

File	Stmts	Miss	Cover
__init__.py	2	0	100%
benchmarks/bench_pca.py	76	76	0%
benchmarks/bench_repness.py	81	81	0%
benchmarks/bench_update_votes.py	38	38	0%
benchmarks/benchmark_utils.py	34	34	0%
components/__init__.py	1	0	100%
components/config.py	165	133	19%
conversation/__init__.py	2	0	100%
conversation/conversation.py	1117	328	71%
conversation/manager.py	131	42	68%
database/__init__.py	1	0	100%
database/dynamodb.py	387	234	40%
database/postgres.py	305	205	33%
pca_kmeans_rep/__init__.py	5	0	100%
pca_kmeans_rep/clusters.py	257	22	91%
pca_kmeans_rep/corr.py	98	17	83%
pca_kmeans_rep/pca.py	52	16	69%
pca_kmeans_rep/repness.py	361	51	86%
pca_kmeans_rep/stats.py	107	22	79%
regression/__init__.py	4	0	100%
regression/clojure_comparer.py	188	17	91%
regression/comparer.py	887	720	19%
regression/datasets.py	135	27	80%
regression/recorder.py	36	27	25%
regression/utils.py	137	118	14%
run_math_pipeline.py	260	114	56%
umap_narrative/500_generate_embedding_umap_cluster.py	210	109	48%
umap_narrative/501_calculate_comment_extremity.py	112	54	52%
umap_narrative/502_calculate_priorities.py	135	135	0%
umap_narrative/700_datamapplot_for_layer.py	502	502	0%
umap_narrative/701_static_datamapplot_for_layer.py	310	310	0%
umap_narrative/702_consensus_divisive_datamapplot.py	432	432	0%
umap_narrative/801_narrative_report_batch.py	785	785	0%
umap_narrative/802_process_batch_results.py	265	265	0%
umap_narrative/803_check_batch_status.py	175	175	0%
umap_narrative/llm_factory_constructor/__init__.py	2	2	0%
umap_narrative/llm_factory_constructor/model_provider.py	157	157	0%
umap_narrative/polismath_commentgraph/__init__.py	1	0	100%
umap_narrative/polismath_commentgraph/cli.py	270	270	0%
umap_narrative/polismath_commentgraph/core/__init__.py	3	3	0%
umap_narrative/polismath_commentgraph/core/clustering.py	108	108	0%
umap_narrative/polismath_commentgraph/core/embedding.py	104	104	0%
umap_narrative/polismath_commentgraph/lambda_handler.py	219	219	0%
umap_narrative/polismath_commentgraph/schemas/__init__.py	2	0	100%
umap_narrative/polismath_commentgraph/schemas/dynamo_models.py	160	9	94%
umap_narrative/polismath_commentgraph/tests/conftest.py	17	17	0%
umap_narrative/polismath_commentgraph/tests/test_clustering.py	74	74	0%
umap_narrative/polismath_commentgraph/tests/test_embedding.py	55	55	0%
umap_narrative/polismath_commentgraph/tests/test_storage.py	87	87	0%
umap_narrative/polismath_commentgraph/utils/__init__.py	3	0	100%
umap_narrative/polismath_commentgraph/utils/converter.py	283	237	16%
umap_narrative/polismath_commentgraph/utils/group_data.py	354	336	5%
umap_narrative/polismath_commentgraph/utils/storage.py	584	477	18%
umap_narrative/reset_conversation.py	159	50	69%
umap_narrative/run_pipeline.py	453	312	31%
utils/general.py	62	41	34%
Total	10950	7647	30%

jucor changed the title from "Fix D4: pseudocount formula" to "[Stack 7/17] Fix D4: pseudocount formula" on Mar 30, 2026

jucor force-pushed the spr/edge/6ae3ee43 branch 2 times, most recently from 4ad6046 to 603f0ac, on March 30, 2026 22:47

jucor force-pushed the spr/edge/6ae3ee43 branch from 603f0ac to b9dcc89 on March 31, 2026 00:35

jucor wants to merge 1 commit into spr/edge/c0a682ec from spr/edge/6ae3ee43
