[Stack 8/27] Deep analysis of Python-Clojure discrepancies and fix plan#2419

Closed
jucor wants to merge 7 commits into jc/uv-pip-ci from jc/kmeans_analysis_docs

Collaborator

@jucor jucor commented Mar 5, 2026

Summary

Stacked on #2484 (Speed up CI: replace pip with uv pip in Dockerfile (~2x faster installs)). Please review and merge #2484 first.
Next in stack: #2420 (Per-discrepancy test infrastructure)

Documentation-only PR: deep analysis of Python vs Clojure discrepancies and a TDD fix plan.

Changes

  • Deep analysis documents (deep-analysis-for-julien/) comparing Python and Clojure implementations statement-by-statement
  • Consolidate CLAUDE.md documentation for the delphi project
  • Discrepancy fix plan (docs/PLAN_DISCREPANCY_FIXES.md) with prioritized list of fixes

Test plan

  • Documentation only — no code changes
    🤖 Generated with Claude Code

jucor force-pushed the jc/kmeans_clustering_tooling branch from e35f31a to ad8a92c on March 6, 2026 15:31
jucor force-pushed the jc/kmeans_analysis_docs branch from 74d276e to cda5015 on March 6, 2026 15:34
jucor force-pushed the jc/kmeans_clustering_tooling branch from ad8a92c to 00c659e on March 10, 2026 11:12
jucor force-pushed the jc/kmeans_analysis_docs branch from cda5015 to f4136dd on March 10, 2026 11:12
jucor force-pushed the jc/kmeans_clustering_tooling branch from 00c659e to 69cf22e on March 10, 2026 12:29
jucor force-pushed the jc/kmeans_analysis_docs branch from f4136dd to d965abb on March 10, 2026 12:29
jucor force-pushed the jc/kmeans_clustering_tooling branch from 69cf22e to ecebe3b on March 10, 2026 14:09
jucor force-pushed the jc/kmeans_analysis_docs branch from d965abb to 3886fb5 on March 10, 2026 14:13
jucor force-pushed the jc/kmeans_clustering_tooling branch from ecebe3b to ea26a44 on March 10, 2026 15:18
jucor force-pushed the jc/kmeans_analysis_docs branch from 3886fb5 to 000bb0a on March 10, 2026 15:39
jucor changed the base branch from jc/kmeans_clustering_tooling to jc/participant-id-unfolding on March 10, 2026 15:47
jucor requested review from ballPointPenguin and whilo on March 10, 2026 16:08
jucor changed the title from "Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/8] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 10, 2026
jucor changed the title from "[Stack 6/8] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/9] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 11, 2026
jucor changed the title from "[Stack 6/9] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/10] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 11, 2026
jucor changed the title from "[Stack 6/10] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/11] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 11, 2026
jucor requested a review from Copilot on March 13, 2026 12:37
Contributor

Copilot AI left a comment

Pull request overview

Documentation-only PR capturing a deep, statement-by-statement analysis of Python vs Clojure math pipeline discrepancies and a proposed TDD plan to bring Python to parity, plus consolidating Delphi’s CLAUDE.md guidance.

Changes:

  • Add deep-analysis-for-julien/ docs describing discrepancies across PCA, clustering, repness, participant filtering, and comment routing.
  • Add a prioritized, PR-by-PR TDD execution plan in delphi/docs/PLAN_DISCREPANCY_FIXES.md.
  • Update delphi/CLAUDE.md to consolidate documentation pointers and add testing/regression guidance.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| delphi/docs/PLAN_DISCREPANCY_FIXES.md | Concrete staged plan for discrepancy-driven TDD fixes and verification workflow. |
| delphi/CLAUDE.md | Consolidates Delphi doc guidance; adds a testing/regression overview. |
| deep-analysis-for-julien/01-overview-and-architecture.md | High-level architecture and data-flow overview across Python/Clojure implementations. |
| deep-analysis-for-julien/02-pca-analysis.md | PCA implementation comparison and sources of divergence. |
| deep-analysis-for-julien/03-clustering-analysis.md | Clustering/k-selection comparison (incl. k-smoother) and format notes. |
| deep-analysis-for-julien/04-repness-analysis.md | Repness math comparison and discrepancy breakdown. |
| deep-analysis-for-julien/05-participant-filtering.md | In-conv filtering + comment priorities/vote structures comparison. |
| deep-analysis-for-julien/06-comment-routing.md | TypeScript comment routing analysis and dependency on comment-priorities. |
| deep-analysis-for-julien/07-discrepancies.md | Canonical list of all identified discrepancies with severity and locations. |
| deep-analysis-for-julien/08-dead-code.md | Inventory of dead/unreachable code and known issues. |
| deep-analysis-for-julien/09-fix-plan.md | Prioritized fix plan and phased rollout guidance. |

jucor changed the title from "[Stack 6/11] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/12] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 13, 2026
jucor changed the title from "[Stack 6/12] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/13] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 13, 2026
jucor changed the title from "[Stack 6/13] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 6/15] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 16, 2026
jucor force-pushed the jc/participant-id-unfolding branch from b2f145d to 48afbe4 on March 16, 2026 16:04
jucor force-pushed the jc/kmeans_analysis_docs branch from 000bb0a to f323584 on March 16, 2026 16:04
jucor force-pushed the jc/kmeans_analysis_docs branch from 6c3dc7b to 839b3c7 on March 27, 2026 02:10
jucor force-pushed the jc/cold-start-tooling branch 2 times, most recently from 130a7b8 to 16fd44b, on March 27, 2026 10:41
jucor force-pushed the jc/kmeans_analysis_docs branch from 839b3c7 to 11831d4 on March 27, 2026 10:41
jucor changed the title from "[Stack 6/25] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 7/26] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 30, 2026
jucor force-pushed the jc/kmeans_analysis_docs branch from 11831d4 to 2d235be on March 30, 2026 12:48
jucor force-pushed the jc/cold-start-tooling branch from 16fd44b to 1fe0cfa on March 30, 2026 12:48
jucor changed the title from "[Stack 7/26] Deep analysis of Python-Clojure discrepancies and fix plan" to "[Stack 8/27] Deep analysis of Python-Clojure discrepancies and fix plan" on Mar 30, 2026
jucor force-pushed the jc/cold-start-tooling branch from 1fe0cfa to e74ff24 on March 30, 2026 12:54
jucor force-pushed the jc/kmeans_analysis_docs branch from 2d235be to b979d12 on March 30, 2026 12:54
Base automatically changed from jc/cold-start-tooling to jc/uv-pip-ci on March 30, 2026 12:54
jucor and others added 6 commits March 30, 2026 14:32
Comprehensive 9-document analysis covering:
- Architecture and data flow comparison
- PCA implementation details (power iteration vs SVD)
- Two-level clustering with k-smoother analysis
- Representativeness metrics with formula verification
- Participant filtering and comment priority systems
- TypeScript comment routing (prioritized vs topical)
- 15 identified discrepancies rated by severity
- Dead code inventory across Python codebase
- Prioritized fix plan to bring Python to Clojure parity

Key findings: 5 CRITICAL discrepancies (in-conv threshold,
proportion test formula, repness metric, z-score thresholds,
missing comment priorities), 3 HIGH, and 7 MEDIUM/LOW.

https://claude.ai/code/session_01FEPFmVHKz1eoqzvXSTmu14
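
For readers unfamiliar with the "power iteration vs SVD" distinction named in the commit message above, here is a minimal sketch in Python with NumPy. It is not taken from either the Python or the Clojure codebase; the function names, iteration count, tolerance, and synthetic data are all assumptions made for illustration. It only shows that both routes recover the same leading principal direction up to sign, which is why differences in iteration count, starting vector, or sign convention can surface as discrepancies.

```python
import numpy as np


def first_pc_power_iteration(X, iters=500, tol=1e-12, seed=0):
    """Leading principal direction via power iteration on X^T X (hypothetical helper)."""
    Xc = X - X.mean(axis=0)              # center the columns
    rng = np.random.default_rng(seed)
    v = rng.normal(size=Xc.shape[1])     # random start vector (an assumption)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = Xc.T @ (Xc @ v)              # one multiplication by the scatter matrix
        norm = np.linalg.norm(w)
        if norm == 0:
            break
        w /= norm
        converged = np.linalg.norm(w - v) < tol
        v = w
        if converged:
            break
    return v


def first_pc_svd(X):
    """Leading principal direction via a full SVD of the centered matrix."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


# Synthetic data with one clearly dominant direction (illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

pc_power = first_pc_power_iteration(X)
pc_svd = first_pc_svd(X)

# The two vectors may differ by sign, so compare the absolute cosine.
alignment = abs(float(pc_power @ pc_svd))
print(alignment)
```

The comparison uses the absolute value of the dot product because a principal direction is only defined up to sign; a parity test between two implementations would need a similar sign-insensitive comparison or an explicit sign convention.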
- Doc 02: Correct Clojure sparsity-aware projection code to match actual
  implementation (raw votes with nils skipped, not imputed). Add new
  discrepancy D1b documenting subtle projection input difference.
- Doc 03: Fix same-clustering convergence check description - it checks
  every pairwise distance < threshold, not sum.
- Doc 07: Add D1b projection input discrepancy (LOW severity).

https://claude.ai/code/session_01FEPFmVHKz1eoqzvXSTmu14
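
The Doc 03 correction above (every pairwise distance below the threshold, not the sum) can be illustrated with a small sketch. This is not the project's code: the function names, the threshold value, and the assumption that old and new centers are already matched in order are all hypothetical. It only shows that the two readings of the check can disagree on the same input.

```python
import math


def same_clustering_every(old_centers, new_centers, threshold=0.01):
    """True if EVERY matched center moved by less than threshold (hypothetical helper)."""
    if len(old_centers) != len(new_centers):
        return False
    return all(math.dist(a, b) < threshold
               for a, b in zip(old_centers, new_centers))


def same_clustering_sum(old_centers, new_centers, threshold=0.01):
    """True if the SUM of center movements is below threshold (the mistaken reading)."""
    if len(old_centers) != len(new_centers):
        return False
    return sum(math.dist(a, b)
               for a, b in zip(old_centers, new_centers)) < threshold


# Three centers, each shifted by 0.006 along the first axis.
old = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
new = [(0.006, 0.0), (1.006, 1.0), (2.006, 2.0)]

print(same_clustering_every(old, new))  # True: each shift is below 0.01
print(same_clustering_sum(old, new))    # False: the shifts sum to about 0.018
```

With more clusters the sum-based reading becomes stricter while the per-pair reading does not, so the two variants can stop iterating at different points.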
- Remove AGENTS.md and CLAUDE.md/GEMINI.md symlinks
- Create delphi/CLAUDE.md as a real file with upstream content plus
  testing, regression, and sklearn migration sections
- Add outdated docs warning
- Move uv-specific instructions to CLAUDE.local.md (gitignored)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

…lien into delphi/docs/

- Move deep-analysis-for-julien/ from repo root to delphi/docs/
- Fix broken reference to 09-fix-plan.md (now relative path)
- Fix double period in plan doc
- Fix Pre-requisite → Prerequisite
- Canonicalize journal filename to CLJ-PARITY-FIXES-JOURNAL.md
- Add note in 09-fix-plan.md that canonical ordering is in PLAN_DISCREPANCY_FIXES.md
- Replace 'adequate command line flag' with '--include-local' in CLAUDE.md
- Add 'start here' pointer to trusted docs in CLAUDE.md warning
- Annotate extremtiy as intentional (matches Clojure source typo)
jucor force-pushed the jc/kmeans_analysis_docs branch from b979d12 to fa0395b on March 30, 2026 16:48
@github-actions

Delphi Coverage Report

| File | Stmts | Miss | Cover |
| --- | --- | --- | --- |
| __init__.py | 2 | 0 | 100% |
| benchmarks/bench_pca.py | 76 | 76 | 0% |
| benchmarks/bench_repness.py | 81 | 81 | 0% |
| benchmarks/bench_update_votes.py | 38 | 38 | 0% |
| benchmarks/benchmark_utils.py | 34 | 34 | 0% |
| components/__init__.py | 1 | 0 | 100% |
| components/config.py | 165 | 133 | 19% |
| conversation/__init__.py | 2 | 0 | 100% |
| conversation/conversation.py | 1118 | 336 | 70% |
| conversation/manager.py | 131 | 42 | 68% |
| database/__init__.py | 1 | 0 | 100% |
| database/dynamodb.py | 387 | 233 | 40% |
| database/postgres.py | 305 | 205 | 33% |
| pca_kmeans_rep/__init__.py | 5 | 0 | 100% |
| pca_kmeans_rep/clusters.py | 257 | 22 | 91% |
| pca_kmeans_rep/corr.py | 98 | 17 | 83% |
| pca_kmeans_rep/pca.py | 52 | 16 | 69% |
| pca_kmeans_rep/repness.py | 361 | 48 | 87% |
| pca_kmeans_rep/stats.py | 107 | 22 | 79% |
| regression/__init__.py | 4 | 0 | 100% |
| regression/clojure_comparer.py | 188 | 17 | 91% |
| regression/comparer.py | 887 | 720 | 19% |
| regression/datasets.py | 103 | 22 | 79% |
| regression/recorder.py | 36 | 27 | 25% |
| regression/utils.py | 137 | 118 | 14% |
| run_math_pipeline.py | 260 | 114 | 56% |
| umap_narrative/500_generate_embedding_umap_cluster.py | 210 | 109 | 48% |
| umap_narrative/501_calculate_comment_extremity.py | 112 | 54 | 52% |
| umap_narrative/502_calculate_priorities.py | 135 | 135 | 0% |
| umap_narrative/700_datamapplot_for_layer.py | 502 | 502 | 0% |
| umap_narrative/701_static_datamapplot_for_layer.py | 310 | 310 | 0% |
| umap_narrative/702_consensus_divisive_datamapplot.py | 432 | 432 | 0% |
| umap_narrative/801_narrative_report_batch.py | 785 | 785 | 0% |
| umap_narrative/802_process_batch_results.py | 265 | 265 | 0% |
| umap_narrative/803_check_batch_status.py | 175 | 175 | 0% |
| umap_narrative/llm_factory_constructor/__init__.py | 2 | 2 | 0% |
| umap_narrative/llm_factory_constructor/model_provider.py | 157 | 157 | 0% |
| umap_narrative/polismath_commentgraph/__init__.py | 1 | 0 | 100% |
| umap_narrative/polismath_commentgraph/cli.py | 270 | 270 | 0% |
| umap_narrative/polismath_commentgraph/core/__init__.py | 3 | 3 | 0% |
| umap_narrative/polismath_commentgraph/core/clustering.py | 108 | 108 | 0% |
| umap_narrative/polismath_commentgraph/core/embedding.py | 104 | 104 | 0% |
| umap_narrative/polismath_commentgraph/lambda_handler.py | 219 | 219 | 0% |
| umap_narrative/polismath_commentgraph/schemas/__init__.py | 2 | 0 | 100% |
| umap_narrative/polismath_commentgraph/schemas/dynamo_models.py | 160 | 9 | 94% |
| umap_narrative/polismath_commentgraph/tests/conftest.py | 17 | 17 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_clustering.py | 74 | 74 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_embedding.py | 55 | 55 | 0% |
| umap_narrative/polismath_commentgraph/tests/test_storage.py | 87 | 87 | 0% |
| umap_narrative/polismath_commentgraph/utils/__init__.py | 3 | 0 | 100% |
| umap_narrative/polismath_commentgraph/utils/converter.py | 283 | 237 | 16% |
| umap_narrative/polismath_commentgraph/utils/group_data.py | 354 | 336 | 5% |
| umap_narrative/polismath_commentgraph/utils/storage.py | 584 | 477 | 18% |
| umap_narrative/reset_conversation.py | 159 | 50 | 69% |
| umap_narrative/run_pipeline.py | 453 | 312 | 31% |
| utils/general.py | 62 | 41 | 34% |
| Total | 10919 | 7646 | 30% |

This was referenced Mar 30, 2026
Collaborator Author

jucor commented Mar 30, 2026

Superseded by spr-managed PR stack. See the new stack starting at #2508.

jucor closed this on Mar 30, 2026
3 participants