test: contrib Delta full test battery + CI workflow [Delta contrib split, part 6] #10
Draft
schenksj wants to merge 2 commits into
schenksj added a commit that referenced this pull request Jun 21, 2026
…viewed clean (fork #10); %-path dropped
…lit, part 6]

Adds the remaining contrib-delta Scala test battery and the dedicated CI workflow that runs it, completing the test coverage for the Delta read path landed in parts 1-5.

What this adds (test-only -- no production or native code):
- 23 contrib-delta repro/audit/regression suites (22 under org.apache.comet.contrib.delta + CometDeltaCheckpointFilterReproSuite under org.apache.spark.sql.delta), copied verbatim from the integration branch. These are behaviour guards: deletion-vector reads, DPP, row tracking, generated-column partition filters, stats skipping, time travel, schema change, nested array/struct, type round-trip, special-char/percent file names, metadata/credential/filter-pushdown audits, etc.
- .github/workflows/delta_contrib_test.yml: builds libcomet once with --features contrib-delta, then runs every contrib suite (matched by package prefix) across (Spark 3.5 + Delta 3.3.2), (Spark 4.0 + Delta 4.0.0) and (Spark 4.1 + Delta 4.1.0), plus the build-gate verification job.
- dev/ci/check-suites.py: the contrib-suite exclusion is hoisted ahead of the class-name extraction (contrib suites compile only under -Pcontrib-delta and run in their own workflow, so they are exempt from the standard-matrix registration check).

Workflow hardening (review-driven, improving on the integration branch):
- Pin each cell's exact Spark patch via -Dspark.version=<matrix.full>. Without this the -Pspark-4.1 profile pulls Spark 4.1.2, which dropped IgnoreCachedData and breaks delta-spark 4.1.0; the contrib needs 4.1.1. (Pom stays at 4.1.2 for default users -- the pin is CI-only, per the part-2 decision.)
- Label the Spark 3.5 cell as Scala 2.12 (its real binary version from the -Pspark-3.5 profile). It is intentionally the project's only 2.12 coverage -- it guards 2.12-specific breakage such as the existential-type inference in the core DeltaIntegration bridge that 2.13 accepts but 2.12 rejects.
- Cache contrib/delta/native/target so the standalone contrib crate's cargo test build is incremental across runs (the crate is outside the native/ workspace).
- Add a silent-green guard: scalatest treats a zero-match wildcardSuites as success, so assert a floor on the per-suite surefire reports actually produced.

Removes .github/workflows/delta_build_gate.yml: the minimal standalone gate workflow from part 2 is now subsumed by the delta-build-gate job inside delta_contrib_test.yml (byte-identical job), so the full workflow replaces it.

The deferred local-path '%'/space production change is intentionally NOT included: CometDeltaPercentFileNameReproSuite and CometDeltaSpecialCharFilenameSuite both pass without it (object_store round-trips percent-encoded local paths), so the change is a confirmed no-op and is dropped.

Verification: gated JVM test-compile (all 31 contrib suites); full battery green (157 succeeded, 0 failed, 1 version-gated cancel across 33 suites on Spark 4.1 + Delta 4.1.0, the cell whose -Dspark.version=4.1.1 command this workflow now issues); spotless + scalastyle clean; check-suites.py exit 0; dev/verify-contrib-delta-gate.sh all checks pass (default libcomet 0 Delta symbols).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BtErWgRQKCDRAg8Mk6qR4G
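The silent-green guard described in this commit message could look roughly like the sketch below. It is not the workflow's actual step: the report directory, the `TEST-*.xml` naming, and the function names `count_suite_reports` and `assert_report_floor` are assumptions made for illustration. The idea is only that a run which matched zero suites produces no per-suite reports, so counting them and comparing against a floor turns a vacuous success into a failure.

```python
from pathlib import Path
import tempfile


def count_suite_reports(report_dir: Path) -> int:
    """Count per-suite XML reports in report_dir.

    Assumption: each executed suite leaves one file named TEST-<suite>.xml.
    """
    return sum(1 for _ in report_dir.glob("TEST-*.xml"))


def assert_report_floor(report_dir: Path, floor: int) -> int:
    """Exit with an error message if fewer than `floor` reports exist."""
    found = count_suite_reports(report_dir)
    if found < floor:
        # SystemExit with a string prints it to stderr and exits non-zero.
        raise SystemExit(
            f"silent-green guard: {found} suite report(s) in {report_dir}, "
            f"expected at least {floor}"
        )
    return found


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for the reports folder.
    with tempfile.TemporaryDirectory() as tmp:
        reports = Path(tmp)
        (reports / "TEST-ExampleSuite.xml").write_text("<testsuite/>")
        print(assert_report_floor(reports, floor=1))
```

A floor rather than an exact count keeps the check from breaking each time a suite is added, while still catching a wildcard that matches nothing.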
…on row-tracking guards [folded into A.6a]
b94530b to fb47ff3
9a1ef06 to f514d2c
Part 6 of the Delta Lake contrib PR breakup (stacked on part 5 / #9). Fork-local review draft.
Completes the test coverage for the Delta read path landed in parts 1–5. Test-only — no production or native code.
What this adds
- 23 contrib-delta repro/audit/regression suites (22 under org.apache.comet.contrib.delta + CometDeltaCheckpointFilterReproSuite under org.apache.spark.sql.delta), carved byte-identical from the integration branch. Behaviour guards: deletion-vector reads, DPP, row tracking, generated-column partition filters, stats skipping, time travel, schema change, nested array/struct, type round-trip, special-char/percent filenames, metadata/credential/filter-pushdown audits.
- .github/workflows/delta_contrib_test.yml: builds libcomet once with --features contrib-delta, then runs every contrib suite (by package prefix) across (Spark 3.5 + Delta 3.3.2), (Spark 4.0 + Delta 4.0.0), and (Spark 4.1 + Delta 4.1.0), plus the build-gate job.
- dev/ci/check-suites.py: contrib-suite exclusion hoisted ahead of class-name extraction (contrib suites run in their own workflow, exempt from the standard-matrix registration check).

Workflow hardening (review-driven; improves on the integration branch)
- Pin each cell's exact Spark patch via -Dspark.version=<matrix.full>. The -Pspark-4.1 profile otherwise pulls Spark 4.1.2, which dropped IgnoreCachedData and breaks delta-spark 4.1.0 (needs 4.1.1). The pom stays at 4.1.2 for default users; the pin is CI-only per the part-2 decision.
- Label the Spark 3.5 cell as Scala 2.12 (its real binary version from the -Pspark-3.5 profile). It is intentionally the project's only 2.12 coverage and guards 2.12-specific breakage such as the existential-type inference in the core DeltaIntegration bridge that 2.13 accepts but 2.12 rejects.
- Cache contrib/delta/native/target so the standalone contrib crate's cargo test build is incremental.
- Add a silent-green guard: scalatest treats a zero-match wildcardSuites as success, so the job now asserts a floor on the per-suite surefire reports produced.

Removed
.github/workflows/delta_build_gate.yml: the part-2 standalone gate workflow is now subsumed by the byte-identical delta-build-gate job inside delta_contrib_test.yml.

Dropped (deferred %-path change)
The local-path %/space production change is not included: CometDeltaPercentFileNameReproSuite and CometDeltaSpecialCharFilenameSuite both pass without it (object_store round-trips percent-encoded local paths), so it is a confirmed no-op.

Verification
Gated JVM test-compile (all 31 contrib suites); full battery green: 157 succeeded, 0 failed, 1 version-gated cancel across 33 suites on Spark 4.1 + Delta 4.1.0 (the exact -Dspark.version=4.1.1 command this workflow issues); spotless + scalastyle clean; check-suites.py exit 0; dev/verify-contrib-delta-gate.sh all checks pass (default libcomet 0 Delta symbols).

🤖 This PR was prepared with the assistance of Claude (Anthropic). A human author reviewed and is responsible for the content.
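For reviewers unfamiliar with the check-suites.py change, the ordering it describes might be sketched as below. This is an illustration only, not the script's real code: the `contrib/delta/` path marker, the class-name regex, and the function names are all hypothetical. The point is that the contrib exclusion runs before any class name is extracted, so contrib suites are never compared against the standard-matrix registration list.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical marker for contrib sources; the real script may match differently.
CONTRIB_PATH_MARKER = "contrib/delta/"

# Hypothetical pattern: first class whose name ends in "Suite".
CLASS_RE = re.compile(r"^\s*class\s+(\w+Suite)\b", re.MULTILINE)


def extract_suite_class(source: str) -> Optional[str]:
    """Return the first suite class name declared in a Scala source, if any."""
    match = CLASS_RE.search(source)
    return match.group(1) if match else None


def unregistered_suites(files: Dict[str, str], registered: Set[str]) -> List[str]:
    """List suite classes that are missing from the registered set.

    `files` maps a source path to its contents.
    """
    missing = []
    for path, source in sorted(files.items()):
        # Exclusion comes first: contrib suites never reach extraction.
        if CONTRIB_PATH_MARKER in path:
            continue
        name = extract_suite_class(source)
        if name is not None and name not in registered:
            missing.append(name)
    return missing


if __name__ == "__main__":
    demo = {
        "spark/src/test/scala/FooSuite.scala": "class FooSuite extends AnyFunSuite {}",
        "contrib/delta/src/test/scala/BazSuite.scala": "class BazSuite extends AnyFunSuite {}",
    }
    # BazSuite is skipped by path; FooSuite is reported as unregistered.
    print(unregistered_suites(demo, registered=set()))
```

Filtering by path before extraction means a contrib file cannot be flagged as unregistered, whatever the extraction step would have returned for it.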