Summary
Dataset::add_columns now tracks newly written, uncommitted fragments and cleans their files if the final merge commit fails.
However, there is still a fragment-level cleanup ownership gap. FileFragment::add_columns calls schema_evolution::add_columns_to_fragments, but it currently discards the returned cleanup metadata (fragments_to_cleanup) and only returns (Fragment, Schema).
As a result, fragment-level callers can successfully write new column files, then fail later during an outer Operation::Merge commit, without having enough ownership information to safely clean up the uncommitted files written by the fragment-level add-columns operation.
Context
The current PR fixes cleanup for dataset-level add_columns failure paths.
This issue tracks the broader follow-up for fragment-level callers such as LanceFragment.merge_columns, where the add-columns work may succeed first and the outer commit may fail later.
Reproducer
A minimal reproducer is:
- Create a dataset with multiple fragments.
- Call fragment.merge_columns(...) (or FileFragment::add_columns(...)) on one fragment.
- Observe that new column data files are written successfully.
- Advance the dataset version through another operation.
- Attempt to commit the fragment-level merge with a stale read_version.
- The outer merge commit fails, but the files written by the fragment-level add-columns operation remain in the dataset directory.
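The sequence above can be modeled without touching the real library. The sketch below is an in-memory stand-in, not Lance code: MockDataset, write_column_file, advance_version, commit_merge, and orphaned_files are all hypothetical names, and it assumes the simplest possible rule that a commit is rejected whenever read_version is not the current version.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for a dataset: a version counter, the
/// files present in the dataset directory, and the files that some
/// committed version references.
#[derive(Debug, Default)]
pub struct MockDataset {
    pub version: u64,
    pub files_on_disk: BTreeSet<String>,
    pub committed_files: BTreeSet<String>,
}

impl MockDataset {
    /// Models the fragment-level write: the file lands on disk immediately,
    /// before any commit is attempted.
    pub fn write_column_file(&mut self, name: &str) {
        self.files_on_disk.insert(name.to_string());
    }

    /// Models an unrelated operation that advances the dataset version.
    pub fn advance_version(&mut self) {
        self.version += 1;
    }

    /// Models the outer merge commit. Assumption: any stale `read_version`
    /// is rejected outright.
    pub fn commit_merge(&mut self, read_version: u64, new_files: &[&str]) -> Result<u64, String> {
        if read_version != self.version {
            return Err(format!(
                "stale read_version {} (current version is {})",
                read_version, self.version
            ));
        }
        for f in new_files {
            self.committed_files.insert(f.to_string());
        }
        self.version += 1;
        Ok(self.version)
    }

    /// Files on disk that no committed version references.
    pub fn orphaned_files(&self) -> Vec<String> {
        self.files_on_disk
            .difference(&self.committed_files)
            .cloned()
            .collect()
    }
}
```

Capturing read_version, writing a file, advancing the version, and then calling commit_merge with the captured value returns an error in this model, and orphaned_files still lists the written file. That leftover file is the state this issue describes.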
Expected Behavior
If a fragment-level add-columns operation writes new files successfully but the later outer merge commit fails, callers should have a safe way to clean up only the files newly written by that failed operation.
Cleanup must not delete:
- pre-existing data files already referenced by the original fragment
- external blob source files
- files belonging to unrelated committed versions
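One way to read these constraints is as a set difference: the cleanup candidates are the paths referenced by the updated fragment, minus the paths the original fragment already referenced, minus any external paths. The following is a minimal sketch of that rule only. The function name newly_written_files and the plain string-slice representation of paths are assumptions for illustration and do not reflect how Lance represents data files.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns the paths that appear in the updated
/// fragment but are neither referenced by the original fragment nor
/// listed as external sources. Output order follows `updated`.
pub fn newly_written_files(
    original: &[&str],
    updated: &[&str],
    external: &[&str],
) -> Vec<String> {
    // Everything the failed operation does not own.
    let keep: HashSet<&str> = original
        .iter()
        .chain(external.iter())
        .copied()
        .collect();

    updated
        .iter()
        .copied()
        .filter(|p| !keep.contains(p))
        .map(String::from)
        .collect()
}
```

Because the candidates are drawn only from the updated fragment, files that belong to unrelated committed versions never enter the result in this sketch.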
Why This Is Separate From The Current PR
The current PR is intentionally scoped to dataset-level add_columns cleanup.
Fixing this fragment-level case likely requires exposing or preserving cleanup ownership information across the Fragment::add_columns boundary, which is a broader API / ownership follow-up than the current dataset-level cleanup fix.
Possible Directions
Some possible approaches:
- Add a cleanup-aware fragment-level API that returns both the new fragment result and cleanup metadata for newly written, uncommitted files.
- Introduce an internal guard / token object that can be used by the outer caller to clean up if the later merge commit fails.
- Preserve the existing API for compatibility and add a new lower-level API for callers that manage their own outer commit lifecycle.
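The guard / token direction could look roughly like the sketch below. It is illustrative only: UncommittedFilesGuard, disarm, and cleanup are hypothetical names, and it assumes synchronous deletion on the local filesystem through std::fs, whereas a real implementation would likely need to go through the dataset's object store asynchronously.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical guard that owns the paths of newly written, uncommitted files.
pub struct UncommittedFilesGuard {
    paths: Vec<PathBuf>,
    armed: bool,
}

impl UncommittedFilesGuard {
    pub fn new(paths: Vec<PathBuf>) -> Self {
        Self { paths, armed: true }
    }

    /// Call after the outer commit succeeds: the files now belong to the
    /// dataset and must not be deleted.
    pub fn disarm(mut self) {
        self.armed = false;
    }

    /// Explicit cleanup after a failed commit; surfaces the first I/O error.
    pub fn cleanup(mut self) -> io::Result<()> {
        self.armed = false;
        remove_all(&self.paths)
    }
}

/// Deletes each path, treating an already-missing file as success.
fn remove_all(paths: &[PathBuf]) -> io::Result<()> {
    for p in paths {
        match fs::remove_file(p) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

impl Drop for UncommittedFilesGuard {
    /// Best-effort fallback if the caller neither disarmed nor cleaned up.
    fn drop(&mut self) {
        if self.armed {
            let _ = remove_all(&self.paths);
        }
    }
}
```

An outer caller would hold the guard across its commit attempt, call disarm() on success, and call cleanup() (or simply let the guard drop) on failure. The Drop fallback ignores errors and cannot await, which is one reason an explicit cleanup method may be preferable in an async code base.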
Acceptance Criteria
- Fragment-level callers can safely clean up newly written add-columns files after an outer commit failure.
- Cleanup only removes files written by the failed fragment-level operation.
- External blob source files are preserved.
- Existing public APIs remain compatible, or any API expansion has a clear migration path.
- Add a regression test covering:
  - fragment-level add-columns succeeds,
  - a later outer merge commit fails,
  - uncommitted files written by the fragment-level operation are cleaned up.