feat: clickhouse staging-optimized #3927
Heya @zilto is there anything I can do to help move this forward?
rudolfix
left a comment
@filipesilva this is pretty cool, I didn't notice you can swap tables in clickhouse. I think we should not truncate the table in the replace job. See my review. Thanks!
staging_table_name = sql_client.make_qualified_table_name(table["name"])
table_name = sql_client.make_qualified_table_name(table["name"])
sql.append(f"EXCHANGE TABLES {staging_table_name} AND {table_name}")
sql.append(f"TRUNCATE TABLE {staging_table_name}")
Do not truncate tables here. This makes the job non-idempotent: if truncation fails, the job will be retried, and you'll exchange again and truncate the table that holds the data. dlt truncates the staging dataset before the load, and the user can also do that with the sql client.
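A minimal sketch of the retry scenario described in this comment, using plain Python dicts in place of real tables. Everything here is hypothetical illustration (the `exchange`, `truncate`, and `run_swap_then_truncate` helpers and the `staging.items` / `items` keys are made up for the example); it is not dlt or ClickHouse code and only models the order of operations.

```python
# Hypothetical in-memory model of two tables; not dlt or ClickHouse code.

def exchange(tables, a, b):
    """Swap the contents of two tables, mimicking EXCHANGE TABLES."""
    tables[a], tables[b] = tables[b], tables[a]


def truncate(tables, name):
    """Empty a table, mimicking TRUNCATE TABLE."""
    tables[name] = []


def run_swap_then_truncate(tables, fail_on_truncate=False):
    """One attempt of a job that exchanges and then truncates staging."""
    exchange(tables, "staging.items", "items")
    if fail_on_truncate:
        # The exchange has already happened when the job fails.
        raise RuntimeError("simulated TRUNCATE failure")
    truncate(tables, "staging.items")


def simulate_retry():
    """Run the job once with a failing truncate, then retry it."""
    tables = {"staging.items": [100, 101, 102], "items": [1, 2]}
    try:
        run_swap_then_truncate(tables, fail_on_truncate=True)
    except RuntimeError:
        pass  # the loader would schedule a retry here
    # The retry exchanges again (undoing the first swap) and then
    # truncates the table that now holds the freshly loaded rows.
    run_swap_then_truncate(tables)
    return tables


if __name__ == "__main__":
    print(simulate_retry())
```

In this model the retry leaves the destination with the old rows and the new rows are gone, which is the failure mode the comment warns about. Dropping the truncate step removes the partial-failure window between the two statements.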
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""Test ClickHouse atomic swap via EXCHANGE TABLES with sequential loads, nested tables, and empty resource."""
from dlt.destinations.sql_jobs import SqlStagingFollowupJob, SqlStagingReplaceFollowupJob
assert {int(r["id"]) for r in table_dicts["items"]} == {100, 101, 102}

# third load: schema evolution adds a new column, EXCHANGE must work after ALTER
@dlt.resource(name="items", write_disposition="replace", primary_key="id")
Is this the core thing that this test is checking? Otherwise, we have a test that checks optimized replace for all destinations that enable it.
If so, this is a clickhouse-specific test (we check whether EXCHANGE works after ALTER, so it tests the clickhouse engine, not dlt). You can still keep it, but please move it to load/pipeline/test_clickhouse.py
Yes it's not testing anything clickhouse specific. Will remove. Thanks for the pointer!
@rudolfix thanks for taking the time to review, the comments should be addressed now.
Description
Adds the staging-optimized replace strategy to the Clickhouse destination, using the EXCHANGE TABLES statement for atomic swaps.
Related Issues
Additional Context
Python is not my primary language, and I have used LLM agent assistance to produce this PR.
I have tested it locally with a local clickhouse, but wasn't able to test the *-staging-s3-* and *-staging-az-* tests because those seem to need CI credentials.