Add SQL Server CDC capture demo #38
Self-contained Docker stack (SQL Server + datagen + ngrok) and a flow.yaml spec for streaming dbo.sales into an Estuary collection via CDC. Built from the existing sqlserver-cdc-materialize example, minus the Materialize destination.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aeluce left a comment
Noted a few optional places to streamline things.
@@ -0,0 +1,5 @@
collections:
  dani-demo/sqlserver-cdc/dbo/sales:
Possible nit, since the readme instructions do call it out, but it could be helpful to signpost that the prefix needs to change, like using <your-prefix>/ rather than dani-demo/.
The auto-generated directory structure also includes dani-demo as a folder. Along with the capture flow.yaml, that isn't too many places for the user to hunt down and replace text, but it could be simplified.
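To illustrate the hunt-and-replace step described above, here is a minimal sketch of a helper that swaps the demo prefix in file contents and renames the auto-generated folder. The function name `replace_prefix`, its arguments, and the example prefix `acme` are hypothetical and not part of the PR; it assumes the prefix contains no characters that are special to `sed` patterns.

```bash
# Hypothetical helper (not part of the PR): swap the demo tenant prefix for your own.
# Usage: replace_prefix <checkout-dir> <old-prefix> <new-prefix>
# Assumes the prefixes contain no regex metacharacters and no "|" characters.
replace_prefix() {
  dir="$1"; old="$2"; new="$3"
  # Collect matching files up front so the .bak files sed creates are never scanned.
  files=$(grep -rlF -- "$old" "$dir" || true)
  printf '%s\n' "$files" | while IFS= read -r f; do
    [ -n "$f" ] || continue
    # -i.bak works with both GNU and BSD sed; the backup is removed right away.
    sed -i.bak "s|$old|$new|g" "$f" && rm -f "$f.bak"
  done
  # Rename the auto-generated tenant folder, if one exists.
  if [ -d "$dir/$old" ]; then
    mv "$dir/$old" "$dir/$new"
  fi
}
```

For example, `replace_prefix . dani-demo acme` run from the example directory would rewrite the specs and rename the `dani-demo` folder to `acme`. Using a `<your-prefix>` placeholder in the repo, as suggested, would make the search target unambiguous.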
-- Create the watermarks table if it doesn't already exist and grant permissions
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'flow_watermarks')
BEGIN
    CREATE TABLE dbo.flow_watermarks(slot INT PRIMARY KEY, watermark NVARCHAR(255));
SQL Server CDC docs no longer call out watermarks table creation: it looks like read-only mode is now the default for SQL Server captures. The steps for a watermarks table could be removed as well as the mention in the readme.
```bash
# (Re-)run discovery to refresh bindings from the source; optional once
# bindings are present.
flowctl raw discover --source flow.yaml
```
flowctl discover is now a top-level command, not just nested under raw.
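For readers who may have either an older or a newer flowctl build, the following is a rough sketch of a wrapper that prefers the top-level command and falls back to the nested one. The `run_discover` function and the `FLOWCTL` variable are hypothetical names, and the sketch rests on two unverified assumptions: that `discover --help` exits non-zero when the top-level subcommand is missing, and that the top-level command accepts `--source` the same way `raw discover` does in the README.

```bash
# Hypothetical wrapper (not part of the PR).
# Assumption: "$FLOWCTL discover --help" fails on builds without the top-level command.
# Assumption: both forms accept --source, as the README's "raw discover" line does.
FLOWCTL="${FLOWCTL:-flowctl}"

run_discover() {
  spec="${1:-flow.yaml}"
  if "$FLOWCTL" discover --help >/dev/null 2>&1; then
    # Newer builds: discover is a top-level command.
    "$FLOWCTL" discover --source "$spec"
  else
    # Older builds: discover lives under the raw subcommand.
    "$FLOWCTL" raw discover --source "$spec"
  fi
}
```

Calling `run_discover flow.yaml` then works the same way regardless of which form the installed CLI supports. In the README itself, simply switching to the top-level form is the smaller change.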
- Replace dani-demo with your-prefix placeholder to signpost prefix swap
- Drop flow_watermarks setup (read-only mode is now the default)
- Use top-level flowctl discover instead of flowctl raw discover

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aeluce thanks for the review! Addressed all three comments in 8f5c87f.
Ready for another look when you have a sec.
Summary
- `sqlserver-cdc-capture/` example: Docker stack (SQL Server 2022 + Python datagen + ngrok) with CDC pre-enabled and the `flow_capture` user provisioned.
- `flow.yaml` for an Estuary SQL Server capture, plus the discovered tenant `dani-demo/sqlserver-cdc/...` collection spec.
- Built from `sqlserver-cdc-materialize`, dropping the Materialize destination; healthcheck and init script updated to use `mssql-tools18` (the older `mssql-tools` path is no longer in the upstream image).

Test plan
- `docker compose up -d sql-server datagen` brings the DB up healthy and the datagen starts producing inserts/updates/deletes
- `docker compose up -d ngrok` with `NGROK_AUTHTOKEN` set exposes 1433 publicly
- `flowctl catalog publish --source flow.yaml --auto-approve` succeeds against the running stack
- `flowctl catalog status dani-demo/sqlserver-cdc/source-sqlserver` → `OK`
- `flowctl collections read --collection dani-demo/sqlserver-cdc/dbo/sales --uncommitted` returns documents

🤖 Generated with Claude Code