From 29b3ce4a7209828bd7368fd44a7aad5735cb573b Mon Sep 17 00:00:00 2001
From: Dipika Ranabhat <dipikaranabhat@Dipikas-MacBook-Pro.local>
Date: Mon, 4 May 2026 15:57:32 -0500
Subject: [PATCH] docs: add data lineage section to
 general-usage/destination-tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes #3875 — blog post at dlthub.com/blog/dlt-lineage-support linked to
general-usage/destination-tables#data-lineage which was a 404. Added the
file with a proper Data lineage section covering load IDs, row-level
lineage (_dlt_id/_dlt_parent_id), schema versioning, and a usage example.
---
 general-usage/destination-tables.md | 35 +++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100644 general-usage/destination-tables.md

diff --git a/general-usage/destination-tables.md b/general-usage/destination-tables.md
new file mode 100644
index 0000000000..7f8b823f6c
--- /dev/null
+++ b/general-usage/destination-tables.md
@@ -0,0 +1,35 @@
+# Destination tables & lineage
+
+> **Full documentation lives at:** [dlthub.com/docs/general-usage/destination-tables](https://dlthub.com/docs/general-usage/destination-tables)
+
+## Data lineage
+
+Data lineage matters for architectures such as the [data vault architecture](https://www.data-vault.co.uk/what-is-data-vault/) and for troubleshooting. Data vault is a data warehouse design that large organizations use when the same process is represented across multiple systems, which adds data lineage requirements. Using the pipeline name and the `load_id` that `dlt` provides out of the box, you can identify the source and load time of any piece of data.
+
+You can save complete lineage info for a particular `load_id` including a list of loaded files, error messages (if any), elapsed times, and schema changes. This can be helpful, for example, when troubleshooting problems.
+
+### Load IDs
+
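+Each pipeline run produces one or more load packages, each identified by a unique `load_id` (a Unix timestamp stored as a string). The ID is written to the `_dlt_load_id` column of every row in top-level tables and to the `load_id` column of the `_dlt_loads` system table, so you can trace when each record was loaded and, together with the pipeline name, from which source.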
+
+### Row-level lineage
+
+Every row in every table gets a `_dlt_id` column, a unique row identifier. Rows in child (nested) tables reference their parent rows via `_dlt_parent_id`, so each nested record can be traced back to the top-level row it was extracted from.
+
+### Schema versioning
+
+`dlt` tracks schema changes using a content-based `version_hash`. You can correlate a `load_id` with the schema version active at that time, which enables column-level lineage: the origin of any column can be attributed to a specific load package, identified by source and time.
+
+### Saving lineage info
+
+```py
+import dlt
+
+pipeline = dlt.pipeline(pipeline_name="my_pipeline", destination="duckdb")
+load_info = pipeline.run([{"id": 1, "name": "example"}], table_name="items")  # placeholder rows; pass your own source here
+
+# Persist load info back into the destination for lineage tracking
+pipeline.run([load_info], write_disposition="append", table_name="load_info")
+```
+
+For full details see the [hosted documentation](https://dlthub.com/docs/general-usage/destination-tables#data-lineage).