4 changes: 4 additions & 0 deletions .gitignore
@@ -1,3 +1,7 @@
# Prek automatic checks
.prek/local.toml
.prek/.state.toml
.prek/.enabled

# Claude Code
.worktrees
153 changes: 153 additions & 0 deletions .prek/README.md
@@ -0,0 +1,153 @@
# Pre-push local verification

> **Not a general prek/pre-commit rollout.** prek is used only to install a **pre-push** hook; gate logic lives in `tools/prek.py`. This is not incremental formatting or per-file lint on staged changes — it runs full `make fl` / `make test-common-p` when the hook decides they are needed.

Optional pre-push checks: run `make fl` and/or `make test-common-p` on push when the current tree’s
fingerprint is not already in local pass history (up to 50 recorded passes per check in
`.prek/.state.toml`).

**Manual commands always run.** `make fl` and `make test-common-p` execute in full whenever you
invoke them. The cache applies only to pre-push (`git push`), `make prek`, and `make prek-dry`.
Successful manual runs still record passes (after hook install) so the next push can skip.

The hook is **opt-in**. Without `.prek/local.toml`, the pre-push hook does nothing.

## Quick start

1. Copy `local.example.toml` to `local.toml` and set each check `mode` (`off`, `auto`, or `confirm`).
2. From the repo root: `make install-prepush-hooks`
3. Push as usual. Bypass once: `git push --no-verify`
4. Preview without pushing: `make prek-dry`. Run the gate manually: `make prek`
5. Remove the hook: `make uninstall-prepush-hooks`

## Prerequisites

- Root dev env: `make dev` (for `make fl` and `make test-common-p`)
- Docs Python env: `cd docs && make dev` (once). `make fl` also runs `npm install` in `docs/website` for Biome.
- Optional `[gate] only_when_pr_open = true`: requires [GitHub CLI](https://cli.github.com/) (`gh auth login`)

## Configuration (`local.toml`)

Copy from `local.example.toml`. Gitignored — per-developer only.

| Section | Keys | Meaning |
|---------|------|---------|
| `[gate]` | `only_when_pr_open` | If `true`, skip all checks unless the current branch has an open PR (`gh pr view`) |
| `[lint]` | `mode` | How to handle a stale lint fingerprint |
| `[test_common_p]` | `mode` | How to handle a stale common-test fingerprint |

### Modes

| Mode | When the fingerprint is stale |
|------|---------------------|
| `off` | Never run this check |
| `auto` | Run the make target |
| `confirm` | Ask on the terminal; declining aborts the push |
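
The table can be read as a small decision function. The sketch below is illustrative only: `decide` and the action names `skip`, `run`, and `ask` are hypothetical and are not claimed to match `tools/prek.py`.

```python
# Hypothetical mapping from mode to the action taken on a stale fingerprint.
ACTIONS = {"off": "skip", "auto": "run", "confirm": "ask"}


def decide(mode: str, fingerprint_is_stale: bool) -> str:
    """Return 'skip', 'run', or 'ask' for a single check."""
    if mode not in ACTIONS:
        raise ValueError(f"unknown mode: {mode!r}")
    if not fingerprint_is_stale:
        # A fingerprint already in pass history never triggers the check.
        return "skip"
    return ACTIONS[mode]


for mode in ("off", "auto", "confirm"):
    print(mode, decide(mode, fingerprint_is_stale=True))
```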

**Confirm prompts** look like: `Run make fl before push? [Y/n] `

- Enter or `y` / `yes` → run the check
- `n` / `no` → abort push
- Non-interactive stdin (no TTY) → treated as declined (push blocked)
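
The answer handling above could be sketched as follows. `parse_answer` and `ask` are hypothetical names. The README does not say what happens on unrecognized input, so declining in that case is an assumption of this sketch.

```python
import sys
from typing import Optional


def parse_answer(answer: Optional[str]) -> bool:
    """True means run the check; False means abort the push.

    `answer` is None when stdin is not a TTY, which counts as declined.
    """
    if answer is None:
        return False
    normalized = answer.strip().lower()
    if normalized in ("", "y", "yes"):
        return True
    # "n" / "no" decline. Assumption: any other input also declines.
    return False


def ask(target: str) -> bool:
    """Prompt on the terminal, or decline when no TTY is attached."""
    if not sys.stdin.isatty():
        return parse_answer(None)
    return parse_answer(input(f"Run make {target} before push? [Y/n] "))
```

Keeping the parsing separate from the prompt makes the decline-on-no-TTY rule testable without a terminal.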

## What runs

Checks run in order; a failed lint blocks tests.

| Check | Make target | Command recorded in state |
|-------|-------------|---------------------------|
| `lint` | `fl` | `make fl` |
| `test_common_p` | `test-common-p` | `make test-common-p` |

`make fl` runs format (root, docs, website deps) in parallel, then root and docs lint in parallel.
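
A rough sketch of the ordering rule (a failed lint blocks tests), with a fake runner standing in for `make`. `run_gate` and `fake_run` are hypothetical; a real runner would presumably shell out, for example with `subprocess.run(["make", target]).returncode == 0`.

```python
from typing import Callable, Iterable, List, Tuple

# (check name, make target) pairs in the order given by the table above.
CHECKS = [("lint", "fl"), ("test_common_p", "test-common-p")]


def run_gate(
    checks: Iterable[Tuple[str, str]], run: Callable[[str], bool]
) -> List[str]:
    """Run checks in order, stop at the first failure, return those that passed."""
    passed = []
    for name, target in checks:
        if not run(target):
            break
        passed.append(name)
    return passed


attempted = []


def fake_run(target: str) -> bool:
    """Illustrative runner: records the target and makes `fl` fail."""
    attempted.append(target)
    return target != "fl"


# Lint fails here, so test-common-p is never attempted.
print(run_gate(CHECKS, fake_run), attempted)
```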

A check runs only when its **fingerprint** (hash of tracked files listed for that check) is not among the
last 50 successful passes stored in `.prek/.state.toml` (also gitignored). Each pass records
`fingerprint`, `passed_at`, and `command`. That history lets branch switches and reverts reuse
a prior pass when the tree matches again. Passing prepends a record and trims the list
to 50 entries per check.

Example:

```toml
[[lint.passes]]
fingerprint = "abc123..."
passed_at = "2026-05-29T12:00:00+00:00"
command = "make fl"
```
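
The prepend-and-trim behavior can be sketched as below, operating on a plain list of dicts rather than the TOML file. `is_fresh` and `record_pass` are hypothetical names. Whether the real tool deduplicates repeated fingerprints is not stated, so this sketch does not.

```python
from datetime import datetime, timezone

MAX_PASSES = 50  # per check, as described above


def is_fresh(passes, fingerprint):
    """True if this fingerprint already has a recorded pass."""
    return any(entry["fingerprint"] == fingerprint for entry in passes)


def record_pass(passes, fingerprint, command, now=None):
    """Return a new history with the pass prepended and trimmed to MAX_PASSES."""
    now = now or datetime.now(timezone.utc)
    entry = {
        "fingerprint": fingerprint,
        "passed_at": now.isoformat(),
        "command": command,
    }
    return ([entry] + list(passes))[:MAX_PASSES]


history = []
for i in range(60):
    history = record_pass(history, f"fp{i}", "make fl")

# The newest pass comes first; the ten oldest passes have been trimmed.
print(len(history), history[0]["fingerprint"], is_fresh(history, "fp0"))
```

Because lookups scan the whole list, switching back to a branch whose tree matches any of the retained passes skips the check.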

After `make install-prepush-hooks`, successful `make fl` and `make test-common-p` also update
state (no extra commands). Plain `make lint` does not update prek state.

## Unstaged changes on push

prek may stash unstaged edits to `~/.cache/prek/patches/` while the hook runs, then restore them.
This is built-in prek behavior (inherited from pre-commit) and is not configurable. It keeps
lint and tests from failing on work in progress that you are not pushing.

## Fingerprint inputs (`pyproject.toml`)

Defines which tracked files feed each check fingerprint (`[tool.dlt.prepush.fingerprints.lint]` and
`[tool.dlt.prepush.fingerprints.test_common_p]`). Edit when adding new trees that should trigger
re-lint or re-test.

**Lint** — `dlt`, `tests`, `tools`, `docs` (`.py`, `.md`, `.ipynb`), plus root/docs config and
embedded-snippet lint setup files.

**Common tests** — `dlt` and selected `tests/*` suites (see `[tool.dlt.prepush.fingerprints]` in
`pyproject.toml`), plus `pyproject.toml`,
`uv.lock`, `tests/conftest.py`, `tests/load/test_dummy_client.py`.
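
One way a fingerprint over selected tracked files could be computed is sketched below. Everything here is an assumption for illustration: `select_inputs` and `fingerprint` are hypothetical names, SHA-256 over sorted paths and contents is a stand-in for whatever `tools/prek.py` actually hashes, and the list of tracked files is passed in rather than read from git.

```python
import hashlib
from pathlib import PurePosixPath


def select_inputs(tracked, roots, suffixes, extra_files=()):
    """Pick tracked paths under the given top-level dirs with matching suffixes."""
    selected = {path for path in extra_files if path in tracked}
    for path in tracked:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in roots and PurePosixPath(path).suffix in suffixes:
            selected.add(path)
    return sorted(selected)


def fingerprint(files):
    """Hash a mapping of path -> bytes content, independent of dict order."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator so path and content cannot run together
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()


tracked = ["dlt/pipeline.py", "dlt/data.json", "docs/intro.md", "README.md", "uv.lock"]
inputs = select_inputs(tracked, roots={"dlt", "docs"}, suffixes={".py", ".md"},
                       extra_files=["uv.lock"])
print(inputs)
print(fingerprint({path: b"example content" for path in inputs}))
```

Sorting the paths before hashing means the same tree always yields the same fingerprint, which is what lets a revert or branch switch reuse an earlier pass.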

Inspect a fingerprint:

```bash
uv run python -m tools.prek fingerprint lint
uv run python -m tools.prek fingerprint test_common_p
```

## Makefile targets

| Target | Purpose |
|--------|---------|
| `make install-prepush-hooks` | Install prek pre-push hook and enable state recording (fails if another pre-push hook exists) |
| `make uninstall-prepush-hooks` | Remove the prek pre-push hook (no-op if none; fails if hook is not from prek) |
| `make prek` | Run the gate now (same logic as on push) |
| `make prek-dry` | Show what would run; no make, no state update |
| `make fl` | Format root + docs (parallel), then lint root + docs (parallel) |

prek is installed with `uv tool install`, not as a repo dependency.

## Troubleshooting

**Existing pre-push hook** — `make install-prepush-hooks` refuses to install if `.git/hooks/pre-push`
already exists and is not from prek. prek cannot share the hook file with another tool. Remove or
relocate your hook first, or skip prek setup for now.

**Uninstall with a foreign hook** — `make uninstall-prepush-hooks` only removes a prek-managed hook.
If `.git/hooks/pre-push` exists but was not installed via `make install-prepush-hooks`, uninstall
refuses to run so your hook is not deleted.

**Hook never runs checks** — Ensure `.prek/local.toml` exists and at least one check has `mode` not
`off`. Run `make prek-dry` to see whether the gate is active and which checks are stale.

**Gate skipped** — With `only_when_pr_open = true`, there must be an open PR on the current branch.

**Docs lint fails** — Run `cd docs && make dev`, then `make fl` (or `cd docs && make format && make lint`).

**Want to re-run after a pass** — Delete `.prek/.state.toml`, remove all `[[lint.passes]]` or
`[[test_common_p.passes]]` entries for that check, or change a tracked file in that check’s fingerprint inputs.

**Stale fingerprint / wrong cache** — Same as above. State keeps up to 50 pass records per check
for branch hopping.

## Files in this directory

| File | Role |
|------|------|
| `README.md` | This guide |
| `local.example.toml` | Config template |
| `prek.toml` | prek hook definition (`uv run python -m tools.prek`) |

Implementation and tests: `tools/prek.py` (run via `python -m tools.prek`).

Gitignored: `local.toml`, `.state.toml`, `.enabled`
8 changes: 8 additions & 0 deletions .prek/local.example.toml
@@ -0,0 +1,8 @@
[gate]
only_when_pr_open = false # if true, run checks only when the current branch has an open PR

[lint]
mode = "auto" # off | auto | confirm

[test_common_p]
mode = "off" # off | auto | confirm
11 changes: 11 additions & 0 deletions .prek/prek.toml
Review comment (Collaborator):

The configuration can be defined in pyproject.toml. This would reduce file sprawl.

If devs want to override settings locally, they can:

  1. not install the prek hook (those are installed under .git/); this PR will have no effect on them
  2. override prek config with a local config that is not committed

@@ -0,0 +1,11 @@
[[repos]]
repo = "local"

[[repos.hooks]]
id = "pre-push-gate"
name = "dlt pre-push gate"
language = "system"
entry = "uv run python -m tools.prek"
pass_filenames = false
always_run = true
stages = ["pre-push"]
7 changes: 7 additions & 0 deletions CONTRIBUTING.md
@@ -152,6 +152,11 @@ Our goal is to maintain stability and compatibility across all environments. Ple

`dlt` uses `mypy` and `flake8` (with several plugins) for linting. You can run the linter locally with `make lint`. We also run a code formatter with `black` which you can run with `make format`. The lint step will also ensure that the code is formatted correctly. It is good practice to run `make format && make lint` before every commit.

### Pre-push hooks (optional)

You can run `make fl` and/or `make test-common-p` automatically before each push when tracked
files in scope change. Setup and configuration: [`.prek/README.md`](.prek/README.md) (`make install-prepush-hooks`).

## Testing

`dlt` uses `pytest` for testing.
@@ -180,6 +185,8 @@ If, for any reason, you need to access the `pytest-xdist` worker id, do it with

You can view our GitHub Actions setup in `.github/workflows` to see which tests are run with which dependencies / extras installed, and which platforms and python versions are used for linting and testing. The main entry point is `.github/workflows/main.yml` which orchestrates all other workflows. Certain dependencies exist, for example no tests will be run if the linter reports problems. Some workflows use test matrixes to test several destinations or run tests on various operating systems and with various python versions or dependency resolution strategies. To reduce CI execution time and improve feedback cycles, parallel test execution via `pytest-xdist` has been enabled in CI. Try to run any test suite that is involved in your development work in parallel if possible, since that is how it will be run in CI. Some CI tests run with a restricted number of workers for destination performance reasons.

PR label `test-destinations-early`: jobs that normally wait for `test_common` (destination, sources, dbt runner, etc.) start in parallel with lint instead. Lint and common still run serially. Use this label only if you run lint and common locally (see [`.prek/README.md`](.prek/README.md)).

### Common Components

To test components that don’t require external resources, run:
60 changes: 58 additions & 2 deletions Makefile
@@ -1,5 +1,5 @@
.DEFAULT_GOAL := help
.PHONY: install-uv has-uv dev lint test test-common test-common-p reset-test-storage recreate-compiled-deps build-library-prerelease build-library publish-library test-load-local test-load-local-p test-load-local-postgres test-load-local-postgres-p install-snowflake-extras test-remote-snowflake test-remote-snowflake-p install-common-core test-common-core install-common-core-source test-common-core-source install-common-source install-pipeline-min test-pipeline-min install-pipeline-arrow test-pipeline-arrow install-pipeline-min-arrow test-pipeline-min-arrow install-workspace test-workspace test-workspace-dashboard install-hub-minimal test-hub-minimal test-hub install-pipeline-full test-pipeline-full install-pipeline-full-sql test-pipeline-full-sql install-sqlalchemy2 test-with-sqlalchemy-2 test-dest-load test-dest-remote-essential test-dest-remote-nonessential test-dbt-no-venv test-dbt-runner-venv test-sources-load test-sources-sql-database
.PHONY: install-uv has-uv dev lint lint-parallel lint-full lint-docs format format-docs docs-website-deps fl test test-common test-common-p reset-test-storage recreate-compiled-deps build-library-prerelease build-library publish-library test-load-local test-load-local-p test-load-local-postgres test-load-local-postgres-p install-snowflake-extras test-remote-snowflake test-remote-snowflake-p install-common-core test-common-core install-common-core-source test-common-core-source install-common-source install-pipeline-min test-pipeline-min install-pipeline-arrow test-pipeline-arrow install-pipeline-min-arrow test-pipeline-min-arrow install-workspace test-workspace test-workspace-dashboard install-hub-minimal test-hub-minimal test-hub install-pipeline-full test-pipeline-full install-pipeline-full-sql test-pipeline-full-sql install-sqlalchemy2 test-with-sqlalchemy-2 test-dest-load test-dest-remote-essential test-dest-remote-nonessential test-dbt-no-venv test-dbt-runner-venv test-sources-load test-sources-sql-database install-prepush-hooks uninstall-prepush-hooks prek prek-dry

PYV=$(shell python3 -c "import sys;t='{v[0]}.{v[1]}'.format(v=list(sys.version_info[:2]));sys.stdout.write(t)")
.SILENT:has-uv
@@ -37,7 +37,30 @@ dev-airflow: has-uv ## Prepares development environment with airflow support
dev-hub: has-uv ## Prepares development environment with hub support
uv sync --all-extras --group workspace-deps --group dev --group providers --group pipeline --group sources --group sentry-sdk --group ibis --group adbc --group dashboard-tests

lint: lint-core lint-security lint-docstrings lint-lock lint-deps ## Runs all linters (mypy, ruff, flake8, bandit, docstrings, lockfile, deps)
LINT_TARGETS := lint-core lint-security lint-docstrings lint-lock lint-deps

lint: $(LINT_TARGETS) ## Runs all linters (mypy, ruff, flake8, bandit, docstrings, lockfile, deps)
@:

lint-parallel: ## Runs all linters in parallel (used by make fl)
$(MAKE) -j $(words $(LINT_TARGETS)) lint

lint-full: lint lint-docs ## Root + docs lint (sequential)

lint-docs: ## Runs docs linting (embedded snippets, notebooks, docs tooling)
cd docs && $(MAKE) lint

format-docs: ## Formats docs tooling, website, examples, and notebooks
cd docs && $(MAKE) format

docs-website-deps: ## Install docs website node deps (biome; used by make fl)
cd docs/website && npm install

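fl: ## Format then lint root and docs in parallel (prek pre-push gate)
# A bare `wait` always exits 0, so wait on each PID to propagate failures under `set -e`.
set -e; \
$(MAKE) format & p1=$$!; $(MAKE) format-docs & p2=$$!; $(MAKE) docs-website-deps & p3=$$!; \
wait $$p1; wait $$p2; wait $$p3; \
$(MAKE) lint-parallel & p4=$$!; $(MAKE) -C docs lint-parallel & p5=$$!; \
wait $$p4; wait $$p5
@if [ -f .prek/.enabled ]; then uv run python -m tools.prek --record lint; fi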

lint-lock: ## Checks uv lockfile is in sync
uv lock --check
@@ -151,11 +174,13 @@ TEST_COMMON_PATHS = \
tests/libs \
tests/destinations

test-common: PYTEST_MARKERS = not rfam
test-common: ## Tests common components without external resources
$(call RUN_XDIST_SAFE_SPLIT,$(TEST_COMMON_PATHS))

test-common-p: ## Tests common components in parallel
$(MAKE) test-common PYTEST_XDIST_N=auto
@if [ -f .prek/.enabled ]; then uv run python -m tools.prek --record test_common_p; fi

# ----------------------------------------------------------------------
# Local load tests
@@ -432,3 +457,34 @@ test-e2e-dashboard-headed: ## Runs dashboard e2e tests with visible browser

create-test-pipelines: ## Creates test pipelines for manual dashboard testing
uv run python tests/workspace/helpers/dashboard/example_pipelines.py

PREK_VERSION ?= 0.4.2

prek: ## Run pre-push gate now (same as the git hook)
uv run python -m tools.prek

prek-dry: ## Show what the pre-push gate would run (no make, no state update)
uv run python -m tools.prek --dry-run

install-prepush-hooks: ## Install prek pre-push hook (fails if another pre-push hook exists)
@if [ -f .git/hooks/pre-push ] && ! grep -Fq 'File generated by prek' .git/hooks/pre-push 2>/dev/null; then \
echo "Error: .git/hooks/pre-push already exists."; \
echo "prek is not compatible with an existing pre-push hook."; \
echo "Remove or relocate your hook, then run make install-prepush-hooks again."; \
exit 1; \
fi
uv tool install prek@$(PREK_VERSION)
prek install --hook-type pre-push --config .prek/prek.toml -f
@touch .prek/.enabled

uninstall-prepush-hooks: ## Remove prek pre-push hook (no-op if none; fails if hook is not from prek)
Review comment (Collaborator):

uv run prek uninstall

@if [ ! -f .git/hooks/pre-push ]; then \
echo "No pre-push hook to remove."; \
elif ! grep -Fq 'File generated by prek' .git/hooks/pre-push 2>/dev/null; then \
echo "Error: .git/hooks/pre-push exists but was not installed by make install-prepush-hooks."; \
echo "make uninstall-prepush-hooks will not remove it."; \
exit 1; \
else \
prek uninstall --hook-type pre-push --config .prek/prek.toml; \
fi
@rm -f .prek/.enabled
9 changes: 8 additions & 1 deletion docs/Makefile
@@ -1,4 +1,5 @@
.DEFAULT_GOAL := help
.PHONY: lint lint-parallel lint-core lint-embedded-snippets lint-notebooks lint-website

# Add " ## description" after any target name to include it in `make help` output.
# Example: my-target: ## Does something useful
@@ -8,7 +9,13 @@ help: ## Shows this help message
dev: ## Prepares development environment for docs tooling
uv sync

lint: lint-core lint-embedded-snippets lint-notebooks lint-website ## Runs all linters
LINT_TARGETS := lint-core lint-embedded-snippets lint-notebooks lint-website

lint: $(LINT_TARGETS) ## Runs all linters
@:

lint-parallel: ## Runs all linters in parallel (used by make fl)
$(MAKE) -j $(words $(LINT_TARGETS)) lint

lint-website: ## Lints the docusaurus website JS/TS sources with Biome
cd website && npm run lint