Summary
DCVPG uses a custom YAML contract format that is central to the developer experience. However, there is no JSON Schema file that editors (VS Code, JetBrains) can use for autocompletion, inline validation, and documentation tooltips while authoring contracts.
Suggested change
Create a dcvpg-contract-schema.json file that mirrors the full contract YAML format. The schema could be distributed in one of three ways:
- Bundled in the pip package: users reference it locally via # yaml-language-server: $schema=./dcvpg-contract-schema.json
- Published to SchemaStore: auto-discovered by VS Code when editing a file named *contract*.yaml or *contract*.yml
- Both: ship it in the package and also submit it to SchemaStore
What the schema should cover
    # yaml-language-server: $schema=./dcvpg-contract-schema.json
    contract:
      name: orders_raw
      version: "1.2"
      description: "..."
      owner_team: data-engineering
      source_owner: backend-team
      pipeline_tags: [crm, revenue]
      source_connection: postgres_main
      source_table: orders
      row_count_min: 1000
      row_count_max: 5000000
      sla_freshness_hours: 6
      schema:
        - field: id
          type: integer
          nullable: false
          unique: true
          # ...type-specific constraints like min, max, allowed_values, format
      custom_rules:
        - rule: no_weekend_orders.NoWeekendOrders
          params:
            date_field: created_at
Why this matters
- Developer experience: writing contracts by hand is the primary authoring path. Autocompletion for field types (integer, float, string, timestamp) and boolean flags (nullable, unique), plus inline checks on structure (which keys are required vs optional), would reduce errors.
- Onboarding: new users can discover the contract format directly in their editor without flipping between the CLI and docs.
- CI validation: the JSON Schema could double as a fast validation pass before running the full engine (quick structural checks vs expensive data validation).
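The CI validation idea can be sketched without any dependencies. The following is a minimal illustration, not DCVPG code: CONTRACT_SCHEMA is a hand-written fragment based on the example contract (which keys are required, and the nesting of schema under contract, are assumptions), and check is a hypothetical helper that understands only a small JSON Schema subset (type, enum, required, properties, items). A real CI step would more likely use an off-the-shelf JSON Schema validator against the shipped schema file.

```python
from typing import Any, Dict, List

# Illustrative fragment only; required keys and nesting are assumptions.
CONTRACT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["contract"],
    "properties": {
        "contract": {
            "type": "object",
            "required": ["name", "version", "schema"],
            "properties": {
                "name": {"type": "string"},
                "version": {"type": "string"},
                "row_count_min": {"type": "integer"},
                "schema": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": ["field", "type"],
                        "properties": {
                            "field": {"type": "string"},
                            "type": {"enum": ["integer", "float", "string", "timestamp"]},
                            "nullable": {"type": "boolean"},
                            "unique": {"type": "boolean"},
                        },
                    },
                },
            },
        }
    },
}

_TYPES = {"object": dict, "array": list, "string": str, "integer": int, "boolean": bool}


def check(value: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Check a parsed contract against a small JSON Schema subset; return error messages."""
    errors: List[str] = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python; JSON Schema treats them as distinct.
        if expected == "integer" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} is not one of {schema['enum']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check(value[key], sub, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(check(item, schema["items"], f"{path}[{i}]"))
    return errors


if __name__ == "__main__":
    # An unquoted version (parsed as a float) and a misspelled type are both caught.
    bad = {"contract": {"name": "orders_raw", "version": 1.2,
                        "schema": [{"field": "id", "type": "int"}]}}
    for message in check(bad, CONTRACT_SCHEMA):
        print(message)
```

A check like this touches only the contract file itself, so it can fail a pull request in milliseconds before any source connection is opened.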
Implementation notes
- The schema can be auto-derived from the Pydantic models that already define the contract format
- Ship it at dcvpg/schemas/dcvpg-contract-schema.json and add a CLI command dcvpg schema to output it
- Submit to https://schemastore.org for auto-discovery
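As a rough sketch of the auto-derivation note, assuming the models are Pydantic v2 classes (which provide model_json_schema()): FieldSpec, Contract, ContractFile and render_schema below are hypothetical stand-ins, not DCVPG's actual models or CLI wiring, and they cover only a few keys from the example contract.

```python
import json
from typing import List, Literal

from pydantic import BaseModel, Field


class FieldSpec(BaseModel):
    # Hypothetical stand-in for the real per-field model.
    field: str
    type: Literal["integer", "float", "string", "timestamp"]
    nullable: bool = True
    unique: bool = False


class Contract(BaseModel):
    # Hypothetical stand-in; only a few keys from the example are modelled.
    name: str
    version: str
    owner_team: str
    # "schema" shadows a BaseModel attribute, so it is exposed through an alias.
    columns: List[FieldSpec] = Field(alias="schema")


class ContractFile(BaseModel):
    # Top-level document: a single "contract" key (assumed from the example).
    contract: Contract


def render_schema() -> str:
    """Return the JSON Schema for a contract file as pretty-printed JSON."""
    schema = ContractFile.model_json_schema()
    schema["$schema"] = "https://json-schema.org/draft/2020-12/schema"
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    # Roughly what a `dcvpg schema` command could print.
    print(render_schema())
```

Generating the file from the models at build time would keep the shipped schema from drifting away from what the engine actually accepts.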
Happy to contribute the initial schema if this direction is approved.