Add JSON Schema for contract YAML to enable editor autocompletion #2

Description

@cschanhniem

Summary

DCVPG uses a custom YAML contract format that is central to the developer experience. However, there is no JSON Schema file that editors (VS Code, JetBrains) can use for autocompletion, inline validation, and documentation tooltips while authoring contracts.

Suggested change

Create a dcvpg-contract-schema.json file that mirrors the full contract YAML format. The schema could be distributed in one of three ways:

  1. Bundled in the pip package: users reference it locally via a modeline comment, # yaml-language-server: $schema=./dcvpg-contract-schema.json
  2. Published to SchemaStore: auto-discovered by editors that read the SchemaStore catalog (for example VS Code with the YAML extension) when editing a file named *contract*.yaml or *contract*.yml
  3. Both: ship it in the package and also submit it to SchemaStore

What the schema should cover

# yaml-language-server: $schema=./dcvpg-contract-schema.json
contract:
  name: orders_raw
  version: "1.2"
  description: "..."
  owner_team: data-engineering
  source_owner: backend-team
  pipeline_tags: [crm, revenue]
  source_connection: postgres_main
  source_table: orders
  row_count_min: 1000
  row_count_max: 5000000
  sla_freshness_hours: 6
  schema:
    - field: id
      type: integer
      nullable: false
      unique: true
      # ...type-specific constraints like min, max, allowed_values, format
  custom_rules:
    - rule: no_weekend_orders.NoWeekendOrders
      params:
        date_field: created_at
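
A rough sketch of how the keys in the example above might map to JSON Schema, written as a Python dict so it can be dumped to dcvpg-contract-schema.json. The key names and the four field types come from this issue; the `required` lists, the numeric bounds, and the choice of draft-07 are assumptions, and type-specific constraints (min, max, allowed_values, format) are left out.

```python
import json

# Field types named in this issue; the authoritative list lives in the DCVPG models.
FIELD_TYPES = ["integer", "float", "string", "timestamp"]

# Hypothetical schema fragment. Every "required" list below is an assumption.
CONTRACT_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "DCVPG contract (sketch)",
    "type": "object",
    "required": ["contract"],
    "properties": {
        "contract": {
            "type": "object",
            "required": ["name", "version", "schema"],  # assumed
            "properties": {
                "name": {"type": "string"},
                "version": {"type": "string"},
                "description": {"type": "string"},
                "owner_team": {"type": "string"},
                "source_owner": {"type": "string"},
                "pipeline_tags": {"type": "array", "items": {"type": "string"}},
                "source_connection": {"type": "string"},
                "source_table": {"type": "string"},
                "row_count_min": {"type": "integer", "minimum": 0},
                "row_count_max": {"type": "integer", "minimum": 0},
                "sla_freshness_hours": {"type": "number", "minimum": 0},
                "schema": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": ["field", "type"],  # assumed
                        "properties": {
                            "field": {"type": "string"},
                            "type": {"enum": FIELD_TYPES},
                            "nullable": {"type": "boolean"},
                            "unique": {"type": "boolean"},
                        },
                    },
                },
                "custom_rules": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": ["rule"],  # assumed
                        "properties": {
                            "rule": {"type": "string"},
                            "params": {"type": "object"},
                        },
                    },
                },
            },
        }
    },
}


def render_schema() -> str:
    """Serialize the sketch as the JSON text an editor would load."""
    return json.dumps(CONTRACT_SCHEMA, indent=2)
```

The `enum` on `type` is what would drive autocompletion of field types, and the `boolean` entries are what would let an editor flag `nullable: "no"` inline.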

Why this matters

  1. Developer experience: writing contracts by hand is the primary authoring path. Autocompletion for field types (integer, float, string, timestamp) and boolean flags (nullable, unique), plus inline checks on structure (which keys are required vs optional), would reduce errors.
  2. Onboarding: new users can discover the contract format directly in their editor without flipping between the CLI and docs.
  3. CI validation: the JSON Schema could double as a fast validation pass before running the full engine (quick structural checks vs expensive data validation).
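
To make the CI idea in point 3 concrete, here is a minimal, dependency-free sketch of a structural pre-check over an already-parsed contract. It is hand-rolled for illustration only; in practice a JSON Schema validator library run against the schema file would replace it. The function name and the set of required keys are hypothetical.

```python
from typing import Any, List

REQUIRED_CONTRACT_KEYS = ("name", "version", "schema")  # assumed, not confirmed
KNOWN_TYPES = {"integer", "float", "string", "timestamp"}  # types named in this issue


def structural_errors(doc: Any) -> List[str]:
    """Return human-readable structural problems; an empty list means OK."""
    contract = doc.get("contract") if isinstance(doc, dict) else None
    if not isinstance(contract, dict):
        return ["top-level 'contract' mapping is missing"]

    errors: List[str] = []
    for key in REQUIRED_CONTRACT_KEYS:
        if key not in contract:
            errors.append(f"contract.{key} is required")

    fields = contract.get("schema", [])
    if not isinstance(fields, list):
        return errors + ["contract.schema must be a list"]

    for i, field in enumerate(fields):
        where = f"contract.schema[{i}]"
        if not isinstance(field, dict):
            errors.append(f"{where} must be a mapping")
            continue
        if "field" not in field:
            errors.append(f"{where}.field is required")
        if field.get("type") not in KNOWN_TYPES:
            errors.append(f"{where}.type must be one of {sorted(KNOWN_TYPES)}")
        for flag in ("nullable", "unique"):
            if flag in field and not isinstance(field[flag], bool):
                errors.append(f"{where}.{flag} must be true or false")
    return errors
```

A CI job could run a check like this on every contract file and fail fast on a non-empty result, before any source connection is opened.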

Implementation notes

  • The schema can be auto-derived from the Pydantic models that already define the contract format
  • Ship it at dcvpg/schemas/dcvpg-contract-schema.json and add a CLI command dcvpg schema to output it
  • Submit to https://schemastore.org for auto-discovery
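
For the CLI part, a minimal sketch of what a `dcvpg schema` subcommand could look like. It uses argparse only to stay self-contained; the real CLI may be built on a different framework. The `--indent` flag and the placeholder `SCHEMA` dict are hypothetical: an actual command would load dcvpg/schemas/dcvpg-contract-schema.json from the installed package, and if the contract models are Pydantic v2, `model_json_schema()` on the top-level model could generate that file.

```python
import argparse
import json
import sys
from typing import List, Optional

# Placeholder only. A real command would read the bundled schema file instead.
SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "DCVPG contract (placeholder)",
    "type": "object",
}


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(prog="dcvpg")
    sub = parser.add_subparsers(dest="command", required=True)

    schema_cmd = sub.add_parser("schema", help="print the contract JSON Schema")
    # Hypothetical convenience flag, not part of the proposal.
    schema_cmd.add_argument("--indent", type=int, default=2)

    args = parser.parse_args(argv)
    if args.command == "schema":
        json.dump(SCHEMA, sys.stdout, indent=args.indent)
        sys.stdout.write("\n")
    return 0
```

With something like this in place, `dcvpg schema > dcvpg-contract-schema.json` would give users a local copy to point the `# yaml-language-server: $schema=...` comment at.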

Happy to contribute the initial schema if this direction is approved.
