MCP Server for DQX #1252

Open
souravg-db2 wants to merge 57 commits into main from dqx/mcp

Conversation

@souravg-db2 souravg-db2 commented Jun 15, 2026

Contributor

Changes

Adds an MCP (Model Context Protocol) server for DQX, exposing DQX's data-quality
capabilities as tools that any MCP-compatible AI agent (Claude, Genie Code, Cursor, Mosaic AI)
can discover and orchestrate. It runs as a Databricks App with on-behalf-of (OBO)
authentication, so all data access is governed by the calling user's Unity Catalog permissions.

Architecture

  • Databricks App + FastMCP server, exposed over Streamable HTTP. The server itself carries
    no PySpark/DQX dependency.
  • OBO governance: for any data access the server creates a temporary definer's-rights view
    over the source table using the user's forwarded token (X-Forwarded-Access-Token), so the
    service principal reads data as the user, never directly.
  • Async job pattern: long-running operations are submitted to a pre-deployed runner notebook
    job and return a run_id immediately; the client polls get_run_result. get_run_status does
    a single non-blocking poll (the client drives the cadence), which matches MCP's canonical
    long-running-tool model and avoids holding the HTTP connection open or saturating the worker pool.
  • Genie Code compatibility: stateless_http + json_response + CORS preflight scoped to
    Databricks domains.
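
The submit-then-poll contract described above could be driven from a client roughly as follows. This is a minimal sketch, not the PR's client code: `call_tool` is a hypothetical callable standing in for whatever MCP client invokes the tools, and the `run_id` key and the `running` / `completed` / `failed` status values are assumptions about the response shape.

```python
import time
from typing import Any, Callable

# Hypothetical signature: call_tool(tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict], dict]

# Assumed terminal status values; anything else is treated as "still running".
TERMINAL_STATUSES = {"completed", "failed"}


def submit_and_poll(
    call_tool: ToolCaller,
    tool: str,
    args: dict,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> dict:
    """Submit a long-running tool, then poll get_run_result until it finishes."""
    # The submit call returns immediately with a run identifier.
    run_id = call_tool(tool, args)["run_id"]
    deadline = time.monotonic() + timeout_s
    while True:
        # Each poll is a single non-blocking check; the client owns the cadence.
        state = call_tool("get_run_result", {"run_id": run_id})
        if state.get("status") in TERMINAL_STATUSES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Because the server holds no connection open between polls, the interval and timeout are purely client-side choices.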

Tools

| Tool | Purpose |
| --- | --- |
| get_workflow | Recommended sequence of tool calls (call first) |
| get_table_schema | Table columns/types (direct, OBO) |
| profile_table | Profile data → summary stats + profiles |
| generate_rules | Generate checks from profiler output |
| generate_rules_from_contract | Generate checks from an ODCS data contract (deterministic) |
| validate_checks | Validate check definitions |
| run_checks | Execute checks → counts + per-rule summary + failing sample |
| apply_checks_and_save_to_table | Operationalized run: write valid/quarantine rows to Delta tables |
| save_checks / load_checks | Persist / retrieve a rule set (table, UC volume, or workspace file) |
| list_available_checks | Discover all built-in DQX check functions |
| get_run_result | Poll an async job for status/results |

Temp-view lifecycle (stateless, restart/replica-safe)

  • The runner job drops its own input view in a finally block (it runs as the SP, which owns the
    temp schema; see setup), so cleanup happens in the guaranteed job execution regardless of
    whether/where the user polls.
  • A throttled TTL sweeper (timestamped v_<epoch>_<uuid> view names) reaps any orphans whose
    job never ran. No per-request server state is kept, so app restarts / multiple replicas don't
    leak views or lose context.
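
The timestamped naming scheme makes the sweeper's decision a pure function of the view name and the current time. A minimal sketch of that idea follows; the exact UUID formatting (32 hex characters, no dashes), the helper names, and the TTL value are assumptions, and listing or dropping views in the workspace is left out.

```python
import re
import time
import uuid

# Assumed concrete shape of v_<epoch>_<uuid>: epoch seconds, then 32 hex chars.
VIEW_NAME_RE = re.compile(r"^v_(\d+)_([0-9a-f]{32})$")


def make_view_name(now=None):
    """Build a temp-view name that carries its own creation time."""
    epoch = int(time.time() if now is None else now)
    return f"v_{epoch}_{uuid.uuid4().hex}"


def expired_views(view_names, ttl_s, now=None):
    """Return the names whose embedded timestamp is older than ttl_s seconds."""
    now = time.time() if now is None else now
    expired = []
    for name in view_names:
        match = VIEW_NAME_RE.match(name)
        if match is None:
            continue  # not a view created by this scheme, so never touch it
        if now - int(match.group(1)) > ttl_s:
            expired.append(name)
    return expired
```

Since the age is encoded in the name, any replica can sweep without shared state.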

Security & governance

  • Data-access boundary enforced by OBO definer's-rights views; the app SP owns only the scratch
    tmp schema (object lifecycle), not the underlying data.
  • CORS restricted to Databricks workspace/app domains with credentials; SQL identifiers validated
    and backtick-quoted; log values sanitized (CWE-117); catalog name sourced from a Databricks secret.
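
The identifier validation and log sanitization mentioned above might look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the identifier rule (ASCII letters, digits, underscore) is stricter than what Unity Catalog accepts, and the three-part name requirement is illustrative.

```python
import re

# Conservative identifier rule (an assumption; Unity Catalog allows more).
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# ASCII control characters, including CR and LF, that enable log forging (CWE-117).
_CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


def quote_table_name(fqn: str) -> str:
    """Validate a catalog.schema.table name and return it backtick-quoted."""
    parts = fqn.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.table, got {fqn!r}")
    for part in parts:
        if not _IDENT_RE.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


def sanitize_for_log(value) -> str:
    """Replace control characters so a value cannot inject extra log lines."""
    return _CONTROL_RE.sub(" ", str(value))
```

Validating before quoting means a backtick can never appear inside a quoted part, so no escaping logic is needed.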

Deployment

  • Databricks Asset Bundle (databricks.yml): runner job, one-time setup job (UC grants + schema
    ownership), and the app. requirements.txt (the App's runtime manifest) and pyproject.toml are
    kept in sync; the runner installs databricks-labs-dqx[datacontract].
  • New make targets and fmt/CI wiring for the sub-project.

Linked issues

Resolves #1045

Tests

  • added unit tests

Layered MCP-server test suite (62 passing, deterministic, no workspace needed for CI):

  • Handler unit tests (test_tools.py)
  • Protocol tests via FastMCP's in-memory Client over the real MCP protocol (test_mcp_protocol.py)
  • HTTP integration over the real ASGI app — health, initialize/tools-list, and OBO token
    propagation (test_app_http.py); CORS policy (test_cors.py)
  • Agent-in-the-loop integration (gated/skipped without a deployed server + workspace LLM
    endpoint): a tool-calling model is handed the tool schemas and an instruction, and we assert it
    discovers and invokes the right tools (test_integration_agent.py)

Documentation and Demos

  • added/updated docs (docs/dqx/docs/guide/dqx_mcp_server.mdx)

This description was written by Isaac.

codecov Bot commented Jun 15, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.42%. Comparing base (c98da6c) to head (6cfb1ef).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1252      +/-   ##
==========================================
- Coverage   92.53%   92.42%   -0.11%     
==========================================
  Files         102      102              
  Lines       10075    10075              
==========================================
- Hits         9323     9312      -11     
- Misses        752      763      +11     
| Flag | Coverage Δ |
| --- | --- |
| anomaly | 54.37% <ø> (ø) |
| anomaly-serverless | 54.38% <ø> (ø) |
| integration | 49.38% <ø> (+1.09%) ⬆️ |
| integration-serverless | 50.54% <ø> (-0.09%) ⬇️ |
| unit | 57.02% <ø> (ø) |


github-actions Bot commented Jun 15, 2026

Contributor

✅ 774/774 passed, 49 skipped, 5h55m37s total

Running from acceptance #4973

@mwojtyczka added the under-review label Jun 15, 2026
@mwojtyczka self-requested a review June 15, 2026 17:30
@mwojtyczka changed the title from "Dqx/mcp" to "MCP Server for DQX" Jun 15, 2026
github-actions Bot commented Jun 15, 2026

Contributor

✅ 194/194 passed, 2 skipped, 6h59m0s total

Running from anomaly #1087

Comment thread mcp-server/databricks.yml Outdated

@mwojtyczka mwojtyczka left a comment

Contributor

Review: MCP Server for DQX

Strong PR — the OBO + definer's-rights-view + SP-job architecture is the right pattern, and it neatly avoids needing a Jobs scope at the user auth level. Nice touches: the 4.5 MB notebook.exit() guard, pure-ASGI OBO middleware, idempotent setup grants, and a fixed OPERATIONS dispatch (no arbitrary code exec).

A few things to address before merge — inline comments below. Two summary-level points:

Rebase on main (stale base). The root pyproject.toml mypy-comment change reintroduces a reference to apx dev check, but PR #1223 (already merged) removed apx and replaced it with first-party scripts (bun run tsc -b + basedpyright). The branch is behind main; please rebase so this drift is resolved.

Add an integration test with Genie (see inline on the tests). app.py is explicitly tuned for "Genie Code compatibility" (stateless_http, json_response, CORS preflight) — but nothing exercises that path. An end-to-end test that drives the MCP server through Databricks Genie (Genie Code as the MCP client) over the deployed app would protect exactly the behaviour these settings exist for.

Priorities: the rebase is a blocker; CORS and the in-memory run-state are important follow-ups.

Comment thread mcp-server/server/utils.py Outdated
Comment thread mcp-server/server/tools.py Outdated
Comment thread mcp-server/server/app.py
Comment thread mcp-server/server/utils.py Outdated
Comment thread mcp-server/server/tools.py Outdated
Comment thread mcp-server/server/main.py Outdated
Comment thread mcp-server/pyproject.toml Outdated
Comment thread mcp-server/databricks.yml Outdated

mock_job.assert_called_once_with("list_available_checks", {})

def test_get_workflow_returns_steps(self):

Contributor

Add an integration test with Genie. The server is explicitly built for "Genie Code compatibility" (stateless_http/json_response/CORS preflight in app.py), but no test exercises that end-to-end path. An integration test that registers the deployed MCP app with Databricks Genie and drives the documented workflow (get_table_schema → profile_table → generate_rules → run_checks, polling get_run_result) would protect the OBO header propagation and the async submit/poll contract that these settings exist for. Unit coverage of the server layer here is otherwise good.

Contributor

Will evaluate the feasibility of this. I doubt we have any APIs for Genie Code yet; the UCode integration just uses API Gateway: https://github.com/databricks/ucode

}

@mcp_server.tool
def generate_rules(profiles: list[dict], criticality: str = "error"):

@mwojtyczka mwojtyczka Jun 16, 2026

Contributor

It would be good to provide an option to generate rules based on user input as well; both are supported today in DQX Core. I would keep a single method and make the user input optional.

Contributor

Good shout. Let's think about this, @souravg-db2. There are two approaches:

  1. We already have a purpose-built DQX Agent with DQX-specific instructions and validation; do we wire that in as a tool?
  2. The calling agent itself analyses the user's need and leverages only the granular tools we provide via this MCP. My concern is that the output could differ wildly depending on the quality of the calling agent.

Contributor

Hi @mwojtyczka @souravg-db2, I tried this out, and I am leaning towards keeping the MCP granular and deterministic, with composable primitives. That seems to be what Anthropic advises as well: https://www.anthropic.com/engineering/writing-tools-for-agents

@mwojtyczka mwojtyczka Jun 26, 2026

Contributor

Sounds good, we can make a follow-up for this. I still think this would be a very important way for people to interact with the agent, and it would make it more interactive.

@mwojtyczka mwojtyczka left a comment

Contributor

Going in the right direction; I left some comments to address.

vb-dbrks added 8 commits June 24, 2026 18:55
Adds two test layers above the existing handler unit tests, per the testing-pyramid
research:

- test_mcp_protocol.py: drives the server through FastMCP's in-memory Client over the
  REAL MCP protocol (capability negotiation, tool registration/schemas, call_tool
  dispatch) with the workspace boundary faked. Asserts all 12 tools are exposed with
  schemas and that representative tools behave correctly through the protocol — coverage
  the direct-handler tests could not provide.
- test_app_http.py: integration tests over the real ASGI app (combined_app) via Starlette
  TestClient — health route, MCP initialize/tools-list over Streamable HTTP, and OBO token
  propagation (header -> OBOAuthMiddleware -> contextvar -> get_obo_client), including
  rejection when the forwarded token is absent. Closes the "add a Genie integration test"
  review request. CORS remains covered by test_cors.py.

Full mcp-server suite: 62 passed. No new test dependencies (starlette/fastmcp only).

Co-authored-by: Isaac
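
The propagation chain named in the commit above (header -> OBOAuthMiddleware -> contextvar -> get_obo_client) can be illustrated with a pure-ASGI middleware. This is a sketch under assumptions, not the PR's implementation: `get_obo_token` is a hypothetical stand-in for get_obo_client, and rejecting a missing token in the accessor rather than in the middleware is a guess at where the check lives.

```python
import contextvars

# Holds the caller's forwarded token for the duration of one request.
obo_token_var = contextvars.ContextVar("obo_token", default=None)


class OBOAuthMiddleware:
    """Pure ASGI middleware: copies X-Forwarded-Access-Token into a contextvar."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        # ASGI header names arrive as lowercase bytes.
        headers = dict(scope.get("headers") or [])
        raw = headers.get(b"x-forwarded-access-token")
        reset_token = obo_token_var.set(raw.decode("latin-1") if raw else None)
        try:
            await self.app(scope, receive, send)
        finally:
            # Always clear request state so nothing leaks into a later request.
            obo_token_var.reset(reset_token)


def get_obo_token() -> str:
    """Hypothetical accessor: fail closed when no user token was forwarded."""
    token = obo_token_var.get()
    if not token:
        raise PermissionError("missing X-Forwarded-Access-Token")
    return token
```

Handlers never receive the token as an argument; they read it from the context, which keeps tool signatures free of auth plumbing.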
Adds an LLM-in-the-loop integration test (MCPEval-style) that connects a real MCP
client to the deployed server, hands a tool-calling Databricks model-serving endpoint
the server's tool schemas plus a natural-language instruction, and asserts the model
DISCOVERS and INVOKES the right tools — verifying the tools are usable by an arbitrary
agent, not just our own code.

Follows the DQX integration patterns: gated/skipped unless a deployed server + workspace
LLM endpoint are configured (DQX_MCP_SERVER_URL / DATABRICKS_HOST / DATABRICKS_TOKEN),
with a reachability probe-and-skip mirroring the anomaly ai_query_endpoint fixture, and
structural/trajectory assertions (tool was called, sensible final answer) rather than
exact-match on non-deterministic LLM output. Registers the 'integration' pytest marker.

Verified live against the deployed mcp-dqx-vb: the model autonomously called
get_table_schema and answered with the real columns. Skips cleanly in unit CI
(62 passed, 1 skipped).

Co-authored-by: Isaac
The CI fmt job runs `make fmt` (black ., ruff --fix, mypy, pylint) then
`git diff --exit-code`. black traverses into mcp-server/ (only [tool.mypy]
excludes it, not [tool.black]), and three recently-added files were not
black-formatted. Reformat them via black so the fmt gate passes — no lint
suppressions or config exclusions added.

Verified: `make fmt` now leaves the tree clean (black 353 unchanged, ruff
passed, mypy ok, pylint 10.00/10); mcp-server suite still 62 passed, 1 skipped.

Co-authored-by: Isaac
…gration test

Previously the agent-in-the-loop test pointed at the shared, pre-existing
samples.nyctaxi.trips and created no resources — so it did not follow DQX's
scaffold-and-teardown convention. Add a `dq_test_table` yield-fixture that creates
its own isolated schema+table via the Databricks SDK, yields the FQN, and drops the
table on teardown (runs even if the test fails) — replicating the pytester
factory(create, delete) guarantee. The mcp-server test env lacks the root project's
pytester fixtures, so the equivalent is built with the SDK.

The test now asserts the agent reports the columns of OUR scaffolded table, removing
the dependency on an external sample table. Gated on DQX_MCP_TEST_CATALOG in addition
to the server/host/token env. Verified live against the field-eng workspace: table
created, agent read it back, table dropped (no residue); make fmt clean; suite still
62 passed / 1 skipped without env.

Co-authored-by: Isaac
The runner job previously installed databricks-labs-dqx[datacontract] from PyPI,
so it ran a published release rather than the source in this repo — drift between
what's tested/deployed and what the PR actually changes.

Mirror the app/ bundle: build the DQX library wheel from the repo root
(`uv build ../ --wheel --out-dir ./.build`) as a bundle artifact, sync .build, and
point the runner job's serverless environment at `./.build/databricks_labs_dqx-*.whl`
(+ datacontract-cli for the [datacontract] extra, since pip extras can't be applied
to a wheel path). This matches app/databricks.yml's task-runner job, which installs
`./.build/databricks_labs_dqx-*.whl` the same way. The MCP server process needs no
change — it has no DQX dependency; only the runner job does.

Verified on the field-eng workspace: deploy built and uploaded the wheel; the runner
job env resolved to the uploaded artifact wheel (not the PyPI package); a runner job
ran successfully on it (list_available_checks returned 76 functions).

Co-authored-by: Isaac
The MCP integration test needs a deployed Databricks App, which the library's
acceptance/anomaly harness (ephemeral workspace + pytest via the sandbox/acceptance
action) does not provision. Add a dedicated workflow that deploys an isolated copy of
the MCP bundle to a persistent CI workspace, runs the test suite against it, then tears
it down.

- databricks.yml: parameterize resource names/catalog/secret-scope via bundle variables
  (name_prefix, catalog_name, config_secret_scope) with defaults preserving current
  behaviour, so CI can deploy an isolated copy via --var without forking the file.
- scripts/ci_deploy.sh: deploy bundle (builds the in-repo DQX wheel) + run setup +
  start/deploy the app, emit the app URL. Verified end-to-end against a live workspace.
- scripts/ci_destroy.sh: best-effort teardown (bundle destroy + drop CI secret scope).
- .github/workflows/mcp-integration.yml: not-a-fork gate -> deploy -> make mcp-test
  (runs the live agent integration test) -> teardown. Requires a maintainer to set the
  CI workspace secrets/vars documented in the workflow header (DQX_MCP_CI_HOST/TOKEN,
  DQX_MCP_CI_CATALOG) and to pin the setup-cli action SHA.

Co-authored-by: Isaac
The integration workflow's job now skips cleanly unless vars.DQX_MCP_CI_CATALOG is
set, so an unconfigured CI environment no longer red-blocks every mcp-server PR
(the deterministic MCP tests still run in push.yml's `mcp` job). ci_deploy.sh now
fails fast naming the exact secret/var to set (DQX_MCP_CI_HOST / DQX_MCP_CI_TOKEN /
DQX_MCP_CI_CATALOG) when run without them.

Co-authored-by: Isaac
…space/auth)

Rework the MCP integration test to follow the anomaly suite instead of a bespoke
CI workspace + token:

- tests/integration_mcp/: a suite driven by the shared acceptance harness, reusing
  its workspace, authentication, fixed TEST_CATALOG, make_schema (create+teardown),
  and Model Serving endpoint (DQX_AI_QUERY_TEST_ENDPOINT, default
  databricks-claude-sonnet-4-5). A session-scoped `deployed_mcp` fixture deploys ONE
  isolated MCP app for the whole session and tears it down (cleanup runs on deploy
  failure too); the agent test creates its table via SQL (no Spark dependency).
- .github/workflows/mcp.yml: mirrors anomaly.yml exactly (not-a-fork gate, tool env,
  setup-env, prebuild-wheel, databrickslabs/sandbox/acceptance with vault/OIDC,
  codegen_path tests/integration_mcp/.codegen.json) — no MCP-specific secrets.
- Remove the bespoke mcp-integration.yml (new workspace + DQX_MCP_CI_* secrets) and the
  now-moved mcp-server/tests/test_integration_agent.py + its pytest marker.
- ci_deploy.sh: wait for compute_status==ACTIVE before `apps deploy` (a fresh app's
  app_status can't be RUNNING pre-deploy), and use if-blocks for the GITHUB_OUTPUT/ENV
  writes so the script exits 0 on success when run locally.

Verified end-to-end against the field-eng workspace (dqx-mcp): deploy -> agent loop
(model selected get_table_schema and reported the real columns) -> teardown, 1 passed.
Catalog is overridable via DQX_MCP_TEST_CATALOG for local runs.

Co-authored-by: Isaac
github-actions Bot commented Jun 24, 2026

Contributor

❌ 1 skipped, 4m1s total

Running from mcp #14

vb-dbrks added 9 commits June 24, 2026 23:34
…RICKS_TOKEN

The MCP integration suite shelled out to ci_deploy.sh and made raw HTTP calls
(serving endpoint + the app's /mcp host) that read DATABRICKS_HOST/TOKEN straight
from the environment. The acceptance harness only authenticates the SDK client and
never exports a token, so CI failed with "DATABRICKS_TOKEN is not set".

Add a session-scoped workspace_auth fixture that mints (host, bearer) via the SDK's
config.authenticate(), working under the acceptance action's OIDC auth, a local
profile, or env vars alike. Thread host/token into the deploy script env and the
test's HTTP helpers so no DATABRICKS_TOKEN needs to be set anywhere.

Verified end-to-end against a live workspace: deploy -> app start/deploy -> agent
discovers and invokes get_table_schema -> teardown, no leaked resources.

Co-authored-by: Isaac
ci_deploy.sh shells out to `databricks bundle deploy` (the MCP suite deploys a
Databricks Asset Bundle), but the acceptance run environment has no CLI, so the
deploy failed with `databricks: command not found` and the test skipped.

Add the same `databricks/setup-cli` step the e2e DAB tests in acceptance.yml use,
pinned to the repo's existing v0.297.2 SHA, before the acceptance action. The CLI
lands on PATH (GITHUB_PATH) for the acceptance action's pytest subprocess.

Co-authored-by: Isaac
The deploy fixture only put stderr[-1000:] in the skip reason, which captured a
benign trailing warning while the real bundle-deploy error (on stdout) was lost —
and the hardcoded "workspace may not support Databricks Apps" guess was misleading.

Write the full stdout+stderr of ci_deploy.sh to the log and include both tails in
a pytest.fail message (the channel the acceptance harness actually prints), so the
real failure is visible instead of a truncated warning.

Co-authored-by: Isaac
`bundle deploy` runs `terraform init`, which pulls the databricks/databricks
provider from registry.terraform.io. That fetch is intermittently reset ("EOF")
on CI runners — the repo's own e2e demo-bundle test hits the same flakiness and
survives only because the acceptance harness retries flaky tests. Our deploy runs
in a session-scoped fixture, which the harness does not retry, so a single blip
fails the whole suite.

Wrap the deploy in a bounded retry (3 attempts, 15s backoff) and add --force-lock
(as the e2e demo does) so a retried attempt isn't blocked by a stale deployment lock.

Co-authored-by: Isaac
Direct access to registry.terraform.io is blocked org-wide (supply-chain
hardening), so `bundle deploy` -> `terraform init` couldn't fetch the
databricks/databricks provider and failed with EOF on every attempt.

Write a ~/.terraformrc that redirects registry service discovery to the
sanctioned proxy (terraform-proxy.cloud.databricks.com) before the acceptance
run. Downloads still go direct; no provider pinning needed. See
go/terraform-registry-access.

Co-authored-by: Isaac
The MCP server guide listed only 8 tools and described behavior that no longer
matches the code. Updates:

- Add the 4 operationalization tools (generate_rules_from_contract, save_checks,
  load_checks, apply_checks_and_save_to_table) to the architecture diagram and
  tools table.
- Correct the "get_run_result polls internally for up to 90s" claim (3 places):
  it now does a single non-blocking poll; clients pace their own polling.
- Correct the temp-view cleanup description: the runner job drops its own view
  (sweeper backstop), not get_run_result on retrieval.
- Document that save_checks / apply_checks_and_save_to_table write as the app SP
  and need write grants on the target schema (with the GRANT statements).
- Add an example-prompt table (one per tool), a realistic end-to-end prompt, and
  a "try it with sample data" snippet. Explain load_checks is for cross-session
  reuse (the server is stateless; loaded checks live in the agent's context).

Co-authored-by: Isaac
…make target

Adds tests/integration_mcp/test_mcp_tools.py — protocol-level tests that call
every DQX MCP tool against the deployed app and assert concrete results
(discovery, get_workflow, list_available_checks, validate_checks valid+invalid,
get_table_schema, profile->generate_rules, generate_rules_from_contract,
run_checks, save->load round-trip, apply_checks_and_save_to_table). Uses explicit
error-level rules against a seeded dirty 'customers' table so failure counts are
exact (4 invalid / 6 valid of 10).

conftest:
- McpClient helper that drives tools and resolves the async submit->poll pattern.
- Session-scoped demo_data fixture: seeds the dirty table, uploads the ODCS
  contract to a UC volume the app SP can read, and grants the app SP write access
  on its throwaway schema (save_checks / apply_checks_and_save_to_table run as the
  SP). All dropped on teardown.
- deployed_mcp now exposes the app SP (ci_deploy.sh emits it) so demo_data can grant.
- test_mcp_agent reuses the shared MCP helpers (DRY).

Adds `make mcp-integration` to run the suite locally/CI like `make anomaly`
(serial — the suite deploys one shared app per session).

Verified end-to-end against a live workspace: 11 passed, no leaked resources.

Co-authored-by: Isaac
`databricks bundle deploy` ran `terraform init`, which downloads the
databricks/databricks provider from registry.terraform.io — blocked org-wide
(supply-chain hardening) and unreachable from the CI runners, so the MCP
integration deploy failed there.

Switch the bundle to the direct deployment engine (engine: direct), which deploys
without Terraform and never contacts the registry. Requires Databricks CLI
0.279.0+ (the pinned setup-cli v0.297.2 and local v1.1.0 both satisfy this).

Remove the now-unnecessary workarounds: the ~/.terraformrc registry-proxy step in
mcp.yml and the retry/--force-lock around bundle deploy in ci_deploy.sh (both only
existed to survive the flaky provider fetch). Bump the docs CLI prerequisite.

Verified end-to-end against a live workspace: 11 passed, no leaked resources.

Co-authored-by: Isaac
The acceptance harness runs each test in its own pytest session (xdist workers +
per-test retries), so the session-scoped deploy fixture was re-run per test —
deploying the app many times and colliding on the shared bundle state path
(/Workspace/.../.bundle/mcp-dqx/dev). That caused the "MCP app deploy failed"
errors; the one deploy that survived then 401'd because /mcp was hit before the
app finished starting.

Following the repo's e2e bundle test (test_run_dqx_demo_asset_bundle), replace the
two fixture-shared files with ONE self-contained test that owns its deploy +
teardown via context managers (deploy_mcp_app, seed_demo_data) and walks all 12
tools inline. One test == one deploy: no collisions, no N-deploys.

Also:
- wait_until_ready() retries tools/list through the app's startup window (the 401
  was a not-yet-serving race; verified a service principal can reach /mcp AND drive
  OBO tools, so it was never an SP/auth problem).
- Mint the bearer fresh per HTTP call and per deploy/teardown (get_token via
  config.authenticate()) so the single ~15-min test never reuses an expired token
  (a static bearer expired mid-run locally).

Co-authored-by: Isaac
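
The readiness wait and per-call token minting described in the commit above could be combined along these lines. This is only a sketch: `probe` and `get_token` are hypothetical callables (for example, a tools/list request and a wrapper around the SDK's authenticate call), and the 180s default merely echoes the figure mentioned elsewhere in this PR.

```python
import time


def wait_until_ready(
    probe,
    get_token,
    timeout_s=180.0,
    interval_s=5.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Retry probe(token) until it succeeds, minting a fresh token per attempt."""
    deadline = clock() + timeout_s
    while True:
        try:
            # A new bearer on every attempt avoids reusing an expired token
            # during a long startup window.
            return probe(get_token())
        except Exception as exc:  # e.g. a 401 while the app is still starting
            last_error = exc
        if clock() >= deadline:
            raise TimeoutError(f"app not ready after {timeout_s}s") from last_error
        sleep(interval_s)
```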
result["table_name"] = params["table_name"]
except Exception as e:
logger.error(f"Operation '{operation}' failed: {e}", exc_info=True)
result = {"error": f"{type(e).__name__}: {str(e)}"}

Contributor

Operation failures are reported as completed, not failed. When an operation raises, this except sets result = {"error": ...} and the notebook still exits normally, so the job's result_state is SUCCESS. get_run_status then returns {"status": "completed", "result": {"error": ...}} — so an agent driving the workflow treats a failed op as success and proceeds (e.g. feeds the error dict into generate_rules). The documented status: failed + error contract is never hit for in-operation errors. Consider raise (or re-raise after logging) so the run fails, or surface a distinct failure status the client can branch on.
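
One way to act on the second option in this comment (a distinct failure status the client can branch on) is sketched below. The envelope shape and the `run_operation` name are assumptions for illustration; the alternative of re-raising so the job's result_state reflects the failure is equally valid and is not shown.

```python
import logging

logger = logging.getLogger("runner")


def run_operation(operations, operation, params):
    """Dispatch to a fixed table of operations and report failures distinctly."""
    handler = operations.get(operation)
    if handler is None:
        return {"status": "failed", "error": f"unknown operation: {operation}"}
    try:
        return {"status": "completed", "result": handler(params)}
    except Exception as exc:
        logger.error("Operation %r failed", operation, exc_info=True)
        # The error lives at the top level, never inside "result", so a caller
        # cannot mistake it for successful output.
        return {"status": "failed", "error": f"{type(exc).__name__}: {exc}"}
```

With this shape, an agent that checks only `status` stops before feeding an error payload into the next tool.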

columns = [
{"name": row["col_name"], "type": row["data_type"], "comment": row.get("comment", "")}
for row in rows
if row.get("col_name") and not row["col_name"].startswith("#")

Contributor

Partitioned tables yield duplicate columns. This filter only drops rows whose col_name starts with #. For a partitioned table, DESCRIBE TABLE re-lists each partition column under a # Partition Information section without a # prefix, so every partition column ends up in columns twice. Either stop reading at the first # Partition Information/blank separator row, or de-duplicate by col_name.
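
Both suggestions in this comment can be applied together. The sketch below assumes the rows are dicts with `col_name`, `data_type`, and `comment` keys, as in the snippet under review, and that all regular columns are listed before the first blank or `#`-prefixed row; `parse_describe_rows` is a hypothetical name.

```python
def parse_describe_rows(rows):
    """Extract columns from DESCRIBE TABLE rows without partition duplicates."""
    columns = []
    seen = set()
    for row in rows:
        name = (row.get("col_name") or "").strip()
        # A blank row or a "# ..." header marks the end of the column list
        # (for example "# Partition Information"), so stop reading there.
        if not name or name.startswith("#"):
            break
        # Belt and braces: skip any name that has already been emitted.
        if name in seen:
            continue
        seen.add(name)
        columns.append(
            {
                "name": name,
                "type": row.get("data_type", ""),
                "comment": row.get("comment") or "",
            }
        )
    return columns
```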

Comment thread mcp-server/server/app.py
allow_origin_regex=CORS_ALLOWED_ORIGIN_REGEX,
allow_methods=["*"],
allow_headers=["*"],
allow_credentials=True,

@mwojtyczka mwojtyczka Jun 26, 2026

Contributor

Credentialed CORS over a multi-tenant wildcard. The earlier allow_origins=["*"] fix is good, but the regex matches any *.databricksapps.com host, and that domain is multi-tenant — every customer's app lives under it. With allow_credentials=True + allow_methods=["*"] + allow_headers=["*"], any other tenant's app page is an allowed credentialed origin and could replay JSON-RPC tool calls if the Apps proxy forwards the OBO token cross-origin.

Is it possible to narrow the regex to the specific workspace/app host(s) rather than the whole databricksapps.com domain? (and confirm whether the proxy actually attaches the token on cross-origin requests).
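
If the allowed hosts are known at deploy time, the narrowing asked for here could be done by building the origin pattern from an explicit host list instead of a domain-wide wildcard. The sketch below is illustrative only: `build_origin_regex` and the example hostname are hypothetical, and how the app would learn its own host is not covered.

```python
import re


def build_origin_regex(allowed_hosts):
    """Return an anchored regex matching only https origins for the given hosts."""
    if not allowed_hosts:
        raise ValueError("at least one allowed host is required")
    # Escape each host so dots match literally; anchor both ends so a suffix
    # such as ".evil.example" cannot slip through.
    alternatives = "|".join(re.escape(host) for host in allowed_hosts)
    return rf"^https://(?:{alternatives})$"


# Hypothetical app host; a real deployment would supply its own.
CORS_ALLOWED_ORIGIN_REGEX = build_origin_regex(
    ["mcp-dqx-1234567890.aws.databricksapps.com"]
)
```

The resulting string can be passed wherever the current `CORS_ALLOWED_ORIGIN_REGEX` is used, so other tenants under the shared domain no longer match.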

Comment thread mcp-server/notebooks/runner.py
vb-dbrks added 8 commits June 26, 2026 18:56
Bring the MCP server guide to parity with the DQX Studio docs and document the
client setup the app needs (Genie Code / Cursor / Claude Code):

- Architecture as a Mermaid sequence diagram (OBO auth + async submit/poll flow).
- Prerequisites: workspace-features + deploying-user permissions tables.
- Deploy: "what gets deployed", a one-command `make mcp-deploy` path (+ step-by-step),
  and the corrected app-start step (`bundle run mcp-dqx`).
- Configuration: full bundle-variable reference + monitoring commands.
- Connect an MCP client: Genie Code (recommended) with screenshots, deferring to the
  Databricks "Add MCP servers to Genie Code" doc; Cursor / Claude Code point to their
  own MCP docs. OAuth-only auth note (PATs are rejected by the OBO front-door).
- Approving tool actions: Ask-first vs Auto-approve, why Auto-approve suits the
  get_run_result polling (links the Databricks "Approve tool actions" doc).
- Troubleshooting: enabling/refreshing tools, and "repeated get_run_result calls are
  expected", both with screenshots.
- Upgrade / uninstall sections.
- Update branding Mosaic AI Agent -> Agent Bricks.

Adds `make mcp-deploy PROFILE=<p>` (bundle deploy -> setup job -> deploy app), mirroring
`make app-deploy`.

Co-authored-by: Isaac
Grant-on-write: save_checks and apply_checks_and_save_to_table run as the app
service principal, so tables they create are SP-owned and invisible to the caller.
The runner now grants the calling user (X-Forwarded-Email) ALL PRIVILEGES + MANAGE
on the tables it creates, best-effort, so the user can read, modify, and drop the
outputs outside the MCP. Ownership stays with the SP so repeat overwrite runs keep
working and don't depend on a revocable grant. Tools forward the caller email via
get_user_email(); the runner validates principal + table identifiers before the
GRANT and echoes access_granted_to / granted_tables in the result.

Logging: a single idempotent configure_logging() (used by both main.py and app.py)
replaces the split config, honors DQX_MCP_LOG_LEVEL, and quiets noisy third-party
loggers. A RequestContextFilter stamps every line with a per-request correlation id
and the calling user, so a request can be traced across handler, SQL, and job submit
in the Databricks Apps log stream. OBOAuthMiddleware now sets the request id (honoring
inbound X-Request-Id), logs one line per request with status + duration, and resets
all request context on the way out. Warehouse-selection logs dropped to DEBUG.
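
The logging setup described above might be structured as follows. Only the names configure_logging, RequestContextFilter, and DQX_MCP_LOG_LEVEL come from this commit message; the context variables, the log format, and the idempotency check are assumptions made for this sketch.

```python
import contextvars
import logging
import os

# Per-request values, set by the middleware and read by the filter (assumed names).
request_id_var = contextvars.ContextVar("request_id", default="-")
user_var = contextvars.ContextVar("user", default="-")


class RequestContextFilter(logging.Filter):
    """Stamp every record with the current request id and calling user."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        record.user = user_var.get()
        return True


def configure_logging(level=None):
    """Install one handler with the context filter; safe to call repeatedly."""
    root = logging.getLogger()
    already = any(
        isinstance(f, RequestContextFilter) for h in root.handlers for f in h.filters
    )
    if already:
        return
    handler = logging.StreamHandler()
    # Attach the filter to the handler so records from every logger are stamped.
    handler.addFilter(RequestContextFilter())
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s %(levelname)s [%(request_id)s] [%(user)s] %(name)s: %(message)s"
        )
    )
    root.addHandler(handler)
    root.setLevel((level or os.environ.get("DQX_MCP_LOG_LEVEL", "INFO")).upper())
```

Because the filter reads context variables at emit time, handler code needs no extra arguments to get correlated log lines.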

Docs and unit/integration tests updated to cover both.

Co-authored-by: Isaac
Add a brand-red (#ff3621, matching the DQX logo) Beta badge next to the DQX MCP
Server title, defined as a reusable .badge--beta Infima variant. The badge links
to a new Reference > Feature lifecycle page that explains the status badges DQX
uses (Experimental, Beta, GA, Deprecated) and what each means for stability and
support. Explicit frontmatter title/sidebar_label keep the sidebar, breadcrumb,
and browser tab free of the heading's inline JSX.

Co-authored-by: Isaac
…root

The bundle artifact build runs `uv build ../` from inside mcp-server/, so the
Makefile's global relative UV_BUILD_CONSTRAINT (.build-constraints.txt) resolved
against the wrong directory and the wheel build failed with "File not found".
Pin it to the absolute repo-root path for the mcp-deploy target.

Co-authored-by: Isaac
`make fmt` runs update_github_urls.py, which pins databrickslabs/dqx GitHub
links in docs to the latest release tag. Apply it to the new Feature lifecycle
page so the formatting check produces no diff.

Co-authored-by: Isaac
The mcp workflow already wrote a .coveragerc and set COVERAGE_FILE but never
combined or published the result. Add the combine + codecov-action steps the
other integration jobs use (flag: mcp), so MCP integration/e2e coverage is
reported like the rest of the suite.

Co-authored-by: Isaac
The Databricks Apps front-door only accepts OAuth tokens (user U2M or SP
M2M-with-secret), not the acceptance harness's metadata-service token — so the
deployed app's /mcp endpoint returns 401 in CI even though the identity owns the
app (a token-type limitation, not permissions). Other suites are token-free only
because they never leave the SDK/control plane.

Add an app_auth fixture that mints an OAuth M2M bearer from DQX_MCP_APP_CLIENT_ID
/ DQX_MCP_APP_CLIENT_SECRET (an SP with CAN_USE on the app, provisioned in the
acceptance vault) for the /mcp calls, while deploy + Model Serving keep using the
control-plane bearer. MCP-specific env names avoid changing other suites' SDK
auth. Falls back to ambient auth (OAuth profile) locally. Until the secret is
provisioned, wait_until_ready turns the 401 into a clear skip rather than a
180s hard failure.

Co-authored-by: Isaac
Labels

under-review This PR is currently being reviewed by one of DQX maintainers.

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Implement MCP Server for DQX

3 participants