fix(csharp): expand SEA catalog wildcards in GetSchemas client-side (PECO-3035) [SEA] #472

Open
eric-wang-1990 wants to merge 4 commits into main from fix/csharp/PECO-3035-sea-catalog-wildcard

@eric-wang-1990

Collaborator

What's Changed

JDBC metadata APIs (e.g. getSchemas(catalog, schemaPattern)) accept SQL LIKE-style patterns in the catalog argument (%, _, \_), and the Thrift driver honors those patterns server-side. The SEA path wrapped the catalog in backticks as a literal identifier, so:

  • catalog="%": Thrift returns all 7 catalogs; SEA looked up a catalog literally named %, found nothing, and returned empty.
  • catalog="comp%": Thrift prefix-matches; SEA returned empty.
  • catalog="my_table" (unescaped _): Thrift treats _ as a single-character wildcard and matches; SEA returned empty.

This PR adds client-side wildcard expansion in StatementExecutionConnection.ListSchemasAsync:

| Catalog argument | New behaviour |
| --- | --- |
| null or pure % / * | SHOW SCHEMAS IN ALL CATALOGS (one round-trip) |
| Pattern with unescaped % or _ (e.g. comp%) | SHOW CATALOGS LIKE '<pat>' to enumerate matches, then per-catalog SHOW SCHEMAS IN `<cat>`, aggregated together |
| Literal name | Unchanged (single SHOW SCHEMAS IN `<catalog>`) |

Wildcard detection lives in MetadataCommandBase and honors backslash escapes (so \_ and \% stay literal, matching JDBC escape clause semantics).
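
The escape rule described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the code in MetadataCommandBase: the class name WildcardSketch is made up, and details such as how an empty string or a trailing backslash is treated are assumptions.

```csharp
using System;

// Illustrative sketch only; the real helpers live in MetadataCommandBase
// and may differ in edge cases.
public static class WildcardSketch
{
    // Returns true if the pattern contains a % or _ that is not
    // preceded by a backslash escape.
    public static bool ContainsUnescapedWildcard(string pattern)
    {
        if (string.IsNullOrEmpty(pattern)) return false;
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '\\')
            {
                i++; // skip the escaped character, whatever it is
                continue;
            }
            if (c == '%' || c == '_') return true;
        }
        return false;
    }

    // Returns true for arguments that match every catalog:
    // null, a lone "%", or a lone "*" (assumed to be the full set).
    public static bool IsMatchAnything(string pattern) =>
        pattern == null || pattern == "%" || pattern == "*";
}
```

Under these assumptions, "comp%" and "my_table" count as wildcard patterns, while "my\_table" is a literal name because the underscore is escaped.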

Both call sites now share the helper:

  • IGetObjectsDataProvider.GetSchemasAsync used by connection.GetObjects(...)
  • StatementExecutionStatement.GetSchemasAsync used by the direct GetSchemas metadata command

Why

PECO-3035 (D7 in the SEA rollout). Catalog wildcard handling was the only SEA-specific gap in GetSchemas; once this lands, SEA and Thrift agree on catalog wildcard semantics for GetSchemas / GetObjects(depth=DbSchemas). GetCatalogs already used SHOW CATALOGS LIKE correctly; this PR brings GetSchemas to parity.

Red -> Green proof

E2E test SEA_GetObjects_CatalogPercentWildcard_ReturnsSchemasFromAllCatalogs in csharp/test/E2E/StatementExecution/SeaMetadataE2ETests.cs issues GetObjects(depth=DbSchemas, catalogPattern="%") against the live pecotesting warehouse and asserts the schema count matches the catalogPattern=null baseline.

Before fix:

Failed AdbcDrivers.Databricks.Tests.E2E.StatementExecution.SeaMetadataE2ETests.SEA_GetObjects_CatalogPercentWildcard_ReturnsSchemasFromAllCatalogs [5 s]
  Error Message:
   Assert.Equal() Failure: Values differ
Expected: 3938
Actual:   0

After fix:

Passed!  - Failed:     0, Passed:     1, Skipped:     0, Total:     1, Duration: 8 s

Full class regression check (all 19 SeaMetadataE2ETests):

Passed!  - Failed:     0, Passed:    19, Skipped:     0, Total:    19, Duration: 33 s

Files touched

  • csharp/src/StatementExecution/MetadataCommands/MetadataCommandBase.cs (added ContainsUnescapedWildcard + IsMatchAnything helpers)
  • csharp/src/StatementExecution/StatementExecutionConnection.cs (added ListSchemasAsync + private ExecuteShowSchemasAsync helper; IGetObjectsDataProvider.GetSchemasAsync now delegates)
  • csharp/src/StatementExecution/StatementExecutionStatement.cs (refactored GetSchemasAsync to delegate to connection.ListSchemasAsync)
  • csharp/test/E2E/StatementExecution/SeaMetadataE2ETests.cs (new test + helper)

Manual verification

  • dotnet build succeeds for netstandard2.0 and net8.0
  • New E2E test passes against pecotesting
  • Full SeaMetadataE2ETests class passes (no regressions)
  • Adjacent unit tests pass (ShowCommandTests, StatementExecutionMetadataObjectNotFoundTests)
  • pre-commit run --all-files (skipped locally: pre-commit environment couldn't install setuptools; will run in CI)

PECO-3035

…PECO-3035) [SEA]

JDBC metadata APIs (getSchemas(catalog, schemaPattern)) accept SQL LIKE-style
patterns in the catalog argument (%, _, \_), and Thrift honors those patterns
server-side. The SEA path wrapped the catalog in backticks as a literal
identifier, so SHOW SCHEMAS IN `%` returned empty rather than expanding to
all catalogs.

This change adds client-side wildcard expansion in StatementExecutionConnection:
  - null or pure '%' / '*' -> SHOW SCHEMAS IN ALL CATALOGS
  - pattern with unescaped %/_ -> SHOW CATALOGS LIKE '<pat>' then per-catalog
    SHOW SCHEMAS IN `<cat>` aggregated together
  - literal name -> unchanged (single SHOW SCHEMAS IN `<catalog>`)

Both the IGetObjectsDataProvider.GetSchemasAsync path (used by GetObjects)
and StatementExecutionStatement.GetSchemasAsync (direct GetSchemas metadata
command) now share a single ListSchemasAsync helper. Wildcard detection
honors backslash escapes (\_ stays literal).

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes a SEA-specific gap in JDBC/Thrift-compatible metadata behavior by expanding SQL LIKE-style wildcards in the catalog argument for GetSchemas/GetObjects(depth=DbSchemas) on the client side (PECO-3035), bringing SEA results in line with Thrift.

Changes:

  • Added wildcard detection helpers (ContainsUnescapedWildcard, IsMatchAnything) to recognize unescaped %/_ (with backslash escapes).
  • Implemented StatementExecutionConnection.ListSchemasAsync to resolve catalog wildcards client-side (fast-path to IN ALL CATALOGS, otherwise SHOW CATALOGS LIKE ... + per-catalog SHOW SCHEMAS).
  • Refactored SEA GetSchemas call sites to share the new logic and added an E2E regression test validating % matches the null baseline.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| csharp/src/StatementExecution/MetadataCommands/MetadataCommandBase.cs | Adds wildcard-detection helpers for JDBC-like patterns with backslash escaping. |
| csharp/src/StatementExecution/StatementExecutionConnection.cs | Introduces ListSchemasAsync + ExecuteShowSchemasAsync and routes GetSchemasAsync through it. |
| csharp/src/StatementExecution/StatementExecutionStatement.cs | Refactors GetSchemasAsync to delegate to connection.ListSchemasAsync. |
| csharp/test/E2E/StatementExecution/SeaMetadataE2ETests.cs | Adds E2E regression coverage for catalogPattern="%" in GetObjects(depth=DbSchemas). |

Comment thread csharp/src/StatementExecution/StatementExecutionConnection.cs
Comment thread csharp/src/StatementExecution/StatementExecutionConnection.cs Outdated
eric-wang-1990 and others added 2 commits May 29, 2026 00:20
…ecoding

Addresses Copilot review comments on PR #472:

- Add Theory-driven unit tests for MetadataCommandBase.ContainsUnescapedWildcard
  (21 cases) and IsMatchAnything (7 cases). Covers the backslash-escape edge cases
  that drive client-side wildcard expansion in ListSchemasAsync.

- Decode SHOW CATALOGS results via TryGetColumn<StringArray>(batch, "catalog")
  instead of positional batch.Column(0), matching the convention in
  GetCatalogsAsync / GetTablesAsync / etc. Apply the same to ExecuteShowSchemasAsync
  for the SHOW SCHEMAS result columns (databaseName, catalog).

- Drop the "in deterministic order" claim from the ListSchemasAsync XML doc —
  the implementation preserves backend order without sorting.

Co-authored-by: Isaac
Comment thread csharp/src/StatementExecution/StatementExecutionConnection.cs Outdated
Replaces the per-catalog fan-out (SHOW CATALOGS LIKE 'pat' + N x SHOW
SCHEMAS IN `cat`) with a single SHOW SCHEMAS IN ALL CATALOGS + client-side
filtering by the catalog column. The previous design issued 1+N round-trips
and ran N separate queries on the backend; the new one is always 1
round-trip when the catalog pattern is wildcarded.

The literal-catalog path is preserved (single SHOW SCHEMAS IN `cat`) so
users who pass a specific catalog name don't pay the cost of fetching
every catalog's schemas just to throw most of them away.
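
A rough sketch of the statement selection this commit describes, assuming a hypothetical SchemaQueryPlanSketch class. Unescaping a literal name and doubling backticks before quoting are assumptions made for illustration; the commit message does not say how the driver handles either.

```csharp
using System;
using System.Text;

// Illustrative sketch only; not the driver's ListSchemasAsync.
public static class SchemaQueryPlanSketch
{
    // Returns the single statement to run and whether the rows must
    // still be filtered by the catalog column on the client.
    public static (string Sql, bool FilterClientSide) Plan(string catalog)
    {
        bool matchAll = catalog == null || catalog == "%" || catalog == "*";
        if (matchAll)
            return ("SHOW SCHEMAS IN ALL CATALOGS", false);
        if (HasUnescapedWildcard(catalog))
            return ("SHOW SCHEMAS IN ALL CATALOGS", true);

        // Literal path: one targeted query, no fetch of every catalog.
        // Assumption: escapes are removed and backticks doubled.
        string name = Unescape(catalog).Replace("`", "``");
        return ("SHOW SCHEMAS IN `" + name + "`", false);
    }

    private static bool HasUnescapedWildcard(string p)
    {
        for (int i = 0; i < p.Length; i++)
        {
            if (p[i] == '\\') { i++; continue; } // skip escaped character
            if (p[i] == '%' || p[i] == '_') return true;
        }
        return false;
    }

    private static string Unescape(string p)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < p.Length; i++)
        {
            if (p[i] == '\\' && i + 1 < p.Length) i++; // drop the backslash
            sb.Append(p[i]);
        }
        return sb.ToString();
    }
}
```

Either way the caller issues exactly one statement, which is the round-trip property the commit is after.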

Adds MetadataCommandBase.JdbcLikeToRegex to compile a JDBC LIKE pattern
(% / _ / \% / \_ / \\) into an anchored .NET Regex, mirroring the same
escape semantics as ContainsUnescapedWildcard. Covered by 23 new
[Theory] cases in ShowCommandTests.
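
One way such a conversion could look, as a minimal sketch rather than the actual MetadataCommandBase.JdbcLikeToRegex. The LikePatternSketch class and the FilterByCatalog helper are hypothetical, the handling of a trailing lone backslash is an assumption, and matching here is case-sensitive because the commit message does not state the intended case behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Illustrative sketch only; the real helper may differ.
public static class LikePatternSketch
{
    // Compiles a JDBC LIKE pattern into a fully anchored .NET Regex:
    // % -> any run of characters, _ -> exactly one character,
    // \% \_ \\ -> the literal second character.
    public static Regex JdbcLikeToRegex(string pattern)
    {
        var sb = new StringBuilder("\\A");
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '\\' && i + 1 < pattern.Length)
                sb.Append(Regex.Escape(pattern[++i].ToString()));
            else if (c == '%')
                sb.Append(".*");
            else if (c == '_')
                sb.Append('.');
            else
                sb.Append(Regex.Escape(c.ToString())); // includes a trailing lone backslash
        }
        sb.Append("\\z");
        return new Regex(sb.ToString(),
            RegexOptions.CultureInvariant | RegexOptions.Singleline);
    }

    // Keeps only the rows whose catalog column matches the pattern.
    public static List<(string Catalog, string Schema)> FilterByCatalog(
        IEnumerable<(string Catalog, string Schema)> rows, string catalogPattern)
    {
        Regex rx = JdbcLikeToRegex(catalogPattern);
        return rows.Where(r => rx.IsMatch(r.Catalog)).ToList();
    }
}
```

Escaping every non-wildcard character with Regex.Escape keeps regex metacharacters in catalog names (such as a dot) from being interpreted, and the \A and \z anchors prevent partial matches.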

Co-authored-by: Isaac