
fix(sqlalchemy): do not emit CREATE SCHEMA on Oracle (#3939) #4001

Open
DRACULA1729 wants to merge 1 commit into dlt-hub:devel from DRACULA1729:fix/oracle-sqlalchemy-create-schema

Conversation

@DRACULA1729

Description

Loading to Oracle through the sqlalchemy destination fails on the first run with ORA-02420: missing schema authorization clause. dlt runs CREATE SCHEMA "<dataset>" when it cannot find the target dataset, and Oracle rejects that. In Oracle a schema is a database user, so there is no bare CREATE SCHEMA (it needs CREATE SCHEMA AUTHORIZATION <user> ...).

This adds dataset-lifecycle extension points to DialectCapabilities and routes the SQL client through them. The base implementations keep the old behavior for every other dialect, so nothing changes for Postgres/MySQL/etc.

OracleDialectCapabilities overrides them:

  • dataset_exists matches case-insensitively, since Oracle folds unquoted identifiers to upper case. Loading into an existing schema is now detected and no creation is attempted.
  • create_dataset no longer emits CREATE SCHEMA. If the schema is missing it raises a terminal error explaining that the schema (and the <dataset>_staging schema, for merge/replace) has to be created up front as a user with grants, instead of surfacing the cryptic ORA-02420.
  • drop_dataset drops the tables inside the schema instead of issuing DROP SCHEMA, because removing a schema on Oracle means DROP USER (a DBA privilege).
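
For readers who want to see the shape of the change without opening the diff, here is a minimal, self-contained sketch of the hook pattern described above. It is not the code in this PR: the class names, method signatures, `DatasetMissingError`, `ensure_dataset` and the `execute` callback are all hypothetical stand-ins, and the real `DialectCapabilities` methods may take different arguments.

```python
from typing import Callable, List, Sequence

# Hypothetical callback type: something that runs one SQL statement.
Execute = Callable[[str], None]


class DatasetMissingError(Exception):
    """Hypothetical stand-in for a terminal (non-retryable) error."""


class BaseCapabilities:
    """Default hooks, modelling the behavior kept for other dialects."""

    def dataset_exists(self, name: str, schema_names: Sequence[str]) -> bool:
        # Exact, case-sensitive match against the schemas the database reports.
        return name in schema_names

    def create_dataset(self, name: str, execute: Execute) -> None:
        execute(f'CREATE SCHEMA "{name}"')

    def drop_dataset(self, name: str, tables: Sequence[str], execute: Execute) -> None:
        execute(f'DROP SCHEMA "{name}"')


class OracleCapabilities(BaseCapabilities):
    """Oracle overrides: never create or drop the schema itself."""

    def dataset_exists(self, name: str, schema_names: Sequence[str]) -> bool:
        # Compare case-insensitively, since unquoted identifiers fold to upper case.
        wanted = name.upper()
        return any(s.upper() == wanted for s in schema_names)

    def create_dataset(self, name: str, execute: Execute) -> None:
        # Emit no SQL; fail with an actionable message instead of ORA-02420.
        raise DatasetMissingError(
            f"Schema '{name}' does not exist. On Oracle it must be created "
            f"up front as a user with grants (and '{name}_staging' for merge/replace)."
        )

    def drop_dataset(self, name: str, tables: Sequence[str], execute: Execute) -> None:
        # Drop each table and leave the schema (user) in place.
        # Identifier quoting and case handling are deliberately elided here.
        for table in tables:
            execute(f"DROP TABLE {name}.{table}")


def ensure_dataset(caps: BaseCapabilities, name: str,
                   schema_names: Sequence[str], execute: Execute) -> None:
    """Hypothetical client-side flow: check first, create only when missing."""
    if not caps.dataset_exists(name, schema_names):
        caps.create_dataset(name, execute)


emitted: List[str] = []
# Existing Oracle schema is matched despite the case difference: no SQL is emitted.
ensure_dataset(OracleCapabilities(), "my_data", ["SYS", "MY_DATA"], emitted.append)
# Base behavior: the missing schema is created with a plain CREATE SCHEMA.
ensure_dataset(BaseCapabilities(), "my_data", ["public"], emitted.append)
```

The point of routing through the capabilities object is that the SQL client itself needs no Oracle-specific branches; a dialect only overrides the hooks whose default behavior it cannot support.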

Related Issues

Additional Context

  • Follows the maintainer guidance in #3141 (Pipeline state tables are not created when using Oracle via SQLAlchemy as a destination): staging features are not hard-restricted; if the user pre-creates the <dataset>_staging schema, merge/replace still work.
  • Oracle is not in the sqlalchemy-destination CI matrix, so the new tests are unit-level (mock-based) in tests/load/sqlalchemy/test_sqlalchemy_dialect.py and run without a live Oracle.
  • Docs updated: the Oracle limitations section and the dialect-capabilities extension-points table in sqlalchemy.md.
  • Ran locally: the new dialect tests pass; ruff, mypy, flake8 and black are clean on the changed files.
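
To illustrate what "unit-level (mock-based)" can look like in practice, here is a small sketch of a test that exercises a case-insensitive schema lookup without a live Oracle. It is not taken from tests/load/sqlalchemy/test_sqlalchemy_dialect.py: `oracle_schema_exists` is a hypothetical helper, the schema names are made up, and the `MagicMock` merely stands in for a SQLAlchemy inspector exposing `get_schema_names()`.

```python
from unittest.mock import MagicMock


def oracle_schema_exists(inspector, dataset_name: str) -> bool:
    """Hypothetical helper: case-insensitive lookup over the reported schemas."""
    wanted = dataset_name.upper()
    return any(name.upper() == wanted for name in inspector.get_schema_names())


def test_existing_schema_is_detected() -> None:
    # The mock replaces the database round trip; the values are invented.
    inspector = MagicMock()
    inspector.get_schema_names.return_value = ["SYS", "MY_DATA"]
    assert oracle_schema_exists(inspector, "my_data")
    inspector.get_schema_names.assert_called_once_with()


def test_missing_schema_is_reported() -> None:
    inspector = MagicMock()
    inspector.get_schema_names.return_value = ["SYS"]
    assert not oracle_schema_exists(inspector, "my_data")


# Run directly when not using a test runner such as pytest.
test_existing_schema_is_detected()
test_missing_schema_is_reported()
```

Tests of this kind verify the decision logic only; they cannot confirm what a real Oracle server returns, which is why a live-database run would still be valuable if Oracle is added to the CI matrix.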

Oracle schemas are owned by database users and cannot be created with a
bare `CREATE SCHEMA` statement, so the sqlalchemy destination failed with
`ORA-02420: missing schema authorization clause` when initializing storage.

Add dataset-lifecycle extension points (`dataset_exists`, `create_dataset`,
`drop_dataset`) to `DialectCapabilities` and delegate to them from the SQL
client. The base implementations preserve the previous behavior for all
other dialects. `OracleDialectCapabilities` overrides them to:

- match schema existence case-insensitively (Oracle folds unquoted
  identifiers to upper case), so loading into an existing schema is
  detected and no creation is attempted
- skip `CREATE SCHEMA`; raise a clear terminal error when the target schema
  (or `<dataset>_staging`) does not exist, instead of the cryptic ORA-02420
- drop the tables within the schema instead of `DROP SCHEMA` (which would
  require `DROP USER`, a DBA privilege)

Add unit tests for the new hooks and document Oracle's existing-schema
requirement in the destination docs.
DRACULA1729 force-pushed the fix/oracle-sqlalchemy-create-schema branch from ef3ee3a to 31bcc80 on May 29, 2026 21:30
DRACULA1729 (Author) commented May 29, 2026

This is ready for review. The fork-gated workflows (the ones that need repo secrets) are stuck in the "requires reviewer approval" state, so they need a maintainer to approve the run before they execute.

@ivasio you dug into this exact Oracle case in #3141, so it's probably familiar territory. @rudolfix it extends the DialectCapabilities system from #3600 with dataset-lifecycle hooks. Happy to rebase or adjust anything.


Development

Successfully merging this pull request may close these issues.

Oracle 12c ORA-02420: missing schema authorization clause
