Skip to content

Fix/entity name validation v2#27886

Open
jaya6400 wants to merge 13 commits intoopen-metadata:mainfrom
jaya6400:fix/entity-name-validation-v2
Open

Fix/entity name validation v2#27886
jaya6400 wants to merge 13 commits intoopen-metadata:mainfrom
jaya6400:fix/entity-name-validation-v2

Conversation

@jaya6400
Copy link
Copy Markdown

@jaya6400 jaya6400 commented May 4, 2026

Describe your changes:

Fixes #23268

I updated the regex patterns in JSON schema definitions to block reserved
FQN separator characters (::, >, ") and newlines (\n) from being
used in entity and column names.

The backend was accepting entity names with special characters like double
quotes and newlines, which caused issues in FQN construction since these
are reserved separator characters in OpenMetadata's FQN system.

Changes made:

  • openmetadata-spec/src/main/resources/json/schema/type/basic.json — updated pattern for entityName and testCaseEntityName
  • openmetadata-spec/src/main/resources/json/schema/entity/data/table.json — updated pattern for columnName

Pattern changed from ^((?!::).)*$ to ^((?!::)[^><\"|\\x00-\\x1f])*$

Unit Test: #27521 (comment)

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes #23268: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Migration scripts are not needed for this change as it only tightens input validation at the schema level. Existing data is not affected — only new entity creation requests will be validated against the updated pattern.


Summary by Gitar

  • CSV Import logic updates:
    • Improved EntityCsv by fixing row count tracking logic and introducing rowEntityType to handle entity types dynamically.
    • Added validation for custom properties using effectiveEntityType to ensure schema consistency during CSV import.
  • Infrastructure enhancements:
    • Updated TestSuiteBootstrap to support Redis caching in integration tests and increased memory buffers (sort_buffer_size, work_mem) for database containers to resolve sort-related failures.
  • UI Constants and cleanup:
    • Removed unused CUSTOM_PROPERTY_NAME_REGEX from regex.constants.ts and added new block regex patterns for sections and admonitions.

This will update automatically on new commits.

jaya6400 added 11 commits May 4, 2026 17:24
…names

Fixes open-metadata#23268

Updated regex patterns in JSON schema definitions to block reserved
FQN separator characters (::, >, ") and newlines in entity names.

Files changed:
- openmetadata-spec/.../type/basic.json: entityName, testCaseEntityName
- openmetadata-spec/.../entity/data/table.json: columnName
Updated pattern to use \x00-\x1f range to block all control characters
including \r, \n, \t, \0 in addition to reserved FQN characters.
This also restores the \r restriction that was present in the original
dot metacharacter but lost in the character class rewrite.
…hema fields and UI

- Added < and | to blocked characters (consistent with entityLink pattern)
- Updated searchIndexFieldName pattern in searchIndex.json
- Added pattern validation to tableConstraint columns and partitionColumnDetails in table.json
- Updated UI ENTITY_NAME_REGEX to match backend validation
- Added unit tests for new blocked characters in regex.constants.test.ts

Addresses review feedback on open-metadata#27521
Addresses inconsistency flagged in code review - columnName definition
was missing < and | from blocked characters unlike other patterns.
…erved characters

Added Java integration tests to verify that entity names containing
reserved FQN characters (::, >, <, ", |) and control characters are
rejected by the backend validation.

Files changed:
- TableResourceIT.java: added column name validation tests
- TopicResourceIT.java: added schema field name validation tests
- PipelineResourceIT.java: added task name validation tests
- SearchIndexResourceIT.java: added search index field name validation tests
- TestSuiteBootstrap.java: updated bootstrap for test suite
- pipeline.json: updated pattern for task names
- schema.json: updated pattern for schema field names
…less formatting

- Added minLength: 1 to pipeline task name to prevent empty strings
- Applied mvn spotless:apply formatting fixes
@jaya6400 jaya6400 requested a review from a team as a code owner May 4, 2026 12:09
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@jaya6400
Copy link
Copy Markdown
Author

jaya6400 commented May 4, 2026

Hi @aniketkatkar97,

Created this new PR to replace #27521, the previous one had unintended files from a mvn spotless:apply run. Cherry-picked only the relevant commits for a clean diff (15 files).

Updates since last review:

  • Fixed ValidatorUtilTest.java assertion to match updated regex pattern
  • Fixed EntityCsv.java contains() string to match actual validator output
  • Added missing | pipe character test case in regex.constants.test.ts

Could a maintainer add the safe to test label? @harshach @sonika-shah @Rohit0301 @anuj-kumary

Thanks!

@aniketkatkar97 aniketkatkar97 added the safe to test Add this label to run secure Github workflows on PRs label May 5, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 5, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Tightens entity and column name validation regex to reject reserved FQN characters and control codes, resolving the missing pipe character test coverage. Consider whether silently skipping empty extension values in CSV imports might mask user configuration errors.

💡 Edge Case: Silently skipping empty extension values may hide user errors

📄 openmetadata-service/src/main/java/org/openmetadata/csv/EntityCsv.java:613-615

The change at line 613-615 converts an empty extension value from an error (previously reported via deferredFailure) to a silent skip (continue). While this may be intentional to allow partial extension definitions, it means users who accidentally leave a value blank in their CSV will receive no feedback that data was dropped. Consider at minimum logging a warning, or documenting this as intentional lenient behavior.

Suggested fix
if (value.isEmpty()) {
  LOG.debug("Skipping empty extension value for key '{}' in record {}", key, csvRecord.getRecordNumber());
  continue;
}
✅ 1 resolved
Quality: Missing test coverage for pipe | character rejection

📄 openmetadata-ui/src/main/resources/ui/src/constants/regex.constants.test.ts:162-170
The updated regex blocks | (pipe) along with >, <, ", and control characters, but the frontend test suite (regex.constants.test.ts) does not include a test case for |. All other newly-blocked characters have corresponding test assertions. Adding this ensures the full intent of the regex is validated.

Also worth noting: according to the ANTLR grammar in Fqn.g4, the actual FQN-reserved characters are . and " (with \ as escape). Characters >, <, and | are not FQN separators per the grammar. The PR description calls them "reserved FQN separator characters" — consider updating documentation/comments to clarify these are blocked for safety/consistency rather than being FQN separators.

🤖 Prompt for agents
Code Review: Tightens entity and column name validation regex to reject reserved FQN characters and control codes, resolving the missing pipe character test coverage. Consider whether silently skipping empty extension values in CSV imports might mask user configuration errors.

1. 💡 Edge Case: Silently skipping empty extension values may hide user errors
   Files: openmetadata-service/src/main/java/org/openmetadata/csv/EntityCsv.java:613-615

   The change at line 613-615 converts an empty extension value from an error (previously reported via `deferredFailure`) to a silent skip (`continue`). While this may be intentional to allow partial extension definitions, it means users who accidentally leave a value blank in their CSV will receive no feedback that data was dropped. Consider at minimum logging a warning, or documenting this as intentional lenient behavior.

   Suggested fix:
   if (value.isEmpty()) {
     LOG.debug("Skipping empty extension value for key '{}' in record {}", key, csvRecord.getRecordNumber());
     continue;
   }

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 62%
62.42% (62939/100817) 42.75% (33918/79324) 45.75% (10035/21934)

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 5, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Backend: Validate if the entity names are valid in terms of Name, Fqn

2 participants