Skip to content

fix: validate custom property name charset#27808

Open
sonika-shah wants to merge 15 commits intomainfrom
fix/custom-property-name-charset-validation
Open

fix: validate custom property name charset#27808
sonika-shah wants to merge 15 commits intomainfrom
fix/custom-property-name-charset-validation

Conversation

@sonika-shah
Copy link
Copy Markdown
Collaborator

@sonika-shah sonika-shah commented Apr 29, 2026

Fixes #27699

Summary

Tighten custom property name validation to block characters that empirically break downstream parsers. Adds a new customPropertyName schema definition, defense-in-depth Java validation in TypeRepository, a schema-vs-Java consistency test that prevents drift between the two, and integration tests covering allowed/blocked charsets, leading-character rules, length, and unbalanced brackets.

Why

Custom property names previously inherited the very permissive entityName pattern (^((?!::).)*$), which only blocks ::. A matrix run on 1.13 across 24 special chars × 17 property types × 4 operations identified five characters that break specific operations regardless of property type:

char breaks how
" CREATE HTTP 500 from PUT /metadata/types/{id}
: CSV import exporter writes key:value;key:value; importer splits at first :, treats prefix as the field name (Unknown custom field: <prefix>)
^ Search when the name is in searchSettings.searchFields, OpenSearch reads ^ as the boost separator in the field^boost grammar (x_content_parse_exception [bool] failed to parse field [should])
$ CSV import Java Matcher.replaceAll interprets $<letter> as a regex backreference (Illegal group reference) when the value contains certain characters
\ CSV import escape inconsistency in CSV serialization for entityReference / entityReferenceList types (export emits double-backslash, import un-escapes to single, key lookup fails)

All other tested characters (alphanumeric, _, -, ., /, &, %, #, @, !, ,, ;, =, |, ', +, ?, *, ~, `, space, (, ), <, >, [, ], {, }) round-trip cleanly across CREATE / PATCH / CSV (real, non-dry-run) / Search across all 17 property types, including multi-property CSV roundtrips that exercise the ; and , CSV-quoting layer.

Final charsets

  • Allow: A-Z a-z 0-9 _ - . / & % # @ ! , ; = | ' + ? * ~ space ( ) < > [ ] { }`
  • Block: " : ^ $ \

Changes

  • openmetadata-spec/.../type/basic.json — adds customPropertyName definition with regex ^[A-Za-z0-9][A-Za-z0-9 _\-.,;/&%#@!'(){}\[\]<>|=+?*~]*$`, max length 256.
  • openmetadata-spec/.../type/customProperty.jsonname now $refs customPropertyName instead of entityName.
  • openmetadata-service/.../jdbi3/TypeRepository.java — adds validateCustomPropertyName(String) called from validateProperty(CustomProperty) so the API returns 400 with a clear, human-readable error message even if schema validation is bypassed (e.g. internal callers that don't go through the @Valid-protected resource layer).
  • openmetadata-service/.../jdbi3/TypeRepositoryTest.java (new) — schema-vs-Java consistency tests that load basic.json at runtime and assert the pattern and maxLength match the TypeRepository constants. Drift in either direction fails CI with a message naming both files.

Why two layers + a sync test (instead of one)

Each layer earns its keep:

  • JSON Schema is the single source of truth visible to clients, generated SDKs, and the request-time @Valid Bean Validation flow.
  • Java validator gives a much better error message than the default @Pattern violation (which dumps the raw regex), and runs even on code paths that build CustomProperty programmatically without crossing the HTTP boundary.
  • Consistency test ensures the two cannot drift — touching one without the other fails CI immediately.

Backwards compatibility

Validator runs only on writes. Existing custom properties with names containing the now-blocked characters continue to read normally — only new creates / updates are rejected.

Test plan

Unit tests in openmetadata-service (TypeRepositoryTest)

  • customPropertyNamePatternMatchesSchema — schema regex matches Java Pattern
  • customPropertyNameMaxLengthMatchesSchema — schema maxLength matches Java constant
  • customPropertyNameDefinitionIsRequiredString — definition shape sanity check

Integration tests in openmetadata-integration-tests (TypeResourceIT)

  • test_customPropertyNameAllowedCharacters_succeeds — 31 names spanning every safe character (including +, ?, *, ~, `)
  • test_customPropertyNameDisallowedCharacters_fails — verifies ", :, ^, $, \ are rejected
  • test_customPropertyNameMustStartWithAlphanumeric_fails — leading char rule for _, -, ., , (, <
  • test_customPropertyNameTooLong_fails — > 256 character names rejected
  • test_customPropertyNameUnbalancedBrackets_succeeds(, ), <, >, [, ], {, } accepted unbalanced (no closure required)
  • Existing tests untouched

🤖 Generated with Claude Code


Summary by Gitar

  • Frontend validation updates:
    • Integrated CUSTOM_PROPERTY_NAME_REGEX into AddCustomProperty components and updated regex constants.
    • Updated UI SanitizedInput and MUITextField to optionally bypass sanitization for custom property inputs.
  • Localization updates:
    • Updated custom-property-name-validation error messages across all supported languages to reflect the new charset restrictions.
  • Playwright test updates:
    • Added extensive test suite for custom property name validation and updated existing entity tests to accommodate new naming requirements.

This will update automatically on new commits.

Copilot AI review requested due to automatic review settings April 29, 2026 07:02
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 29, 2026
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions Bot requested a review from a team as a code owner April 29, 2026 07:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens custom property name validation to prevent a small set of characters that break downstream parsers, by enforcing a stricter JSON Schema pattern plus server-side validation and integration tests.

Changes:

  • Added a customPropertyName definition in basic.json with a restricted character set and max length.
  • Updated customProperty.json to validate name against customPropertyName instead of entityName.
  • Added Java-side validation in TypeRepository and integration tests in TypeResourceIT for allowed/disallowed names, leading character, and length rules.

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 1 comment.

File Description
openmetadata-spec/src/main/resources/json/schema/type/customProperty.json Switches name validation to the new customPropertyName definition and updates the field description to document the new rules.
openmetadata-spec/src/main/resources/json/schema/type/basic.json Introduces the reusable customPropertyName schema definition (pattern + length).
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TypeRepository.java Adds defense-in-depth validation for custom property names during add/update operations.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TypeResourceIT.java Adds integration coverage for allowed/disallowed character sets, leading character rules, and max length.

Comment on lines +220 to +255
@@ -228,6 +234,26 @@ private void validateProperty(CustomProperty customProperty) {
}
}

private static void validateCustomPropertyName(String name) {
if (name == null || name.isEmpty()) {
throw new IllegalArgumentException("Custom property name must not be empty");
}
if (name.length() > CUSTOM_PROPERTY_NAME_MAX_LENGTH) {
throw new IllegalArgumentException(
String.format(
"Custom property name '%s' exceeds maximum length of %d characters",
name, CUSTOM_PROPERTY_NAME_MAX_LENGTH));
}
if (!CUSTOM_PROPERTY_NAME_PATTERN.matcher(name).matches()) {
throw new IllegalArgumentException(
String.format(
"Invalid custom property name '%s'. Name must start with an alphanumeric character "
+ "and may contain only: alphanumeric, _ - . / & %% # @ ! , ; = | ' space ( ) < > [ ] { }. "
+ "The following characters are not allowed: \" : ^ $",
name));
}
}
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateCustomPropertyName is only invoked from addCustomProperty() via validateProperty(CustomProperty). However, Type supports PATCH updates (TypeResource PATCH /{id} and PATCH /name/{fqn}), and the generic patch flow only calls TypeRepository.prepare() (which currently only validates property types via TypeRegistry.validateCustomProperties) before persisting changes. This means a JSON Patch that adds/updates customProperties can still persist names containing ", :, ^, $, etc., bypassing the new defense-in-depth validation.

To fully enforce the new constraint on all write paths, also validate custom property names when Type entities are prepared/updated (e.g., iterate over type.getCustomProperties() in prepare() and call validateCustomPropertyName, or validate in TypeUpdater.updateCustomProperties() before storing added properties).

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 62%
62.45% (62923/100756) 42.77% (33915/79291) 45.77% (10035/21924)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

🔴 Playwright Results — 2 failure(s), 12 flaky

✅ 3982 passed · ❌ 2 failed · 🟡 12 flaky · ⏭️ 98 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 298 0 1 4
🟡 Shard 2 750 0 4 8
🔴 Shard 3 775 2 2 19
✅ Shard 4 740 0 0 18
🟡 Shard 5 685 0 2 41
🟡 Shard 6 734 0 3 8

Genuine Failures (failed on all attempts)

Pages/CustomProperties.spec.ts › table-cp shows row count, scrollable container, no expand toggle (shard 3)
�[31mTest timeout of 180000ms exceeded.�[39m
Pages/CustomProperties.spec.ts › Create custom property and configure search for Dashboard (shard 3)
�[31mTest timeout of 180000ms exceeded.�[39m
🟡 12 flaky test(s) (passed on retry)
  • Pages/Roles.spec.ts › Roles page should work properly (shard 1, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
  • Features/ActivityAPI.spec.ts › Activity event is created when tags are added (shard 2, 1 retry)
  • Features/DataQuality/TestLibrary.spec.ts › should handle supported services field correctly (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2, 2 retries)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 2 retries)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/EntityDataSteward.spec.ts › Tier Add, Update and Remove (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should allow Data Consumer to edit description for pipeline (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for apiEndpoint -> mlModel (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 29, 2026 10:23
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 7 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TypeRepository.java:228

  • validateCustomPropertyName is only invoked from validateProperty(CustomProperty), which is used by the /types/{id} add/update-property endpoint. Other write paths (notably TypeUpdater.updateCustomProperties()storeCustomProperty() during PUT/PATCH of a Type) can still persist custom properties without any name validation, allowing clients to bypass this restriction via JSON Patch/PUT. Consider validating all updated.getCustomProperties() entries (e.g., call validateCustomPropertyName/validateProperty for added items, or enforce it in prepare(Type, boolean) for every custom property) so the rule cannot be bypassed.
  private void validateProperty(CustomProperty customProperty) {
    validateCustomPropertyName(customProperty.getName());
    switch (customProperty.getPropertyType().getName()) {
      case "enum" -> validateEnumConfig(customProperty.getCustomPropertyConfig());
      case "table-cp" -> validateTableTypeConfig(customProperty.getCustomPropertyConfig());
      case "date-cp" -> validateDateFormat(

"name": {
"description": "Name of the entity property. Note a property name must be unique for an entity. Property name must follow camelCase naming adopted by openMetadata - must start with lower case with no space, underscore, or dots.",
"$ref": "../type/basic.json#/definitions/entityName"
"description": "Name of the entity property. Must be unique for an entity. Allowed characters: alphanumeric, _ - . / & % # @ ! , ; = | ' + ? * ~ ` space ( ) < > [ ] { }. Must start with an alphanumeric character. Disallowed: \" : ^ $ \\.",
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name.description text ends with \\. which renders as \. (backslash + dot) and is confusing/inaccurate (dot is allowed by the pattern). If the intent is to list \ as a disallowed character, update the description to show a literal backslash without the trailing . and keep the allowed/disallowed lists consistent with basic.json#/definitions/customPropertyName.

Suggested change
"description": "Name of the entity property. Must be unique for an entity. Allowed characters: alphanumeric, _ - . / & % # @ ! , ; = | ' + ? * ~ ` space ( ) < > [ ] { }. Must start with an alphanumeric character. Disallowed: \" : ^ $ \\.",
"description": "Name of the entity property. Must be unique for an entity. Allowed characters: alphanumeric, _ - . / & % # @ ! , ; = | ' + ? * ~ ` space ( ) < > [ ] { }. Must start with an alphanumeric character. Disallowed: \" : ^ $ \\",

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +135
"customPropertyName": {
"description": "Name of a custom property. Allowed characters: alphanumeric, _ - . / & % # @ ! , ; = | ' + ? * ~ ` space ( ) < > [ ] { }. Must start with an alphanumeric character. Disallowed characters: \" : ^ $ \\",
"type": "string",
"minLength": 1,
"maxLength": 256,
"pattern": "^[A-Za-z0-9][A-Za-z0-9 _\\-.,;/&%#@!'(){}\\[\\]<>|=+?*~`]*$"
},
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new customPropertyName regex here allows additional characters (+ ? * ~ �backtick) and also implicitly disallows backslash (\\), but the PR description and some generated docs elsewhere mention only four disallowed characters and omit these allowed ones. Please either align the pattern with the intended charset, or update the PR/docs to reflect the actual rule (including whether \\ is intentionally disallowed).

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Tighten custom property name validation to block characters that break
downstream parsers, with verified empirical reproduction:

- `"` causes HTTP 500 on PUT /metadata/types/{id}
- `:` breaks CSV import — exporter writes `key:value;key:value`, importer
  splits at first colon, treats prefix as the field name
- `^` breaks OpenSearch query when the name is in
  searchSettings.searchFields — Lucene reads `^` as the boost separator
  in `field^boost`
- `$` breaks CSV import via java.util.regex.Matcher.replaceAll which
  interprets `$<letter>` as a backreference

Adds a `customPropertyName` definition in basic.json and switches
customProperty.json to reference it. Adds a defensive regex check in
TypeRepository.validateProperty so the API returns 400 with a clear
error message even if schema validation is bypassed.

Tests cover allowed-charset acceptance, the four blocked characters,
leading-character validation, max-length enforcement, and unbalanced
brackets.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

openmetadata-ui/src/main/resources/ui/src/components/common/SanitizedInput/SanitizedInput.tsx:32

  • handleChange now depends on shouldSanitize, but the callback memoization still only tracks onChange. If a parent toggles shouldSanitize after the component mounts, this handler will keep using the old mode and sanitize when it should not (or vice versa).
  const handleChange = useCallback(
    (e: ChangeEvent<HTMLInputElement>) => {
      const sanitizedValue = shouldSanitize
        ? getSanitizeContent(e.target.value)
        : e.target.value;
      if (onChange) {
        onChange({ ...e, target: { ...e.target, value: sanitizedValue } });
      }
    },
    [onChange]

Comment on lines +274 to +278
max: 128,
message: t('message.entity-size-in-between', {
entity: t('label.name'),
min: 1,
max: 128,
`^${optionTitle.replaceAll(/[.*+?^${}()|[\]\\]/g, '\\$&')}$`
),
})
.locator(`[title="${optionTitle}"]`)
Comment on lines +3633 to +3640
test('should show error when name exceeds 128 characters', async ({
page,
}) => {
await page.fill(nameInput, INVALID_NAMES.MAX_LENGTH);

await expect(page.locator(nameError)).toContainText(
NAME_MAX_LENGTH_VALIDATION_ERROR
);
…sertions

Schema is the single source of truth: jsonschema2pojo emits @pattern + @SiZe
on CustomProperty.name from basic.json#/definitions/customPropertyName, and
@Valid on TypeResource.addOrUpdateProperty enforces them at the HTTP boundary.
The hand-written Pattern constant, validateCustomPropertyName, and the
schema-vs-Java sync test were duplicating that rule and could never reach the
HTTP user (Bean Validation always fires first via @Valid).

Tighten the new TypeResourceIT cases from assertThrows(Exception.class) to
assertThrows(InvalidRequestException.class) so a regression to a different
exception type or status code fails loudly.
@sonika-shah sonika-shah force-pushed the fix/custom-property-name-charset-validation branch from 35d06c2 to 3c92e1b Compare May 4, 2026 11:38
@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 4, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Tightens custom property name validation to prevent downstream parser failures by blocking problematic characters like ", :, ^, $, and . Integration tests were refined to assert 400-level error responses, and redundant Java validator tests were removed.

✅ 1 resolved
Bug: assertEquals arguments are swapped (expected vs actual)

📄 openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TypeRepositoryTest.java:47-49 📄 openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TypeRepositoryTest.java:63-65
JUnit's assertEquals(expected, actual, message) convention puts the expected value first and the actual value second. In both customPropertyNamePatternMatchesSchema and customPropertyNameMaxLengthMatchesSchema, the schema value (loaded at runtime) is passed as expected and the Java constant is passed as actual. This is backwards — the Java constant is the known/expected value and the schema value is what's being verified. When the assertion fails, the message will read "expected: but was: " which is confusing since the stated intent is to check the schema stays in sync with Java.

This won't cause false passes/failures, but it will produce misleading failure messages.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 31 out of 33 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

openmetadata-ui/src/main/resources/ui/src/components/common/SanitizedInput/SanitizedInput.tsx:32

  • shouldSanitize is now part of this component's public API, but handleChange still only depends on onChange. If a caller toggles shouldSanitize without remounting, the callback keeps using the old mode and sanitizes (or skips sanitizing) incorrectly until the component is recreated.
  const handleChange = useCallback(
    (e: ChangeEvent<HTMLInputElement>) => {
      const sanitizedValue = shouldSanitize
        ? getSanitizeContent(e.target.value)
        : e.target.value;
      if (onChange) {
        onChange({ ...e, target: { ...e.target, value: sanitizedValue } });
      }
    },
    [onChange]

Comment on lines +274 to +278
max: 128,
message: t('message.entity-size-in-between', {
entity: t('label.name'),
min: 1,
max: 128,
type: FieldTypes.TEXT_MUI,
props: {
'data-testid': 'display-name',
shouldSanitize: false,
type: FieldTypes.TEXT,
props: {
'data-testid': 'display-name',
shouldSanitize: false,
Comment on lines +3633 to +3640
test('should show error when name exceeds 128 characters', async ({
page,
}) => {
await page.fill(nameInput, INVALID_NAMES.MAX_LENGTH);

await expect(page.locator(nameError)).toContainText(
NAME_MAX_LENGTH_VALIDATION_ERROR
);
Comment on lines +3625 to +3629
test('should accept a valid name with allowed special characters', async ({
page,
}) => {
await page.fill(nameInput, "valid Name_-./&%#@!,;=|'()<>[]{}");

Comment on lines +52 to +53
"description": "Name of the entity property. Must be unique for an entity. Allowed characters: alphanumeric, _ - . / & % # @ ! , ; = | ' + ? * ~ ` space ( ) < > [ ] { }. Must start with an alphanumeric character. Disallowed: \" : ^ $ \\.",
"$ref": "../type/basic.json#/definitions/customPropertyName"
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 4, 2026

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 4, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent validation for Custom Property names causes search failures with special characters

3 participants