Fix tag usage count performance by sonika-shah · Pull Request #27850 · open-metadata/OpenMetadata

sonika-shah · 2026-04-30T17:51:22Z

Summary

The bulk tag-usage-count query was hitting ~240 seconds per call on instances with many classification tags, and silently returning zero counts for every multi-component classification tag (i.e. essentially every tag in every tenant — Classification.TagName is the canonical form). One correctness bug + one performance bug, same root.

Root cause

Correctness bug

tag_usage.tagFQNHash is stored via @BindFQN → FullyQualifiedName.buildHash, which produces a hierarchical hash:

buildHash("Classification.TagName") = MD5("Classification") + "." + MD5("TagName")
                                    ≈ "5fae21….8a2c1e…"   (~65 chars)

The old query computed:

WHERE tagFQNHash = MD5('Classification.TagName')   -- a SINGLE 32-char MD5 of the dotted string
   OR tagFQNHash LIKE CONCAT(MD5('Classification.TagName'), '.%')

These two formats never match for any multi-component FQN. So tag-usage counts in the UI rendered as 0 for every classification tag with a parent classification — which is essentially every tag.
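The mismatch can be reproduced outside the database. The sketch below is a simplified stand-in for FullyQualifiedName.buildHash: it assumes components are split on plain dots and ignores quoted components, which the real implementation may handle differently. The class and method names (HashMismatchSketch, md5Hex, hierarchicalHash) are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashMismatchSketch {
  // Hex-encoded MD5 of a string (always 32 lowercase hex characters).
  static String md5Hex(String s) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      StringBuilder sb = new StringBuilder();
      for (byte b : md.digest(s.getBytes(StandardCharsets.UTF_8))) {
        sb.append(String.format("%02x", b & 0xff));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // ASSUMPTION: simplified stand-in for FullyQualifiedName.buildHash.
  // Hashes each dot-separated component and joins the hashes with '.'.
  static String hierarchicalHash(String fqn) {
    String[] parts = fqn.split("\\.");
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        sb.append('.');
      }
      sb.append(md5Hex(parts[i]));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String fqn = "Classification.TagName";
    String stored = hierarchicalHash(fqn); // what tag_usage.tagFQNHash would hold
    String oldLookup = md5Hex(fqn);        // what the old query compared against
    System.out.println(stored.length());    // 65 (32 + 1 + 32)
    System.out.println(oldLookup.length()); // 32
    System.out.println(stored.equals(oldLookup)); // false
  }
}
```

For a single-component FQN the two forms coincide, which is consistent with the bug only surfacing for multi-component FQNs.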

Performance bug

On top of the correctness issue:

N × UNION ALL blocks — one block per tag, scanned the table N times
COUNT(DISTINCT targetFQNHash) — forced a sort/hash dedup that the unique key already guarantees
OR between equality and prefix LIKE — defeats single-index plans, degrades to BitmapOr on two scans
Inline MD5() calls — prevent prepared-statement caching

For 535 tags: ~240 s/call, returning wrong data.

Changes

`CollectionDAO.TagUsageDAO`

Removed deprecated getTagCountsBulkComplex (had the same inline-MD5 bug, no callers).
Added getTagUsageCountsByExactHashes — a single batched GROUP BY for exact matches.
Added getTagUsageCountByHashPrefix — a single indexed prefix-LIKE for descendant counts.
Rewrote getTagCountsBulk default method:
1. Pre-computes hierarchical hashes via FullyQualifiedName.buildHash in Java.
2. Issues 1 batched GROUP BY for all exact-match counts.
3. Issues N fast indexed prefix-LIKEs for descendants.
4. Sums exact + descendants per tag via Map.merge.
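The four steps can be sketched against an in-memory list instead of the tag_usage table. This is not the DAO code: BulkCountSketch, countUsage, and the pluggable buildHash function are hypothetical, the list of stored hashes stands in for rows of one source, and the prefix is assumed to be the tag's hash followed by a dot (mirroring the '.%' suffix in the old query).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class BulkCountSketch {
  // storedHashes: one entry per tag_usage row (its tagFQNHash) for a single source.
  static Map<String, Integer> countUsage(
      List<String> tagFqns, List<String> storedHashes, Function<String, String> buildHash) {
    // Step 1: pre-compute the hash for every requested FQN.
    Map<String, String> fqnByHash = new LinkedHashMap<>();
    for (String fqn : tagFqns) {
      fqnByHash.put(buildHash.apply(fqn), fqn);
    }
    Map<String, Integer> result = new LinkedHashMap<>();
    for (String fqn : fqnByHash.values()) {
      result.put(fqn, 0);
    }
    // Step 2: one pass for exact matches (the batched GROUP BY).
    for (String hash : storedHashes) {
      String fqn = fqnByHash.get(hash);
      if (fqn != null) {
        result.merge(fqn, 1, Integer::sum);
      }
    }
    // Step 3: one prefix scan per tag (the prefix-LIKE for descendants).
    for (Map.Entry<String, String> e : fqnByHash.entrySet()) {
      String prefix = e.getKey() + ".";
      int descendants = 0;
      for (String hash : storedHashes) {
        if (hash.startsWith(prefix)) {
          descendants++;
        }
      }
      // Step 4: sum exact + descendants per tag.
      result.merge(e.getValue(), descendants, Integer::sum);
    }
    return result;
  }

  public static void main(String[] args) {
    // Identity "hash" keeps the demo readable; real hashes are per-component MD5s.
    List<String> tags = List.of("PII", "PII.Sensitive", "PII.None");
    List<String> rows = List.of("PII.Sensitive", "PII.Sensitive", "PII.None");
    System.out.println(countUsage(tags, rows, Function.identity()));
    // {PII=3, PII.Sensitive=2, PII.None=1}
  }
}
```

The demo data mirrors the shape of the integration test in this PR: a classification whose count comes entirely from the prefix path, and two tags whose counts come from the exact path.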

`TagRepository.batchFetchUsageCounts`

Removed the broken inline UNION-ALL builder (was using inline MD5(fqn) which never matched).
Now a one-liner delegating to getTagCountsBulk.

Why dropping `COUNT(DISTINCT)` is safe

tag_usage_source_tagfqnhash_targetfqnhash_key is a UNIQUE constraint on (source, tagFQNHash, targetFQNHash). For a fixed (source, tagFQNHash), each targetFQNHash appears at most once. So COUNT(*) ≡ COUNT(DISTINCT targetFQNHash).
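A small in-memory model illustrates the equivalence. The class below is a hypothetical stand-in for the table, with a set playing the role of the UNIQUE constraint; it is a sketch of the reasoning, not of how either database enforces the key.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UniqueKeyCountSketch {
  // Each row is (source, tagFQNHash, targetFQNHash); the set rejects duplicate triples,
  // which is the guarantee the UNIQUE constraint provides.
  private final Set<List<String>> rows = new LinkedHashSet<>();

  boolean insert(String source, String tagHash, String targetHash) {
    return rows.add(List.of(source, tagHash, targetHash));
  }

  // Equivalent of COUNT(*) for a fixed (source, tagFQNHash).
  long countStar(String source, String tagHash) {
    return rows.stream()
        .filter(r -> r.get(0).equals(source) && r.get(1).equals(tagHash))
        .count();
  }

  // Equivalent of COUNT(DISTINCT targetFQNHash) for the same filter.
  long countDistinctTarget(String source, String tagHash) {
    return rows.stream()
        .filter(r -> r.get(0).equals(source) && r.get(1).equals(tagHash))
        .map(r -> r.get(2))
        .distinct()
        .count();
  }

  public static void main(String[] args) {
    UniqueKeyCountSketch table = new UniqueKeyCountSketch();
    table.insert("0", "tagA", "table1");
    table.insert("0", "tagA", "table2");
    System.out.println(table.insert("0", "tagA", "table2")); // false: duplicate rejected
    System.out.println(table.countStar("0", "tagA") == table.countDistinctTarget("0", "tagA")); // true
  }
}
```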

Cross-DB compatibility

Construct	MySQL	Postgres
`IN (<hashes>)`	✅	✅
`GROUP BY tagFQNHash`	✅	✅
`tagFQNHash LIKE 'prefix%'`	✅	✅
`COUNT(*)`	✅	✅

No @ConnectionAwareSqlQuery split needed.

Index usage

Exact-match GROUP BY → uses idx_tag_usage_source_target (1.9.5) or idx_tag_usage_join_source (1.11.0) — both have (source, tagFQNHash) indexed
Prefix LIKE on tagFQNHash → uses idx_tag_usage_join_source on Postgres (tagFQNHash leading column), or idx_tag_usage_tag_fqn_hash on MySQL

Test

Added test_classificationAndTagUsageCount in ClassificationResourceIT:

Creates a Classification + 2 tags
Applies tag A to 2 tables, tag B to 1 table
Asserts Classification.usageCount == 3 — exercises the prefix-match path and is the regression test for the hierarchical-hash correctness bug
Asserts tag_a.usageCount == 2, tag_b.usageCount == 1 — exercises the exact-match path
Asserts the same counts via the bulk LIST tags endpoint — exercises the batched getTagCountsBulk path

Without this fix the assertions would all fail with counts of 0.

Performance impact

	Before	After
Per-call latency for ~500 tags	~240 s	tens of ms
Counts returned in UI	wrong (0)	correct
Total CPU during DI window	bounded contributor (~5–10 min)	negligible

🤖 Generated with Claude Code

Summary by Gitar

Configuration changes:
- Updated conf/openmetadata.yaml to default the database driver to org.postgresql.Driver and the URL scheme to postgresql.
- Changed the default searchType in conf/openmetadata.yaml from elasticsearch to opensearch.

This will update automatically on new commits.

…INCT), batch correctly

The bulk tag-usage-count query (TagRepository.batchFetchUsageCounts and CollectionDAO.getTagCountsBulk) was hitting ~240 seconds per call on instances with heavy classification hierarchies, and silently returning zero counts for any multi-component tag FQN.

Two issues, one correctness + one performance:

1. Correctness: the query computed `MD5('Classification.TagName')` (a single MD5 of the joined FQN) and compared against `tag_usage.tagFQNHash`, which is stored via FullyQualifiedName.buildHash as the hierarchical form `MD5('Classification') + '.' + MD5('TagName')`. These never match for any multi-component FQN, meaning every classification tag's usage count silently rendered as 0 in the UI, regardless of actual usage.
2. Performance: N×UNION-ALL with `COUNT(DISTINCT targetFQNHash)` and an `OR` between exact-equality and prefix-LIKE per block. The `OR` defeats single-index plans (BitmapOr on two scans), `COUNT(DISTINCT)` forces a sort/hash dedup that the unique key already guarantees, inline `MD5()` calls prevent prepared-statement caching, and N stacked blocks scan the table N times. For 535 tags this cost ~240 s/call and returned wrong data.

Fix:
- Pre-compute hashes in Java via FullyQualifiedName.buildHash (matches storage format).
- Issue ONE batched `GROUP BY` for all exact-match counts:
    SELECT tagFQNHash, COUNT(*) FROM tag_usage WHERE source = ? AND tagFQNHash IN (<hashes>) GROUP BY tagFQNHash
- Issue N indexed prefix-LIKEs (one per tag) for descendant counts:
    SELECT COUNT(*) FROM tag_usage WHERE source = ? AND tagFQNHash LIKE :hashPrefix
- Drop COUNT(DISTINCT) -> COUNT(*). The tag_usage_source_tagfqnhash_targetfqnhash_key UNIQUE constraint guarantees no duplicate rows for a fixed (source, tagFQNHash), so COUNT(*) is exact.
- Remove the dead getTagCountsBulkComplex (deprecated, no callers, same inline-MD5 bug).
- Remove the broken inline UNION-ALL builder in TagRepository.batchFetchUsageCounts; delegate to the now-correct getTagCountsBulk.

Same SQL on MySQL and Postgres (no @ConnectionAwareSqlQuery split). Hits idx_tag_usage_target_source / idx_tag_usage_join_source on both.

Test:
- Added test_classificationAndTagUsageCount in ClassificationResourceIT. Creates a Classification + 2 tags, applies them to 3 tables (2 with tag A, 1 with tag B), then asserts:
  * Classification usage count = 3 (prefix match across child tags); catches the hierarchical-hash correctness regression.
  * Tag A usage count = 2 (exact match)
  * Tag B usage count = 1 (exact match)
  * Bulk LIST of tags with fields=usageCount returns the same correct counts (exercises batched getTagCountsBulk).

github-actions · 2026-04-30T17:53:49Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot

Pull request overview

Fixes tag usage count correctness and performance by aligning bulk counting with the hierarchical tagFQNHash format stored in tag_usage, and replacing an expensive UNION-based query pattern with batched/grouped queries.

Changes:

Replaced TagRepository’s inline UNION/MD5 query builder with a DAO call to getTagCountsBulk.
Updated CollectionDAO.TagUsageDAO bulk counting to (1) batch exact-hash counts via IN (...) GROUP BY, and (2) count descendants via hash-prefix LIKE.
Added an integration test validating usageCount for a classification and its tags (including bulk list path).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java	Removes broken inline UNION/MD5 query; delegates bulk usage counts to DAO.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java	Introduces new tag usage count queries and rewrites `getTagCountsBulk` to use hierarchical hashes.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/ClassificationResourceIT.java	Adds regression test for classification/tag usageCount correctness and bulk listing counts.

    @SqlQuery(
-        "SELECT tagFQN, count FROM ("
-            + "  SELECT ? as tagFQN, COUNT(DISTINCT targetFQNHash) as count "
-            + "  FROM tag_usage "
-            + "  WHERE source = ? AND (tagFQNHash = MD5(?) OR tagFQNHash LIKE CONCAT(MD5(?), '.%'))"
-            + ") t WHERE tagFQN IN (<tagFQNs>)")
+        "SELECT tagFQNHash AS tagFQN, COUNT(*) AS count "
+            + "FROM tag_usage "
+            + "WHERE source = :source AND tagFQNHash IN (<hashes>) "
+            + "GROUP BY tagFQNHash")
    @RegisterRowMapper(TagCountMapper.class)
-    @Deprecated
-    List<Map.Entry<String, Integer>> getTagCountsBulkComplex(
-        @Bind("tagFQN") String sampleTagFQN,
-        @Bind("source") int source,
-        @Bind("tagFQNHash") String tagFQNHash,
-        @Bind("tagFQNHashPrefix") String tagFQNHashPrefix,
-        @BindList("tagFQNs") List<String> tagFQNs);
+    List<Map.Entry<String, Integer>> getTagUsageCountsByExactHashes(
+        @Bind("source") int source, @BindList("hashes") List<String> hashes);


+    table1.setTags(List.of(new TagLabel().withTagFQN(tagA.getFullyQualifiedName())));
+    tableResourceIT.patchEntity(table1.getId().toString(), table1);
+
+    table2.setTags(List.of(new TagLabel().withTagFQN(tagA.getFullyQualifiedName())));
+    tableResourceIT.patchEntity(table2.getId().toString(), table2);
+
+    table3.setTags(List.of(new TagLabel().withTagFQN(tagB.getFullyQualifiedName())));


+    @SqlQuery(
+        "SELECT COUNT(*) FROM tag_usage "
+            + "WHERE source = :source AND tagFQNHash LIKE :hashPrefix")
+    int getTagUsageCountByHashPrefix(
+        @Bind("source") int source, @Bind("hashPrefix") String hashPrefix);


Previously getTagCountsBulk passed the full set of input hashes as a single IN list. For callers that batch many tags at once (e.g. listing all classifications/tags in a tenant) the IN could grow unbounded, hitting DB protocol parameter caps and degrading planner choices.

Adds TAG_COUNT_BATCH_CHUNK_SIZE = 1000 and chunks the IN clause at that size, matching the existing pattern used by getTagsByTargetFQNHashes (#27836). Each chunk is a fast indexed GROUP BY; results are merged in Java.

Also expands the javadoc to clarify why the per-tag prefix-LIKE branch is not a fan-out concern (returns a single COUNT, scans a bounded index range, and tag hierarchies are typically 1-2 levels deep so the descendant set is small or empty).
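The chunking pattern described in this commit can be sketched as follows. Only the constant name and its value of 1000 come from the commit message; ChunkSketch, chunk, countInChunks, and the queryOneChunk callback are hypothetical names, and the callback stands in for the real batched GROUP BY.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ChunkSketch {
  static final int TAG_COUNT_BATCH_CHUNK_SIZE = 1000;

  // Splits a list into consecutive slices of at most `size` elements.
  static <T> List<List<T>> chunk(List<T> items, int size) {
    if (size <= 0) {
      throw new IllegalArgumentException("size must be positive");
    }
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < items.size(); i += size) {
      out.add(items.subList(i, Math.min(i + size, items.size())));
    }
    return out;
  }

  // Runs one query per chunk and merges the per-chunk results in Java.
  // queryOneChunk is a placeholder for a DAO call taking a bounded IN list.
  static Map<String, Integer> countInChunks(
      List<String> hashes, Function<List<String>, Map<String, Integer>> queryOneChunk) {
    Map<String, Integer> merged = new HashMap<>();
    for (List<String> slice : chunk(hashes, TAG_COUNT_BATCH_CHUNK_SIZE)) {
      // Distinct hashes land in exactly one chunk, so summing is only a safeguard.
      queryOneChunk.apply(slice).forEach((hash, n) -> merged.merge(hash, n, Integer::sum));
    }
    return merged;
  }
}
```

With 2,500 input hashes this would issue three queries with IN lists of 1000, 1000, and 500 entries.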

github-actions · 2026-04-30T18:07:28Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Replaces the per-tag prefix-LIKE loop with a single batched query that joins tag_usage to a UNION-ALL of (rootHash, hashPrefix) inputs and GROUPs by rootHash. All values are bound as named parameters: no string interpolation, safe against injection.

For a batch of N tags:

Before: 1 exact-match GROUP BY + N prefix-LIKE queries (N+1 round-trips)
After: 1 exact-match GROUP BY + 1 batched descendant GROUP BY (2 round-trips per chunk)

Each chunk processes up to TAG_COUNT_BATCH_CHUNK_SIZE (1000) hashes, keeping IN-list and UNION-ALL size bounded.

Scale impact:

100 tags: from 101 to 2 round-trips
1000 tags: from 1001 to 2 round-trips
10000 tags: from 10001 to 20 round-trips (10 chunks * 2 queries)

Same DB-side work (same N index range scans for descendant lookups), but RTT cost reduced from O(N) to O(N / chunk_size), which is significant in high-latency or cross-region DB connections.

Removes getTagUsageCountByHashPrefix (no longer used).
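The round-trip figures quoted in this commit follow from simple arithmetic, reproduced here as a check. The class and method names are hypothetical; the formulas are N + 1 for the previous approach and 2 queries per chunk of up to 1000 hashes for the new one, as stated in the commit message.

```java
final class RoundTripSketch {
  // Previous approach: 1 exact-match GROUP BY + one prefix-LIKE per tag.
  static int before(int tagCount) {
    return tagCount + 1;
  }

  // New approach: 2 queries (exact + batched descendants) per chunk.
  static int after(int tagCount, int chunkSize) {
    int chunks = (tagCount + chunkSize - 1) / chunkSize; // ceiling division
    return 2 * chunks;
  }

  public static void main(String[] args) {
    for (int n : new int[] {100, 1000, 10000}) {
      System.out.println(n + " tags: " + before(n) + " -> " + after(n, 1000));
    }
    // 100 tags: 101 -> 2
    // 1000 tags: 1001 -> 2
    // 10000 tags: 10001 -> 20
  }
}
```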

gitar-bot · 2026-04-30T18:28:02Z

Code Review ✅ Approved 2 resolved / 2 findings

Consolidates tag usage counting into a single query and adds deduplication for FQN inputs, resolving the performance bottleneck and silent count drops.

✅ 2 resolved

✅ Performance: Descendant counts still issue N individual queries

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java:6610-6615
The new getTagCountsBulk issues one getTagUsageCountByHashPrefix query per tag (lines 6610-6615). For a classification with 500 tags, this is 500 sequential round-trips. While each individual query is fast with an index, the cumulative latency from network round-trips can still add up.

This is a massive improvement over the 240s baseline and is likely acceptable for now, but could be further optimized if it becomes a bottleneck again.

✅ Edge Case: Duplicate FQNs in input silently drop one tag's count

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java:6592-6600
If tagFQNs contains duplicate entries (same FQN appears twice), fqnByHash.put() at line 6594 overwrites the first mapping since the hash is identical. Meanwhile, result at lines 6598-6600 creates separate entries (though with the same key, so HashMap also deduplicates). This isn't a bug per se since duplicates shouldn't occur in normal usage, but the behavior is worth noting — the method is silently idempotent for duplicate inputs, which is arguably correct.


github-actions · 2026-04-30T18:29:23Z

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot

Pull request overview

This PR fixes correctness and performance issues in bulk tag usage count computation by aligning hash matching with the hierarchical FullyQualifiedName.buildHash format and replacing the prior UNION-heavy approach with batched aggregations. It also includes changes to default runtime configuration in conf/openmetadata.yaml that appear unrelated to the tag usage work.

Changes:

Reworked bulk tag usage counting to use precomputed hierarchical hashes and batched GROUP BY queries for exact + descendant counts.
Simplified TagRepository.batchFetchUsageCounts to delegate to the DAO bulk method (removing the inline-MD5 UNION builder).
Added an integration test covering classification + tag usageCount correctness via both single-entity GET and bulk LIST paths.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TagRepository.java	Removes the broken inline UNION/MD5 query construction and delegates to DAO bulk counting.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java	Introduces batched exact-hash counting + batched descendant prefix-LIKE aggregation using hierarchical hashes.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/ClassificationResourceIT.java	Adds regression test validating correct usageCount for classification and tags, including bulk LIST behavior.
conf/openmetadata.yaml	Changes default DB and search configuration (driver/scheme/host and searchType).

+  driverClass: ${DB_DRIVER_CLASS:-org.postgresql.Driver}
  # the username and password
  user: ${DB_USER:-openmetadata_user}
  password: ${DB_USER_PASSWORD:-openmetadata_password}
  # the JDBC URL; the database is called openmetadata_db
-  url: jdbc:${DB_SCHEME:-mysql}://${DB_HOST:-localhost}:${DB_PORT:-3306}/${OM_DATABASE:-openmetadata_db}?${DB_PARAMS:-allowPublicKeyRetrieval=true&useSSL=false&serverTimezone=UTC}
+  url: jdbc:${DB_SCHEME:-postgresql}://${DB_HOST:-192.168.29.172}:${DB_PORT:-5432}/${OM_DATABASE:-openmetadata_db}?${DB_PARAMS:-allowPublicKeyRetrieval=true&useSSL=false&serverTimezone=UTC}


 elasticsearch:
-  searchType: ${SEARCH_TYPE:- "elasticsearch"}
+  searchType: ${SEARCH_TYPE:- "opensearch"}


sonarqubecloud · 2026-04-30T19:36:33Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2026-04-30T20:57:06Z

🟡 Playwright Results — all passed (13 flaky)

✅ 3985 passed · ❌ 0 failed · 🟡 13 flaky · ⏭️ 86 skipped

Shard	Passed	Flaky	Skipped
🟡 Shard 1	298	1	4
🟡 Shard 2	749	5	8
🟡 Shard 3	744	2	7
🟡 Shard 4	773	2	18
✅ Shard 5	687	0	41
🟡 Shard 6	734	3	8

🟡 13 flaky test(s) (passed on retry)

Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
Features/ActivityAPI.spec.ts › Activity event is created when description is updated (shard 2, 1 retry)
Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
Features/DataProductRenameConsolidation.spec.ts › Rename then change owner - assets should be preserved (shard 2, 1 retry)
Features/DataQuality/ColumnLevelTests.spec.ts › Column Values To Be Between (shard 2, 1 retry)
Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 2 retries)
Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
Pages/DataContracts.spec.ts › Create Data Contract and validate for Directory (shard 4, 1 retry)
Pages/DataContractsSemanticRules.spec.ts › Validate Description Rule Is_Set (shard 4, 1 retry)
Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
Pages/Users.spec.ts › Reset Password for Data Steward (shard 6, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 30, 2026 17:51

github-actions Bot added the backend and safe to test labels Apr 30, 2026

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java Outdated

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java

Copilot AI reviewed Apr 30, 2026

View reviewed changes

sonika-shah had a problem deploying to test April 30, 2026 18:01 — with GitHub Actions Error

sonika-shah force-pushed the fix-tag-usage-count-batch-query branch from 87bab5d to c803343 Compare April 30, 2026 18:05

sonika-shah had a problem deploying to test April 30, 2026 18:16 — with GitHub Actions Error

Copilot AI review requested due to automatic review settings April 30, 2026 18:26

Copilot started reviewing on behalf of sonika-shah April 30, 2026 18:27 View session

sonika-shah changed the title ~~Fix tag usage count: silent zero-count bug and 240s query~~ Fix tag usage count: silent zero-count bug Apr 30, 2026

sonika-shah changed the title ~~Fix tag usage count: silent zero-count bug~~ Fix tag usage count performance Apr 30, 2026

Copilot AI reviewed Apr 30, 2026

View reviewed changes

sonika-shah temporarily deployed to test April 30, 2026 18:39 — with GitHub Actions Inactive

sonika-shah wants to merge 3 commits into main from fix-tag-usage-count-batch-query
