perf(ingester): lazy regex evaluation on head postings cache miss by alanprot · Pull Request #7553 · cortexproject/cortex

alanprot · 2026-05-22T17:33:08Z

Lazy regex evaluation on head postings cache miss

When the expanded postings cache misses on the head block, regex matchers on high-cardinality labels (e.g., pod with 400K+ values) dominate query cost, because the regex runs against every label value to build the posting list.
This PR defers expensive regex matchers to a lazy per-series evaluation when a selective equality matcher (like __name__=) already narrows the result set significantly.

How it works

On cache miss, splitMatchersForHeadWithConfig splits matchers into:

- Selective matchers (equality, low-cardinality regex) → used for the postings lookup
- Lazy matchers (high-cardinality regex) → applied per series via LabelValueFor after the selective postings are resolved
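
To illustrate the lazy path, here is a minimal sketch of per-series filtering after the selective postings are resolved. The types `series` and `lazyMatcher` and the helpers `labelValueFor`, `newLazyMatcher`, and `filterLazy` are hypothetical stand-ins, not the code in this PR. The sketch assumes Prometheus-style fully anchored regex matching and treats a missing label as the empty string.

```go
package main

import (
	"fmt"
	"regexp"
)

// series is a hypothetical stand-in for a head series.
type series struct {
	ref    uint64
	labels map[string]string
}

// lazyMatcher is a hypothetical deferred regex matcher on a single label.
type lazyMatcher struct {
	name string
	re   *regexp.Regexp
}

// newLazyMatcher anchors the pattern so that it must match the whole value.
func newLazyMatcher(name, pattern string) lazyMatcher {
	return lazyMatcher{name: name, re: regexp.MustCompile("^(?:" + pattern + ")$")}
}

// labelValueFor mimics a per-series label lookup (assumed shape, not the TSDB API).
func labelValueFor(index map[uint64]series, ref uint64, name string) (string, bool) {
	s, ok := index[ref]
	if !ok {
		return "", false
	}
	v, ok := s.labels[name]
	return v, ok
}

// filterLazy keeps only the refs whose label values satisfy every lazy matcher.
// The regex runs once per candidate series instead of once per label value.
func filterLazy(index map[uint64]series, refs []uint64, lazy []lazyMatcher) []uint64 {
	out := make([]uint64, 0, len(refs))
	for _, ref := range refs {
		keep := true
		for _, m := range lazy {
			v, _ := labelValueFor(index, ref, m.name) // a missing label is matched as ""
			if !m.re.MatchString(v) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, ref)
		}
	}
	return out
}

func main() {
	index := map[uint64]series{
		1: {ref: 1, labels: map[string]string{"pod": "api-1"}},
		2: {ref: 2, labels: map[string]string{"pod": "db-1"}},
		3: {ref: 3, labels: map[string]string{"pod": "api-2"}},
	}
	selective := []uint64{1, 2, 3} // refs already narrowed by the selective matchers
	fmt.Println(filterLazy(index, selective, []lazyMatcher{newLazyMatcher("pod", "api-.*")}))
}
```

The cost of this path scales with the number of selective postings rather than with the label's cardinality, which is why it only pays off when the selective set is small.
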

A cost-ratio gate decides when deferral is worthwhile:

- Simple regex (single contains, prefix): deferred when cardinality > selectivePostings × 6
- Complex regex (multi-substring, capture groups): deferred when cardinality > selectivePostings × 2

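
The gate can be sketched as a single comparison. This is an illustration under assumptions, not the PR's implementation: `gateConfig`, `regexCost`, and `shouldDefer` are hypothetical names, and the only behavior modeled for max-cardinality is that 0 disables the feature. The worked values reuse the benchmark's approximate numbers (413K label values, 9K selective postings).

```go
package main

import "fmt"

// regexCost is a hypothetical classification of a regex matcher.
type regexCost int

const (
	simpleRegex  regexCost = iota // e.g. a single contains or prefix
	complexRegex                  // e.g. multi-substring or capture groups
)

// gateConfig mirrors the three settings described in the PR (names are assumed).
type gateConfig struct {
	maxCardinality   int     // 0 disables lazy matchers; other semantics are not modeled here
	simpleCostRatio  float64 // 6 in the description
	complexCostRatio float64 // 2 in the description
}

// shouldDefer reports whether a regex matcher should be evaluated lazily:
// deferral wins when the label has many more values than there are
// selective postings to check one by one.
func shouldDefer(cfg gateConfig, cost regexCost, labelCardinality, selectivePostings int) bool {
	if cfg.maxCardinality == 0 {
		return false // feature disabled
	}
	ratio := cfg.simpleCostRatio
	if cost == complexRegex {
		ratio = cfg.complexCostRatio
	}
	return float64(labelCardinality) > float64(selectivePostings)*ratio
}

func main() {
	// maxCardinality is an illustrative non-zero value, chosen only to enable the gate.
	cfg := gateConfig{maxCardinality: 1_000_000, simpleCostRatio: 6, complexCostRatio: 2}

	// Benchmark-like case: 413000 > 9000*6 = 54000, so the regex is deferred.
	fmt.Println(shouldDefer(cfg, simpleRegex, 413_000, 9_000))

	// 20000 is below 54000 but above 9000*2 = 18000:
	// a simple regex stays eager, a complex one is deferred.
	fmt.Println(shouldDefer(cfg, simpleRegex, 20_000, 9_000))
	fmt.Println(shouldDefer(cfg, complexRegex, 20_000, 9_000))
}
```

A lower ratio for complex patterns means they are deferred sooner, since each eager evaluation against a label value costs more.
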
Label cardinality lookups are cached in an expirable LRU (60s TTL) to avoid repeated LabelValues calls under load.
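
For readers unfamiliar with the caching idea, the following is a simplified stand-in built only on the standard library. It is not the expirable LRU used in the PR: it has no size bound or LRU eviction, and the names `cardinalityCache`, `fakeClock`, and `defaultTTL` are hypothetical. It only shows how a TTL avoids repeated cardinality lookups for the same label.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL matches the 60s TTL mentioned in the description.
const defaultTTL = 60 * time.Second

type entry struct {
	count   int
	expires time.Time
}

// cardinalityCache caches label cardinalities until their TTL expires.
// Unlike a real expirable LRU it never evicts by size.
type cardinalityCache struct {
	mu   sync.Mutex
	ttl  time.Duration
	now  func() time.Time // injected clock, so expiry can be tested
	data map[string]entry
}

func newCardinalityCache(ttl time.Duration, now func() time.Time) *cardinalityCache {
	return &cardinalityCache{ttl: ttl, now: now, data: make(map[string]entry)}
}

// get returns the cached cardinality for label, calling load only on a miss
// or after expiry. The lock is held during load to keep the sketch short.
func (c *cardinalityCache) get(label string, load func(string) int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.data[label]; ok && c.now().Before(e.expires) {
		return e.count
	}
	n := load(label)
	c.data[label] = entry{count: n, expires: c.now().Add(c.ttl)}
	return n
}

// fakeClock is a manually advanced clock for demonstrations and tests.
type fakeClock struct{ t time.Time }

func (f *fakeClock) now() time.Time          { return f.t }
func (f *fakeClock) advance(d time.Duration) { f.t = f.t.Add(d) }

func main() {
	clk := &fakeClock{}
	cache := newCardinalityCache(defaultTTL, clk.now)

	calls := 0
	load := func(label string) int { calls++; return 413_000 } // stands in for a LabelValues call

	cache.get("pod", load)
	cache.get("pod", load) // served from the cache
	clk.advance(defaultTTL + time.Second)
	cache.get("pod", load) // expired, so load runs again

	fmt.Println("load calls:", calls)
}
```
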

Benchmark results (realistic pod names, 413K cardinality, 9K selective postings)

| Path | Time | Memory |
| --- | --- | --- |
| Eager (before) | 62 ms | 29.8 MB |
| Lazy (this PR) | 14 ms | 12.6 MB |

4.5× faster, 58% less memory per query.

Configuration

Three new flags. The feature is disabled by default (max-cardinality=0):

-blocks-storage.expanded_postings_cache.head.lazy-matcher-max-cardinality
-blocks-storage.expanded_postings_cache.head.lazy-matcher-simple-cost-ratio
-blocks-storage.expanded_postings_cache.head.lazy-matcher-complex-cost-ratio
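
As a rough illustration of how these flags might be wired and parsed, here is a sketch using the standard `flag` package. The flag names come from this PR; everything else is an assumption: the struct `lazyMatcherConfig`, the value types (int and float64), the ratio defaults of 6 and 2 (taken from the ratios quoted above), and the example value 500000.

```go
package main

import (
	"flag"
	"fmt"
)

const flagPrefix = "blocks-storage.expanded_postings_cache.head."

// lazyMatcherConfig is a hypothetical holder for the three settings.
type lazyMatcherConfig struct {
	MaxCardinality   int
	SimpleCostRatio  float64
	ComplexCostRatio float64
}

// registerFlags registers the three flags; types and ratio defaults are assumptions.
func (c *lazyMatcherConfig) registerFlags(fs *flag.FlagSet) {
	fs.IntVar(&c.MaxCardinality, flagPrefix+"lazy-matcher-max-cardinality", 0,
		"0 disables lazy matcher evaluation.")
	fs.Float64Var(&c.SimpleCostRatio, flagPrefix+"lazy-matcher-simple-cost-ratio", 6,
		"Cost ratio applied to simple regex matchers.")
	fs.Float64Var(&c.ComplexCostRatio, flagPrefix+"lazy-matcher-complex-cost-ratio", 2,
		"Cost ratio applied to complex regex matchers.")
}

// enabled reflects the one documented rule: max-cardinality=0 means disabled.
func (c lazyMatcherConfig) enabled() bool { return c.MaxCardinality > 0 }

// parseLazyMatcherFlags parses args into a config using a private FlagSet.
func parseLazyMatcherFlags(args []string) (lazyMatcherConfig, error) {
	var cfg lazyMatcherConfig
	fs := flag.NewFlagSet("ingester", flag.ContinueOnError)
	cfg.registerFlags(fs)
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	// 500000 is an illustrative value, not a recommendation from the PR.
	cfg, err := parseLazyMatcherFlags([]string{
		"-" + flagPrefix + "lazy-matcher-max-cardinality=500000",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v enabled=%v\n", cfg, cfg.enabled())
}
```
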

Testing

- Unit tests for the gate logic and cost classification
- Integration fuzz test (TestLazyMatchersFuzz): 300 fuzzed queries plus injected regex patterns, compared between eager and lazy instances; 450+ lazy triggers, zero mismatches
- Correctness verified by intentionally breaking the filter and confirming that the test catches it (445 failures)

Checklist

- Tests updated
- Documentation added
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
- docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags


pull-request-size Bot added the size/XXL label May 22, 2026

alanprot force-pushed the lazy-posting branch from 48756ec to dee1a48 Compare May 22, 2026 17:37

alanprot marked this pull request as ready for review May 22, 2026 17:43

dosubot Bot added the component/ingester, storage/blocks, and type/performance labels May 22, 2026

alanprot force-pushed the lazy-posting branch from dee1a48 to a838d2b Compare May 22, 2026 19:02

alanprot force-pushed the lazy-posting branch from a838d2b to 5f72154 Compare May 22, 2026 20:24

alanprot wants to merge 1 commit into cortexproject:master from alanprot:lazy-posting
