Introduce FirstPassGroupingCollectorManager#15574

Open
gaobinlong wants to merge 3 commits into apache:main from gaobinlong:firstPassManager

Conversation

@gaobinlong
Contributor

Description

This PR introduces FirstPassGroupingCollectorManager and switches TestGrouping and BaseGroupSelectorTestCase to run searches concurrently through it, moving them away from the deprecated search(Query, Collector) method.

Relates to #12892.
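
For readers unfamiliar with the pattern, here is a minimal, stdlib-only sketch of what a collector manager does: it hands out one collector per slice of work and then reduces the per-slice results into one. Every name in it (MiniCollectorManager, CountingCollector, CountingManager, ManagerPatternSketch) is hypothetical and only mirrors the newCollector/reduce shape. It is not Lucene's interface and not the code in this PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical miniature of the collector-manager shape (assumption, not Lucene's API).
interface MiniCollectorManager<C, R> {
  C newCollector();

  R reduce(Collection<C> collectors);
}

// Hypothetical collector that only counts the documents it is shown.
final class CountingCollector {
  int count;

  void collect(int doc) {
    count++;
  }
}

final class CountingManager implements MiniCollectorManager<CountingCollector, Integer> {
  @Override
  public CountingCollector newCollector() {
    return new CountingCollector(); // one independent collector per slice
  }

  @Override
  public Integer reduce(Collection<CountingCollector> collectors) {
    int total = 0;
    for (CountingCollector c : collectors) {
      total += c.count;
    }
    return total;
  }
}

final class ManagerPatternSketch {
  // Runs each slice on a worker thread with its own collector, then reduces.
  static int search(List<int[]> slices, CountingManager manager) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Future<CountingCollector>> futures = new ArrayList<>();
      for (int[] slice : slices) {
        CountingCollector collector = manager.newCollector();
        futures.add(
            pool.submit(
                () -> {
                  for (int doc : slice) {
                    collector.collect(doc);
                  }
                  return collector;
                }));
      }
      List<CountingCollector> finished = new ArrayList<>();
      for (Future<CountingCollector> future : futures) {
        finished.add(future.get()); // get() also makes the worker's writes visible here
      }
      return manager.reduce(finished);
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<int[]> slices = List.of(new int[] {0, 1, 2}, new int[] {3, 4});
    System.out.println(search(slices, new CountingManager()));
  }
}
```

The point of the split is that no collector is ever touched by two threads, so per-collector state needs no synchronization. Only reduce sees all of them, and it runs after every slice has finished.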

@github-actions bot added this to the 11.0.0 milestone on Jan 15, 2026
@github-actions
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions bot added the Stale label on Jan 30, 2026
Signed-off-by: Binlong Gao <gbinlong@amazon.com>

Format code

Signed-off-by: Binlong Gao <gbinlong@amazon.com>
@gaobinlong
Contributor Author

Hi @dweiss, would you mind taking a look at this PR? I think it's a simple change.

Contributor

@javanna left a comment

Thanks @gaobinlong, I left some comments!

import org.apache.lucene.search.Sort;

/** A CollectorManager implementation for FirstPassGroupingCollector. */
public class FirstPassGroupingCollectorManager<T>

Given that the corresponding collector is experimental, should this class be marked experimental too?

return new FirstPassGroupingCollector<>(
new ValueSourceGroupSelector(vs, new HashMap<>()), groupSort, topDocs);
return new FirstPassGroupingCollectorManager<>(
() -> new ValueSourceGroupSelector(vs, new HashMap<>()), groupSort, topDocs)

Should the HashMap be created anew on each supplier call, or shared across all the selectors the supplier produces?
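
To make the two options concrete, here is a small stdlib-only sketch. SelectorSketch is a hypothetical stand-in for a selector that holds a mutable context map. It is not ValueSourceGroupSelector, and which option is correct for the real class depends on how that map is used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a selector that keeps mutable state in a context map.
final class SelectorSketch {
  final Map<String, Object> context;

  SelectorSketch(Map<String, Object> context) {
    this.context = context;
  }
}

final class SupplierMapSketch {
  // The map is allocated inside the lambda, so each get() yields a selector with its own map.
  static Supplier<SelectorSketch> freshMapPerSelector() {
    return () -> new SelectorSketch(new HashMap<>());
  }

  // The map is allocated once and captured, so every selector shares the same instance.
  static Supplier<SelectorSketch> sharedMap() {
    Map<String, Object> shared = new HashMap<>();
    return () -> new SelectorSketch(shared);
  }

  public static void main(String[] args) {
    Supplier<SelectorSketch> fresh = freshMapPerSelector();
    System.out.println(fresh.get().context != fresh.get().context); // true: two calls, two maps

    Supplier<SelectorSketch> shared = sharedMap();
    System.out.println(shared.get().context == shared.get().context); // true: one captured map
  }
}
```

The lambda in the diff has the first shape. A plain HashMap is not safe for concurrent mutation, so the second shape would need a thread-safe map or read-only use if the selectors run on different threads.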

return new FirstPassGroupingCollector<>(
new ValueSourceGroupSelector(vs, new HashMap<>()), groupSort, topDocs);
return new FirstPassGroupingCollectorManager<>(
() -> new ValueSourceGroupSelector(vs, new HashMap<>()), groupSort, topDocs)

Same as above regarding the map.


if (collectors.size() == 1) {
return collectors.iterator().next().getTopGroups(0);
}

I have a slight preference for removing the two conditionals above. I wonder how much value they bring.
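
As an illustration of the idea, a general merge often covers the empty and single-input cases without special-casing them. The sketch below uses a hypothetical mergeTopN over sorted integer lists. It is not SearchGroup.merge, and whether the real merge path returns the same result as the shortcut for a single collector would need to be checked against the Lucene source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class MergeSketch {
  // Hypothetical merge: combines all parts, sorts ascending, keeps the first topN values.
  // There is no branch on the number of parts; zero or one part falls out of the loop.
  static List<Integer> mergeTopN(Collection<List<Integer>> parts, int topN) {
    List<Integer> all = new ArrayList<>();
    for (List<Integer> part : parts) {
      all.addAll(part);
    }
    Collections.sort(all);
    return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
  }

  public static void main(String[] args) {
    List<List<Integer>> none = new ArrayList<>();
    List<List<Integer>> one = new ArrayList<>();
    one.add(List.of(1, 3, 5));
    System.out.println(mergeTopN(none, 2));
    System.out.println(mergeTopN(one, 2));
  }
}
```

If the general path is already correct for these inputs, the shortcuts only save a small amount of work at the cost of two extra branches to test.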

public Collection<SearchGroup<T>> reduce(Collection<FirstPassGroupingCollector<T>> collectors)
throws IOException {
if (collectors.isEmpty()) {
return null;

Returning null here could cause issues downstream.
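
One possible alternative, sketched with stdlib types only, is to return an empty collection so callers can iterate without a null check. ReduceSketch and its reduce method are hypothetical and simply concatenate results. Whether an empty collection or null is the right contract for the real manager depends on what existing callers of the grouping API expect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class ReduceSketch {
  // Hypothetical reduce: concatenates per-collector results and never returns null.
  static <T> Collection<T> reduce(Collection<? extends Collection<T>> perCollector) {
    if (perCollector.isEmpty()) {
      return Collections.emptyList(); // empty result instead of null
    }
    List<T> merged = new ArrayList<>();
    for (Collection<T> part : perCollector) {
      if (part != null) { // tolerate collectors that produced nothing
        merged.addAll(part);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    List<List<String>> none = new ArrayList<>();
    // Safe to loop directly over the result; no null check needed.
    for (String group : reduce(none)) {
      System.out.println(group);
    }
    System.out.println(reduce(Arrays.asList(Arrays.asList("a"), Arrays.asList("b", "c"))).size());
  }
}
```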

return SearchGroup.merge(allGroups, 0, topNGroups, groupSort);
}

public List<FirstPassGroupingCollector<T>> getCollectors() {

This looks unused. Can we remove it?

new TermGroupSelector(groupField), groupSort, topDocs);
return new FirstPassGroupingCollectorManager<>(
() -> new TermGroupSelector(groupField), groupSort, topDocs)
.newCollector();

Can usages like these also be replaced? Presumably the test then runs a search with the collector it gets from the collector manager. Can we pass the collector manager to the search directly instead?

2 participants