Make analytics-engine an optional dependency #5403
Open
bowenlan-amzn wants to merge 1 commit intoopensearch-project:feature/mustang-ppl-integrationfrom
Open
Make analytics-engine an optional dependency #5403bowenlan-amzn wants to merge 1 commit intoopensearch-project:feature/mustang-ppl-integrationfrom
bowenlan-amzn wants to merge 1 commit intoopensearch-project:feature/mustang-ppl-integrationfrom
Conversation
Contributor
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit e271ff6.
The table above displays the top 10 most important findings. Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
sql plugin previously declared analytics-engine as a hard dependency:
extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine']
Installing opensearch-sql on a distribution that doesn't ship
analytics-engine failed with:
Missing plugin [analytics-engine], dependency of [opensearch-sql]
Marking the dep ;optional=true alone is not enough — TransportPPLQueryAction
Guice-injects QueryPlanExecutor on its constructor, and Guice's OpenSearch
fork rejects a required constructor parameter whose binding is missing at
injector-build time ("constructors cannot be optional").
Move QueryPlanExecutor from a required constructor parameter to an
@Inject(optional=true) setter. Guice invokes the setter only when a binding
for QueryPlanExecutor<RelNode, Iterable<Object[]>> exists — i.e. when
analytics-engine's createGuiceModules has run and bound DefaultPlanExecutor.
Absent analytics-engine, the setter is silently skipped,
unifiedQueryHandler stays null, and all PPL queries route to the v2
Calcite-to-OpenSearch path already in the sql plugin.
Drop the bundlePlugin exclude list. OpenSearch's jar-hell check skips the
extended-plugin cross-check when the dep is marked optional
(PluginsService.java:763), so sql can bundle every jar it needs to run
self-sufficiently. When analytics-engine is installed, parent-first
classloader delegation still lets analytics-engine's copies win for any
shared class; sql's bundled copies sit idle.
Promote analytics-framework.jar from compileOnly to api so
QueryPlanExecutor is reachable from sql's own classloader when the plugin
is absent. analytics-engine.jar stays compileOnly (required only for
OpenSearchSchemaBuilder, which lives in the engine plugin and is reached
only through RestUnifiedQueryAction — a class that never loads when the
setter is skipped).
Validated on a live 2-node cluster in both configurations:
- With analytics-engine installed: legacy and analytics PPL both return
expected rows; routing to the analytics path still fires for
parquet_-prefixed indices.
- Without analytics-engine (only opensearch-job-scheduler + opensearch-sql
installed): cluster starts cleanly, PPL and SQL queries against Lucene
indices return expected rows, parquet_-prefixed lookups return a clean
IndexNotFoundException instead of a NullPointerException or
NoClassDefFoundError.
Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
mch2
reviewed
May 4, 2026
| * Absent analytics-engine, Guice silently skips the call; {@link #unifiedQueryHandler} | ||
| * stays null and all PPL routing falls through to the legacy engine. | ||
| */ | ||
| @Inject(optional = true) |
280d3b4 to
e271ff6
Compare
ahkcs
approved these changes
May 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Makes the
analytics-engineplugin an optional dependency ofopensearch-sql. Installingopensearch-sqlon a distribution withoutanalytics-enginenow succeeds; all PPL/SQL queries route through the in-plugin v2 Calcite engine (Lucene-backed). Whenanalytics-engineis installed, the existing Mustang path (parquet-backed indices →QueryPlanExecutor) continues to work.Approach
extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine;optional=true']QueryPlanExecutorfrom a constructor parameter to an@Inject(optional = true)setter. Guice invokes the setter only when the binding exists; absent analytics-engine it's silently skipped andunifiedQueryHandlerstays null. PPL routing then falls through to the v2 Calcite-to-OpenSearch path unconditionally.analytics-framework.jarfromcompileOnlytoapisoQueryPlanExecutor/SchemaProviderresolve from sql's own classpath whenanalytics-engineis absent.analytics-engine.jarstayscompileOnly— its classes (e.g.OpenSearchSchemaBuilder) are reached only throughRestUnifiedQueryAction, which never loads when the setter is skipped.bundlePlugin { exclude '*.jar' }block. OpenSearch's jar-hell check skips the extended-plugin cross-check for optional deps (PluginsService.java:763), so sql can bundle every transitive jar it needs to run standalone. Whenanalytics-engineis installed, parent-first classloader delegation still causes analytics-engine's copies to win for any shared class; sql's bundled copies sit idle. No class-identity drift.Addressing the diff analyzer findings
All three flagged items are intentional and follow documented OpenSearch plugin mechanisms:
;optional=truesyntax — parsed inPluginInfo.java:229("optional=true".equals(dependency[1])). Used in production elsewhere in the OpenSearch plugin ecosystem. Not ad-hoc syntax.apipromotion for analytics-framework — required so that sql's own classloader can resolveQueryPlanExecutor/SchemaProviderwhen analytics-engine is absent. When it IS present, the class-identity question is resolved by parent-first delegation: analytics-engine supplies the parent classloader (viaextendedPlugins), and its copy of every framework class wins. sql's bundled jar is dead code in that scenario.;optional=true,PluginsService.checkBundleJarHellskips that check (PluginsService.java:763). sql bundling them is now safe. Security-patched platform copies are not shadowed — sql ships its own copies only; nothing is overwritten on disk or in the classpath search path.Validation
Tested on a live 2-node cluster in both configurations:
With
analytics-engineinstalled:source=regular_test) → returns rows via Lucene path ✓source=parquet_test, 1 shard, parquet-backed) → returns 20 rows via Mustang path ✓Without
analytics-engine(onlyopensearch-job-scheduler+opensearch-sql):parquet_-prefixed lookup → cleanIndexNotFoundException, not NPE orNoClassDefFoundError✓Not in this PR
parquet_) is unchanged. A proper setting-based or data-format-based trigger is a separate discussion.analytics-engineside or in core OpenSearch. The mechanism uses only existing platform features.