Integrate syntax errors with error report #5371

Draft
ritvibhatt wants to merge 22 commits into opensearch-project:main from
ritvibhatt:syntax-exception-error-message

@ritvibhatt
Contributor

@ritvibhatt ritvibhatt commented Apr 20, 2026

Description

Integrates syntax errors with the error reporting infrastructure and adds suggestions for common syntax error patterns.

Suggestion System

  • Added SyntaxErrorSuggestionRegistry with pattern-based providers:
    • SelectStarSuggestionProvider: Suggests PPL syntax when SQL is used in PPL context
    • UnmatchedParenthesesSuggestionProvider: Detects parentheses mismatches
    • UnquotedTableNameSuggestionProvider: Suggests backticks for special characters in table names
    • ExpectedTokensSuggestionProvider: Falls back to ANTLR's expected tokens
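
The provider list above can be pictured as a priority-ordered chain with a catch-all at the end. The sketch below is illustrative only: SuggestionProvider, SelectProvider, FallbackProvider, and SuggestionChain are hypothetical names, the providers look at the raw query string rather than an ANTLR context, and returning the first matching provider's suggestions is an assumption about the registry's behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical provider contract; the interface in the PR may differ. */
interface SuggestionProvider {
  int priority(); // lower value is consulted first

  List<String> suggest(String query); // empty list means "no match"
}

/** Flags SQL-style SELECT at the start of a query. */
class SelectProvider implements SuggestionProvider {
  @Override
  public int priority() {
    return 10;
  }

  @Override
  public List<String> suggest(String query) {
    if (query.trim().toLowerCase(Locale.ROOT).startsWith("select")) {
      return List.of("This looks like SQL; PPL queries usually begin with source=<index>");
    }
    return List.of();
  }
}

/** Catch-all that stands in for the expected-tokens fallback. */
class FallbackProvider implements SuggestionProvider {
  @Override
  public int priority() {
    return Integer.MAX_VALUE;
  }

  @Override
  public List<String> suggest(String query) {
    return List.of("Check the query against the grammar");
  }
}

class SuggestionChain {
  private final List<SuggestionProvider> providers;

  SuggestionChain(List<SuggestionProvider> initial) {
    providers = new ArrayList<>(initial);
    providers.sort(Comparator.comparingInt(SuggestionProvider::priority));
  }

  /** Returns the suggestions of the first provider that matches (assumed behavior). */
  List<String> findSuggestions(String query) {
    for (SuggestionProvider provider : providers) {
      List<String> suggestions = provider.suggest(query);
      if (!suggestions.isEmpty()) {
        return suggestions;
      }
    }
    return List.of();
  }
}
```

With both providers registered, a query starting with SELECT gets the SQL hint, and any other query falls through to the catch-all.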

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Contributor

github-actions Bot commented Apr 20, 2026

PR Reviewer Guide 🔍

(Review updated until commit 24d398c)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Introduce syntax error suggestion provider framework

Relevant files:

  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorContext.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionProvider.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SelectStarSuggestionProvider.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/UnmatchedParenthesesSuggestionProvider.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/UnquotedTableNameSuggestionProvider.java
  • common/src/main/java/org/opensearch/sql/common/antlr/suggestion/ExpectedTokensSuggestionProvider.java
  • sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/ContextFactory.java
  • sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/ExpectedTokensSuggestionProviderTest.java
  • sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/SyntaxErrorSuggestionRegistryTest.java
  • sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/UnquotedTableNameSuggestionProviderTest.java

Sub-PR theme: Replace SyntaxCheckException with ErrorReport across parsers and handlers

Relevant files:

  • common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
  • api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java
  • async-query-core/src/main/java/org/opensearch/sql/spark/utils/SQLQueryUtils.java
  • legacy/src/main/java/org/opensearch/sql/legacy/plugin/RestSQLQueryAction.java
  • api/src/test/java/org/opensearch/sql/api/UnifiedQueryPlannerTest.java
  • api/src/test/java/org/opensearch/sql/api/parser/UnifiedQueryParserTest.java
  • legacy/src/test/java/org/opensearch/sql/legacy/plugin/RestSQLQueryActionTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLReplaceTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/parser/AstExpressionBuilderTest.java
  • sql/src/test/java/org/opensearch/sql/common/antlr/SyntaxParserTestBase.java
  • sql/src/test/java/org/opensearch/sql/sql/antlr/SQLSyntaxParserTest.java

⚡ Recommended focus areas for review

Global Mutable State

The PROVIDERS list is a static mutable field. Tests that call register() will permanently add providers to the global registry, potentially causing test pollution and non-deterministic behavior across test runs. There is no way to reset or unregister providers.

private static final CopyOnWriteArrayList<SyntaxErrorSuggestionProvider> PROVIDERS =
    new CopyOnWriteArrayList<>();

static {
  register(
      new SelectStarSuggestionProvider(),
      new UnmatchedParenthesesSuggestionProvider(),
      new UnquotedTableNameSuggestionProvider(),
      new ExpectedTokensSuggestionProvider());
}

private SyntaxErrorSuggestionRegistry() {}

public static void register(SyntaxErrorSuggestionProvider... providers) {
  PROVIDERS.addAll(Arrays.asList(providers));
  PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
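
One way to avoid the shared state described above is to hold providers in an immutable instance, so each test builds its own registry. This is a sketch under assumptions, not the PR's design: ProviderRegistry, NamedProvider, withDefaults, and with are hypothetical names, and the priorities are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a provider; only name and priority matter here. */
class NamedProvider {
  final String name;
  final int priority;

  NamedProvider(String name, int priority) {
    this.name = name;
    this.priority = priority;
  }
}

/** Registry held as an immutable instance rather than in a static field. */
class ProviderRegistry {
  private final List<NamedProvider> providers;

  private ProviderRegistry(List<NamedProvider> unsorted) {
    List<NamedProvider> sorted = new ArrayList<>(unsorted);
    sorted.sort(Comparator.comparingInt((NamedProvider p) -> p.priority));
    this.providers = Collections.unmodifiableList(sorted);
  }

  /** Fresh registry containing only the (made-up) defaults. */
  static ProviderRegistry withDefaults() {
    return new ProviderRegistry(
        List.of(
            new NamedProvider("select-star", 10),
            new NamedProvider("expected-tokens", Integer.MAX_VALUE)));
  }

  /** Returns a new registry with one more provider; the receiver is unchanged. */
  ProviderRegistry with(NamedProvider extra) {
    List<NamedProvider> copy = new ArrayList<>(providers);
    copy.add(extra);
    return new ProviderRegistry(copy);
  }

  List<NamedProvider> providers() {
    return providers;
  }
}
```

A test that needs a stub calls withDefaults().with(stub) and discards the result afterwards, so nothing leaks into other tests.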

Weakened Test Assertion

The test previously asserted exception instanceof SyntaxCheckException and now asserts exception instanceof ErrorReport. Since ErrorReport is a broader type, this change may hide regressions where a different ErrorReport subtype (unrelated to syntax errors) triggers the fallback handler. Consider also asserting the error code is ErrorCode.SYNTAX_ERROR.

assertTrue(exception instanceof ErrorReport);
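
A minimal sketch of the stricter check suggested above. Report, Code, and isSyntaxError are hypothetical stand-ins, because the real ErrorReport and ErrorCode classes are not shown in this review; only getCode() and SYNTAX_ERROR are taken from the excerpts.

```java
/** Hypothetical stand-in for ErrorCode. */
enum Code {
  SYNTAX_ERROR,
  OTHER
}

/** Hypothetical stand-in for ErrorReport: a runtime exception carrying a code. */
class Report extends RuntimeException {
  private final Code code;

  Report(Code code, String message) {
    super(message);
    this.code = code;
  }

  Code getCode() {
    return code;
  }
}

class SyntaxErrorCheck {
  /** True only when the throwable is a Report that carries the syntax-error code. */
  static boolean isSyntaxError(Throwable t) {
    return t instanceof Report && ((Report) t).getCode() == Code.SYNTAX_ERROR;
  }
}
```

Asserting on the code as well as the type means a report with an unrelated code no longer satisfies the test.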

Grammar File Name Coupling

The PPL parser detection relies on getGrammarFileName().contains("PPL"), which is fragile and tightly coupled to the grammar file naming convention. If the grammar file is renamed or the check is applied to a different parser, this condition will silently fail to match.

  || !context.getRecognizer().getGrammarFileName().contains("PPL")) {
return List.of();
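
A typed marker is one alternative to matching on the grammar file name. The sketch below assumes the context can be given an explicit language field; QueryLanguage, LanguageAwareContext, and PplOnlyProvider are hypothetical names, not types from the PR.

```java
import java.util.List;

/** Hypothetical marker for the language a parser handles. */
enum QueryLanguage {
  SQL,
  PPL
}

/** Hypothetical context that carries the marker explicitly. */
class LanguageAwareContext {
  private final QueryLanguage language;
  private final String query;

  LanguageAwareContext(QueryLanguage language, String query) {
    this.language = language;
    this.query = query;
  }

  QueryLanguage getLanguage() {
    return language;
  }

  String getQuery() {
    return query;
  }
}

class PplOnlyProvider {
  /** Answers only for PPL contexts; no string matching on parser or grammar names. */
  List<String> suggest(LanguageAwareContext context) {
    if (context.getLanguage() != QueryLanguage.PPL) {
      return List.of();
    }
    return List.of("PPL-specific hint for: " + context.getQuery());
  }
}
```

Renaming the grammar file would not affect this check, since the marker is set where the parser is constructed.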

Only First Suggestion Used

The registry can return multiple suggestions per provider, but only customSuggestions.get(0) is ever used. If a provider returns multiple useful suggestions, all but the first are silently discarded. Consider whether all suggestions should be included in the ErrorReport.

if (!customSuggestions.isEmpty()) {
  // Use the first suggestion from the registry
  reportBuilder.suggestion(customSuggestions.get(0));
}
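
If all suggestions should reach the user, one simple option is to join them into a single message. This is only a sketch: SuggestionJoiner is a hypothetical helper, and whether the report builder accepts more than one suggestion is not shown in this review.

```java
import java.util.List;

class SuggestionJoiner {
  /** Joins every suggestion into one message instead of keeping only the first. */
  static String joinAll(List<String> suggestions) {
    return String.join("; ", suggestions);
  }
}
```

The joined string could then be passed to the single suggestion slot in place of customSuggestions.get(0).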

Test Side Effect

The test registers StubProvider instances into the global static SyntaxErrorSuggestionRegistry. These providers persist for the lifetime of the JVM and will affect other tests that rely on findSuggestions. This can cause flaky or order-dependent test failures.

SyntaxErrorSuggestionRegistry.register(highPrioritySuggestion, lowPrioritySuggestion);

@github-actions
Contributor

github-actions Bot commented Apr 20, 2026

PR Code Suggestions ✨

Latest suggestions up to 24d398c

Explore these optional code suggestions:

Category    Suggestion    Impact
Possible issue
Prevent legacy exception from being swallowed by generic catch

The parser now throws ErrorReport instead of SyntaxCheckException, so the
SyntaxCheckException branch is never reached for parser errors. ErrorReport
presumably extends RuntimeException, so it must stay in the multi-catch; otherwise
the catch (Exception e) branch would wrap it in IllegalStateException and lose the
original structured error. Keeping SyntaxCheckException in the multi-catch when it
is no longer thrown directly is misleading; verify whether it can still be thrown
from non-parser paths, and if not, remove it.

api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java [65-69]

-} catch (SyntaxCheckException | UnsupportedOperationException | ErrorReport e) {
+} catch (UnsupportedOperationException | ErrorReport e) {
   throw e;
+} catch (SyntaxCheckException e) {
+  // Legacy path: re-wrap as ErrorReport for uniform handling
+  throw ErrorReport.wrap(e).code(ErrorCode.SYNTAX_ERROR).build();
 } catch (Exception e) {
   throw new IllegalStateException("Failed to plan query", e);
 }

Suggestion importance [1-10]: 5

Why: The concern about SyntaxCheckException being a dead branch is valid since the parser now throws ErrorReport instead. The improved_code introduces a re-wrapping path for SyntaxCheckException to ErrorReport, which is a reasonable approach to maintain uniform error handling. However, it references ErrorCode without confirming the import exists in this file.

Impact: Low

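
The effect of catch ordering discussed in this suggestion can be shown with stand-in types. StructuredError and PlannerSketch are hypothetical; the sketch assumes, as the suggestion does, that the structured error is an unchecked exception.

```java
/** Hypothetical stand-in for ErrorReport, assumed to be unchecked. */
class StructuredError extends RuntimeException {
  StructuredError(String message) {
    super(message);
  }
}

class PlannerSketch {
  /** Runs a planning step; structured errors pass through, everything else is wrapped. */
  static void plan(Runnable step) {
    try {
      step.run();
    } catch (StructuredError | UnsupportedOperationException e) {
      throw e; // rethrown unchanged, so callers still see the structured error
    } catch (Exception e) {
      throw new IllegalStateException("Failed to plan query", e);
    }
  }
}
```

If StructuredError were dropped from the first catch, it would fall into the generic branch and reach callers as an IllegalStateException.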
Fix non-atomic sort on concurrent provider registration

The register method re-sorts the entire PROVIDERS list on every call: once for the
defaults registered in the static initializer and again for any later register call
(e.g., from tests). CopyOnWriteArrayList.sort is atomic on its own, but addAll
followed by sort is not, so a concurrent register call or reader can observe the new
providers before they are sorted into place. Consider building a sorted copy and
publishing it in a single step, or use a TreeMap/PriorityQueue keyed by priority.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [29-32]

 public static void register(SyntaxErrorSuggestionProvider... providers) {
-  PROVIDERS.addAll(Arrays.asList(providers));
-  PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
+  List<SyntaxErrorSuggestionProvider> toAdd = Arrays.asList(providers);
+  PROVIDERS.addAll(toAdd);
+  // Sort a snapshot and replace atomically to avoid partial-sort visibility
+  List<SyntaxErrorSuggestionProvider> sorted = new ArrayList<>(PROVIDERS);
+  sorted.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
+  // Replace contents (CopyOnWriteArrayList has no atomic replaceAll, so use clear+addAll under lock)
+  synchronized (PROVIDERS) {
+    PROVIDERS.clear();
+    PROVIDERS.addAll(sorted);
+  }
 }

Suggestion importance [1-10]: 4

Why: The concurrency concern is valid since CopyOnWriteArrayList.sort is not atomic with respect to concurrent register calls. However, in practice this registry is only populated at static initialization and in tests, making true concurrent registration unlikely. The suggested fix using synchronized blocks alongside CopyOnWriteArrayList is also somewhat inconsistent.

Impact: Low

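
An alternative to clear-then-addAll is to keep an immutable sorted list behind an AtomicReference and swap in a new list on each registration, so readers never see an empty or half-sorted state. This is a sketch, not the PR's code: SnapshotRegistry is a hypothetical name, and plain Integer priorities stand in for providers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class SnapshotRegistry {
  // Readers always see a complete, sorted, immutable list.
  private final AtomicReference<List<Integer>> priorities =
      new AtomicReference<>(Collections.emptyList());

  /** Builds a sorted copy and publishes it atomically; retried on contention. */
  void register(Integer... added) {
    priorities.updateAndGet(
        current -> {
          List<Integer> next = new ArrayList<>(current);
          Collections.addAll(next, added);
          next.sort(Comparator.naturalOrder());
          return Collections.unmodifiableList(next);
        });
  }

  List<Integer> snapshot() {
    return priorities.get();
  }
}
```

Because the update function has no side effects, it is safe for updateAndGet to re-run it when two registrations race.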
General
Remove unreachable legacy exception branch

Since SyntaxAnalysisErrorListener now always throws ErrorReport (wrapping
SyntaxCheckException) instead of SyntaxCheckException directly, the first branch e
instanceof SyntaxCheckException will never be true for parser-originated errors. The
dead branch is harmless today but misleading and could mask future regressions.
Consider removing the SyntaxCheckException branch or replacing it with a check on
the wrapped cause to keep the intent clear.

legacy/src/main/java/org/opensearch/sql/legacy/plugin/RestSQLQueryAction.java [136-143]

-if (e instanceof SyntaxCheckException
-    || e instanceof UnsupportedCursorRequestException
+if (e instanceof UnsupportedCursorRequestException
     || (e instanceof ErrorReport
-        && ((ErrorReport) e).getCode() == ErrorCode.SYNTAX_ERROR)) {
+        && ((ErrorReport) e).getCode() == ErrorCode.SYNTAX_ERROR)
+    || (e instanceof SyntaxCheckException)) {  // kept only for non-parser callers
   fallBackHandler.accept(channel, e);

Suggestion importance [1-10]: 4

Why: The observation that SyntaxCheckException is no longer thrown directly by the parser (now wrapped in ErrorReport) is accurate, making that branch potentially dead code. However, the improved_code doesn't actually remove the branch — it just reorders it with a comment, which doesn't address the stated concern.

Impact: Low

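
The "check on the wrapped cause" mentioned in this suggestion could look roughly like the helper below. hasCause is a hypothetical name, and standard JDK exceptions stand in for ErrorReport and SyntaxCheckException.

```java
class CauseCheck {
  /** True if the throwable itself, or any exception in its cause chain, has the given type. */
  static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
    for (Throwable current = t; current != null; current = current.getCause()) {
      if (type.isInstance(current)) {
        return true;
      }
    }
    return false;
  }
}
```

A single call covers both the legacy case (the exception is thrown directly) and the new case (it arrives as the cause of a wrapper).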
Use class name instead of grammar file name for robustness

getGrammarFileName() returns the .g4 source file name baked into the generated
parser (e.g. "OpenSearchPPLParser.g4"). This string comparison is fragile: a grammar
rename or a different Calcite-based recognizer would silently stop returning
suggestions. Consider checking the recognizer's class name instead, or exposing a
typed marker on the context.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SelectStarSuggestionProvider.java [20-23]

-if (!context.getRecognizer().getGrammarFileName().contains("PPL")) {
+if (!context.getRecognizer().getClass().getSimpleName().contains("PPL")) {
   return List.of();
 }

Suggestion importance [1-10]: 4

Why: The suggestion to use getClass().getSimpleName() instead of getGrammarFileName() is a reasonable robustness improvement, as grammar file names could change. However, both approaches are string-based and similarly fragile; a typed marker would be more robust but requires more refactoring.

Impact: Low

Previous suggestions

Suggestions up to commit f02c2a9
Category    Suggestion    Impact
Possible issue
Isolate static registry state between tests

This test registers providers into the shared static SyntaxErrorSuggestionRegistry
without cleanup, which will pollute the registry for all subsequent tests in the
same JVM run. The stub providers always return suggestions regardless of context, so
they will interfere with other tests that rely on findSuggestions. Add a @AfterEach
or @BeforeEach reset step using a package-private reset method.

sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/SyntaxErrorSuggestionRegistryTest.java [20-23]

+@org.junit.jupiter.api.BeforeEach
+void resetRegistry() {
+  SyntaxErrorSuggestionRegistry.reset(); // package-private helper
+}
+
+@Test
 void lowerPriorityProviderWinsOverHigherPriorityProvider() {
   StubProvider lowPrioritySuggestion = new StubProvider("low-wins", 1);
   StubProvider highPrioritySuggestion = new StubProvider("high-loses", Integer.MAX_VALUE - 1);
   SyntaxErrorSuggestionRegistry.register(highPrioritySuggestion, lowPrioritySuggestion);

Suggestion importance [1-10]: 7

Why: The test registers stub providers into the shared static SyntaxErrorSuggestionRegistry without cleanup, which pollutes the registry for subsequent tests. Since the stubs always return suggestions regardless of context, this is a real test isolation issue that could cause flaky tests.

Impact: Medium

Prevent duplicate provider accumulation in registry

The register method adds providers to a shared static CopyOnWriteArrayList without
any deduplication check. In tests (e.g., SyntaxErrorSuggestionRegistryTest), calling
register multiple times will keep accumulating providers, potentially causing stale
or duplicate suggestions across test runs. Consider adding a guard or providing a
way to reset the registry in tests.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [25-28]

 public static void register(SyntaxErrorSuggestionProvider... providers) {
-  PROVIDERS.addAll(Arrays.asList(providers));
+  for (SyntaxErrorSuggestionProvider p : providers) {
+    if (!PROVIDERS.contains(p)) {
+      PROVIDERS.add(p);
+    }
+  }
   PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
 }
 
+/** Visible for testing only. Resets the registry to its initial state. */
+static void reset() {
+  PROVIDERS.clear();
+  register(new UnquotedTableNameSuggestionProvider(), new ExpectedTokensSuggestionProvider());
+}
+

Suggestion importance [1-10]: 6

Why: The static PROVIDERS list accumulates providers on every register() call without deduplication, which can cause test pollution and duplicate suggestions. Adding a reset() method and deduplication guard would improve test isolation and correctness.

Impact: Low

General
Remove duplicated fallback suggestion logic

The SyntaxErrorSuggestionRegistry already includes ExpectedTokensSuggestionProvider
as a fallback (with Integer.MAX_VALUE priority), so the manual fallback logic in
SyntaxAnalysisErrorListener duplicates that behavior. This duplication can lead to
inconsistent suggestion formatting and makes the registry pattern redundant. Remove
the manual else if (e != null) fallback block and rely solely on the registry.

common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java [79-95]

 if (!customSuggestions.isEmpty()) {
   // Use the first suggestion from the registry
   reportBuilder.suggestion(customSuggestions.get(0));
-} else if (e != null) {
-  // Fall back to expected tokens as suggestion if no pattern matches
-  IntervalSet possibleContinuations = e.getExpectedTokens();
-  List<String> suggestions = topSuggestions(recognizer, possibleContinuations);
-  if (!suggestions.isEmpty()) {
-    String suggestionText =
-        possibleContinuations.size() > SUGGESTION_TRUNCATION_THRESHOLD
-            ? String.format(
-                "Expected one of %d possible tokens. Examples: %s",
-                possibleContinuations.size(), String.join(", ", suggestions))
-            : "Expected tokens: " + String.join(", ", suggestions);
-    reportBuilder.suggestion(suggestionText);
-  }
 }

Suggestion importance [1-10]: 6

Why: The ExpectedTokensSuggestionProvider is already registered in SyntaxErrorSuggestionRegistry as a fallback, making the manual else if (e != null) block in SyntaxAnalysisErrorListener redundant. However, the formatting differs slightly between the two implementations, so removing it requires verifying behavioral equivalence.

Impact: Low

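
To compare the two fallback paths, the listener's message formatting can be isolated into a pure function and tested against the provider's output. The sketch below copies the two message templates from the diff; ExpectedTokensText is a hypothetical name, and the threshold value 5 is an assumption because SUGGESTION_TRUNCATION_THRESHOLD is not shown in this review.

```java
import java.util.List;

class ExpectedTokensText {
  // Assumed value; the real SUGGESTION_TRUNCATION_THRESHOLD is not shown in the review.
  static final int TRUNCATION_THRESHOLD = 5;

  /**
   * Formats the fallback suggestion.
   *
   * @param totalExpected number of tokens the parser would have accepted
   * @param shown the token names chosen for display
   */
  static String format(int totalExpected, List<String> shown) {
    if (shown.isEmpty()) {
      return "";
    }
    String joined = String.join(", ", shown);
    return totalExpected > TRUNCATION_THRESHOLD
        ? String.format(
            "Expected one of %d possible tokens. Examples: %s", totalExpected, joined)
        : "Expected tokens: " + joined;
  }
}
```

Running the same inputs through both implementations would show whether removing the listener's branch changes any user-visible text.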
Align fallback handler with new exception hierarchy

Since SyntaxAnalysisErrorListener now always throws ErrorReport (wrapping
SyntaxCheckException as the cause) instead of SyntaxCheckException directly, the e
instanceof SyntaxCheckException branch will never be true for parser-generated
errors. The SyntaxCheckException check should be removed or replaced with a check on
the cause to avoid dead code and potential missed fallbacks.

legacy/src/main/java/org/opensearch/sql/legacy/plugin/RestSQLQueryAction.java [136-143]

-if (e instanceof SyntaxCheckException
-    || e instanceof UnsupportedCursorRequestException
+if (e instanceof UnsupportedCursorRequestException
     || (e instanceof ErrorReport
-        && ((ErrorReport) e).getCode() == ErrorCode.SYNTAX_ERROR)) {
+        && ((ErrorReport) e).getCode() == ErrorCode.SYNTAX_ERROR)
+    || (e instanceof SyntaxCheckException)) {
   fallBackHandler.accept(channel, e);
 }

Suggestion importance [1-10]: 4

Why: The e instanceof SyntaxCheckException check may be dead code since SyntaxAnalysisErrorListener now throws ErrorReport instead of SyntaxCheckException directly. However, the improved code in the suggestion keeps SyntaxCheckException as a separate branch, which doesn't meaningfully change the logic and the original code already handles this via ErrorReport.

Impact: Low

Suggestions up to commit 2f16563
Category    Suggestion    Impact
General
Remove duplicated fallback suggestion logic

The ExpectedTokensSuggestionProvider is already registered in
SyntaxErrorSuggestionRegistry as a fallback provider, so the else if (e != null)
branch in SyntaxAnalysisErrorListener duplicates the fallback logic. This means the
expected-tokens suggestion could be generated twice or the registry's
ExpectedTokensSuggestionProvider is redundant. The fallback logic should be handled
exclusively by the registry to avoid duplication.

common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java [79-95]

+List<String> customSuggestions = SyntaxErrorSuggestionRegistry.findSuggestions(context);
 if (!customSuggestions.isEmpty()) {
-  // Use the first suggestion from the registry
   reportBuilder.suggestion(customSuggestions.get(0));
-} else if (e != null) {
-  // Fall back to expected tokens as suggestion if no pattern matches
-  IntervalSet possibleContinuations = e.getExpectedTokens();
-  List<String> suggestions = topSuggestions(recognizer, possibleContinuations);
-  ...
 }

Suggestion importance [1-10]: 6

Why: The suggestion correctly identifies that the else if (e != null) branch in SyntaxAnalysisErrorListener duplicates the ExpectedTokensSuggestionProvider logic already registered in SyntaxErrorSuggestionRegistry. This redundancy could lead to inconsistent behavior and should be consolidated.

Impact: Low

Fix non-atomic concurrent registration race condition

The register method uses CopyOnWriteArrayList for thread safety during iteration,
but addAll followed by sort is not atomic. Concurrent calls to register could
interleave, resulting in a partially sorted or inconsistent list. Consider
synchronizing the register method to ensure atomicity of the add-and-sort operation.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [25-28]

-public static void register(SyntaxErrorSuggestionProvider... providers) {
+public static synchronized void register(SyntaxErrorSuggestionProvider... providers) {
   PROVIDERS.addAll(Arrays.asList(providers));
   PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
 }

Suggestion importance [1-10]: 4

Why: The suggestion correctly identifies a potential race condition between addAll and sort in the register method. However, this is a static registry that is only populated at startup (via the static block), so concurrent registration is unlikely in practice, making this a low-priority concern.

Impact: Low

Remove unreachable dead code in catch clause

Since ErrorReport is now thrown instead of SyntaxCheckException by the parser,
catching SyntaxCheckException here is dead code (it will never be thrown by the
parser anymore). If SyntaxCheckException can still be thrown from other code paths,
this should be documented; otherwise it should be removed to avoid confusion.

api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java [65-69]

-} catch (SyntaxCheckException | UnsupportedOperationException | ErrorReport e) {
+} catch (UnsupportedOperationException | ErrorReport e) {
   throw e;
 } catch (Exception e) {
   throw new IllegalStateException("Failed to plan query", e);
 }

Suggestion importance [1-10]: 4

Why: The suggestion correctly notes that SyntaxCheckException may be dead code in the catch clause since the parser now throws ErrorReport. However, SyntaxCheckException could still be thrown from other code paths, so removing it without full analysis could introduce regressions.

Impact: Low

Possible issue
Remove unreachable dead code condition

The parser now throws ErrorReport with SyntaxCheckException as its cause, so e
instanceof SyntaxCheckException is never true for parser errors and the first
condition is dead code. The ErrorReport condition, which checks only the direct
cause, now carries the fallback logic on its own. Consider relying solely on that
check, or verify whether SyntaxCheckException can still be thrown directly
elsewhere.

legacy/src/main/java/org/opensearch/sql/legacy/plugin/RestSQLQueryAction.java [135-139]

-if (e instanceof SyntaxCheckException
-    || e instanceof UnsupportedCursorRequestException
+if (e instanceof UnsupportedCursorRequestException
     || (e instanceof ErrorReport
         && ((ErrorReport) e).getCause() instanceof SyntaxCheckException)) {
   fallBackHandler.accept(channel, e);

Suggestion importance [1-10]: 5

Why: The suggestion correctly identifies that e instanceof SyntaxCheckException may be dead code since the parser now throws ErrorReport wrapping SyntaxCheckException. However, SyntaxCheckException might still be thrown from other code paths not changed in this PR, so this is a moderate concern rather than a critical bug.

Impact: Low

Suggestions up to commit c88bd44
Category    Suggestion    Impact
General
Remove duplicated fallback suggestion logic

The ExpectedTokensSuggestionProvider is already registered in
SyntaxErrorSuggestionRegistry and handles the fallback expected-tokens logic. The
else if (e != null) branch in SyntaxAnalysisErrorListener.syntaxError duplicates
this logic, which means the fallback suggestion can be generated twice or
inconsistently. Remove the duplicate else if branch and rely solely on the registry
(which includes ExpectedTokensSuggestionProvider as a fallback).

common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java [74-97]

 // Use the suggestion registry to find pattern-based suggestions
 SyntaxErrorContext context =
     new SyntaxErrorContext(recognizer, offendingToken, tokens, query, e);
 List<String> customSuggestions = SyntaxErrorSuggestionRegistry.findSuggestions(context);
 
 if (!customSuggestions.isEmpty()) {
-  // Use the first suggestion from the registry
   reportBuilder.suggestion(customSuggestions.get(0));
-} else if (e != null) {
-  // Fall back to expected tokens as suggestion if no pattern matches
-  IntervalSet possibleContinuations = e.getExpectedTokens();
-  List<String> suggestions = topSuggestions(recognizer, possibleContinuations);
-  if (!suggestions.isEmpty()) {
-    String suggestionText =
-        possibleContinuations.size() > SUGGESTION_TRUNCATION_THRESHOLD
-            ? String.format(
-                "Expected one of %d possible tokens. Examples: %s",
-                possibleContinuations.size(), String.join(", ", suggestions))
-            : "Expected tokens: " + String.join(", ", suggestions);
-    reportBuilder.suggestion(suggestionText);
-  }
 }

Suggestion importance [1-10]: 6

Why: The ExpectedTokensSuggestionProvider is already registered in the registry and handles the fallback logic. The else if (e != null) branch duplicates this, potentially causing inconsistent behavior. However, the duplicate logic uses topSuggestions() which may differ slightly from ExpectedTokensSuggestionProvider, so this needs careful verification.

Impact: Low

Prevent global registry pollution between tests

The test registers stub providers into the global static
SyntaxErrorSuggestionRegistry, which is shared across all tests. This pollutes the
registry for other tests running in the same JVM, potentially causing flaky
failures. The test should use a local registry instance or clean up after itself.

sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/SyntaxErrorSuggestionRegistryTest.java [23-29]

 SyntaxErrorSuggestionRegistry.register(highPrioritySuggestion, lowPrioritySuggestion);
+try {
+  SyntaxErrorContext ctx = ContextFactory.contextFor("SELECT FROM t");
+  List<String> suggestions = SyntaxErrorSuggestionRegistry.findSuggestions(ctx);
+  assertEquals("low-wins", suggestions.get(0));
+} finally {
+  // Clean up registered stubs to avoid polluting other tests
+  // (requires exposing an unregister/reset method on the registry)
+  SyntaxErrorSuggestionRegistry.reset();
+}
 
-// Provide a context that both will match (both stubs ignore the context).
-SyntaxErrorContext ctx = ContextFactory.contextFor("SELECT FROM t");
-List<String> suggestions = SyntaxErrorSuggestionRegistry.findSuggestions(ctx);
-
-assertEquals("low-wins", suggestions.get(0));
-

Suggestion importance [1-10]: 4

Why: The test registers stub providers into the global static registry without cleanup, which can pollute other tests. However, the improved_code references a SyntaxErrorSuggestionRegistry.reset() method that doesn't exist in the PR, making the suggestion incomplete as-is.

Impact: Low

Include actual token in suggestion message

The suggestion hardcodes the example hello+world regardless of the actual
offending token or table name in the query. The suggestion should include the actual
offending context (e.g., the token text or surrounding identifier) to be more
actionable and accurate for users.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/UnquotedTableNameSuggestionProvider.java [20-21]

 return List.of(
-    "Quote table names containing special characters with backticks, e.g. `hello+world`");
+    "Quote table names containing special characters with backticks, e.g. `table" + offending + "name`");

Suggestion importance [1-10]: 3

Why: The hardcoded example `hello+world` is not contextual. However, the improved_code is syntactically incorrect (string concatenation with an undefined offending variable) and doesn't accurately reflect a valid implementation, making this suggestion unreliable.

Impact: Low

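
A contextual version of this message needs the table name as an input rather than an undefined variable. The sketch below assumes the provider can recover the unquoted name from the query, which this review does not show; BacktickSuggestion and forTableName are hypothetical names.

```java
class BacktickSuggestion {
  /** Builds the suggestion using the table name that was actually written in the query. */
  static String forTableName(String tableName) {
    return "Quote table names containing special characters with backticks, e.g. `"
        + tableName
        + "`";
  }
}
```

For a query that references hello+world, the message ends with that same name wrapped in backticks; how the name is extracted from the token stream is left open here.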
Possible issue
Fix race condition in provider registration

CopyOnWriteArrayList.sort() is not atomic with respect to addAll(), so concurrent
calls to register() can result in a partially-sorted or inconsistent list. Since
register() is called both from the static initializer and from tests, this is a real
race condition. Use a synchronized block or replace with a thread-safe sorted
structure.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [25-28]

-public static void register(SyntaxErrorSuggestionProvider... providers) {
+public static synchronized void register(SyntaxErrorSuggestionProvider... providers) {
   PROVIDERS.addAll(Arrays.asList(providers));
   PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
 }

Suggestion importance [1-10]: 5

Why: The addAll and sort operations on CopyOnWriteArrayList are not atomic, creating a potential race condition during concurrent register() calls. Adding synchronized is a valid fix, though in practice this is only called from static initializers and tests, limiting real-world impact.

Impact: Low

Suggestions up to commit 1f9b08b
Category    Suggestion    Impact
Possible issue
Guard against negative stop index for EOF tokens

Token.getStopIndex() can return -1 (the review assumes this is the case for EOF
tokens), so end would be 0 and query.substring(0) would return the entire query
instead of an empty string. Add a guard for this case (token type Token.EOF or
stopIndex < 0).

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorContext.java [31-35]

 public String getRemainingQuery() {
   if (offendingToken == null) return "";
-  int end = offendingToken.getStopIndex() + 1;
+  int stopIndex = offendingToken.getStopIndex();
+  if (stopIndex < 0) return ""; // EOF token
+  int end = stopIndex + 1;
   return end >= query.length() ? "" : query.substring(end);
 }

Suggestion importance [1-10]: 7

Why: If Token.getStopIndex() returns -1, as the review expects for EOF tokens, getRemainingQuery() would return the entire query instead of an empty string. This is an edge case that could produce incorrect suggestions for queries that fail at EOF.

Impact: Medium

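
The guard can be checked without a parser by passing the stop index as a plain int. RemainingQuery.after is a hypothetical helper that mirrors the patched getRemainingQuery() body from the diff.

```java
class RemainingQuery {
  /**
   * Returns the text after the offending token.
   *
   * @param query the full query text
   * @param stopIndex index of the token's last character, or a negative value if unknown
   */
  static String after(String query, int stopIndex) {
    if (stopIndex < 0) {
      return ""; // no usable position, so there is nothing "after" the token
    }
    int end = stopIndex + 1;
    return end >= query.length() ? "" : query.substring(end);
  }
}
```

Without the first check, a stop index of -1 gives end = 0 and the whole query is returned.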
Fix non-atomic add-then-sort race condition

CopyOnWriteArrayList.sort() replaces the list's contents atomically, but addAll
followed by sort is not atomic — a concurrent findSuggestions call between the two
operations could observe a partially-updated, unsorted list. Use a synchronized
block or replace with a lock-based list to make the two operations atomic.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [25-28]

-public static void register(SyntaxErrorSuggestionProvider... providers) {
+public static synchronized void register(SyntaxErrorSuggestionProvider... providers) {
   PROVIDERS.addAll(Arrays.asList(providers));
   PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
 }
Suggestion importance[1-10]: 5

__

Why: The race condition between addAll and sort is a valid concern for concurrent usage, but in practice register is called only during static initialization and test setup, making this a low-risk issue. Adding synchronized is a reasonable defensive improvement.

Low
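
Note that `synchronized` on `register` only serializes writers; a reader iterating the list between `addAll` and `sort` could still see the unsorted state described above. One alternative, sketched here under assumptions, is to build a sorted copy and publish it through a `volatile` reference so readers only ever see a fully sorted snapshot. `SortedRegistrySketch`, `Provider`, and `current` are hypothetical stand-ins, not names from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortedRegistrySketch {

  /** Hypothetical stand-in for SyntaxErrorSuggestionProvider. */
  static final class Provider {
    final String name;
    final int priority;

    Provider(String name, int priority) {
      this.name = name;
      this.priority = priority;
    }
  }

  // Readers always see a complete, sorted, immutable snapshot.
  private static volatile List<Provider> providers = Collections.emptyList();

  /** Writers are serialized; the new list is sorted before it becomes visible. */
  static synchronized void register(Provider... added) {
    List<Provider> next = new ArrayList<>(providers);
    next.addAll(Arrays.asList(added));
    next.sort(Comparator.comparingInt((Provider p) -> p.priority));
    providers = Collections.unmodifiableList(next);
  }

  /** Lock-free read of the current snapshot. */
  static List<Provider> current() {
    return providers;
  }

  public static void main(String[] args) {
    register(new Provider("expected-tokens", Integer.MAX_VALUE), new Provider("select-star", 10));
    for (Provider p : current()) {
      System.out.println(p.priority + " " + p.name);
    }
  }
}
```

Since `register` is reportedly called only from static initializers and tests, the simpler `synchronized` fix may well be sufficient; the snapshot approach matters only if registration can overlap with lookups.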
General
Remove duplicated fallback suggestion logic

The ExpectedTokensSuggestionProvider is already registered in
SyntaxErrorSuggestionRegistry as a fallback (with Integer.MAX_VALUE priority), so
the else if (e != null) branch in SyntaxAnalysisErrorListener duplicates that logic.
This means the expected-tokens suggestion can be generated twice or the
registry-based one is always shadowed. Remove the duplicate else if branch and rely
solely on the registry.

common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java [77-95]

 List<String> customSuggestions = SyntaxErrorSuggestionRegistry.findSuggestions(context);
-
 if (!customSuggestions.isEmpty()) {
-  // Use the first suggestion from the registry
   reportBuilder.suggestion(customSuggestions.get(0));
-} else if (e != null) {
-  // Fall back to expected tokens as suggestion if no pattern matches
-  IntervalSet possibleContinuations = e.getExpectedTokens();
-  List<String> suggestions = topSuggestions(recognizer, possibleContinuations);
-  ...
 }
 
+throw reportBuilder.build();
+
Suggestion importance[1-10]: 6

__

Why: The else if (e != null) branch in SyntaxAnalysisErrorListener duplicates the logic already handled by ExpectedTokensSuggestionProvider registered in the registry, creating redundant code and potential inconsistency. Removing the duplicate branch would simplify the code and rely on the registry as the single source of suggestions.

Low
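
To illustrate why a fallback registered at `Integer.MAX_VALUE` makes a separate `else if` branch unnecessary, here is a small sketch. It assumes the registry tries providers in ascending priority order and that the fallback always produces a suggestion; the provider bodies, suggestion strings, and the names `FallbackSketch` and `firstSuggestion` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class FallbackSketch {

  /** Hypothetical provider: a priority plus a function that may yield a suggestion. */
  static final class Provider {
    final int priority;
    final Function<String, Optional<String>> suggest;

    Provider(int priority, Function<String, Optional<String>> suggest) {
      this.priority = priority;
      this.suggest = suggest;
    }
  }

  private static final List<Provider> PROVIDERS = new ArrayList<>();

  static {
    // Fallback: always answers, but sorts last because of its priority.
    PROVIDERS.add(new Provider(Integer.MAX_VALUE, q -> Optional.of("fallback: expected tokens")));
    // Pattern-based provider: answers only when the query looks like SQL.
    PROVIDERS.add(new Provider(10, q ->
        q.trim().toLowerCase().startsWith("select *")
            ? Optional.of("pattern: use PPL syntax")
            : Optional.<String>empty()));
    PROVIDERS.sort(Comparator.comparingInt((Provider p) -> p.priority));
  }

  /** Collects suggestions in priority order. */
  static List<String> findSuggestions(String query) {
    List<String> out = new ArrayList<>();
    for (Provider p : PROVIDERS) {
      p.suggest.apply(query).ifPresent(out::add);
    }
    return out;
  }

  /** The listener only needs the first entry; the fallback guarantees one exists. */
  static String firstSuggestion(String query) {
    return findSuggestions(query).get(0);
  }

  public static void main(String[] args) {
    System.out.println(firstSuggestion("SELECT * FROM t"));
    System.out.println(firstSuggestion("source=t | fields"));
  }
}
```

If the real fallback provider can return nothing (for example when no exception is available), the listener would still need an empty-list check, which the `isEmpty()` guard in the diff already provides.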
Prevent static registry pollution between tests

SyntaxErrorSuggestionRegistry.PROVIDERS is a static field shared across all tests.
Registering stub providers here permanently pollutes the registry for all subsequent
tests in the same JVM run, potentially causing flaky failures in other tests that
rely on the registry's default state. The test should save and restore the registry
state, or the registry should expose a reset/unregister mechanism for testing.

sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/SyntaxErrorSuggestionRegistryTest.java [20-29]

+// Consider adding a package-private reset method to SyntaxErrorSuggestionRegistry for tests:
+// static void resetToDefaults() { ... }
+// Then call it in @BeforeEach / @AfterEach to isolate test state.
 void lowerPriorityProviderWinsOverHigherPriorityProvider() {
   StubProvider lowPrioritySuggestion = new StubProvider("low-wins", 1);
   StubProvider highPrioritySuggestion = new StubProvider("high-loses", Integer.MAX_VALUE - 1);
   SyntaxErrorSuggestionRegistry.register(highPrioritySuggestion, lowPrioritySuggestion);
+  // ... rest of test ...
+  SyntaxErrorSuggestionRegistry.resetToDefaults(); // restore after test
+}
Suggestion importance[1-10]: 5

__

Why: Registering stub providers into the static SyntaxErrorSuggestionRegistry without cleanup can cause test pollution across the test suite. However, the improved_code only adds comments rather than actual implementation, making the suggestion incomplete.

Low
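
Since the suggestion above leaves the reset mechanism as comments, here is one way it could look. This is a sketch under assumptions, not the PR's registry: `resetToDefaults` is hypothetical (no reset method exists in the PR), and `ResettableRegistrySketch`, `Provider`, and `names` are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ResettableRegistrySketch {

  /** Hypothetical stand-in for SyntaxErrorSuggestionProvider. */
  static final class Provider {
    final String name;
    final int priority;

    Provider(String name, int priority) {
      this.name = name;
      this.priority = priority;
    }
  }

  // Providers that the static initializer would normally register.
  private static final List<Provider> DEFAULTS =
      Collections.unmodifiableList(
          Arrays.asList(new Provider("expected-tokens", Integer.MAX_VALUE)));

  private static final List<Provider> PROVIDERS = new ArrayList<>(DEFAULTS);

  static synchronized void register(Provider... added) {
    PROVIDERS.addAll(Arrays.asList(added));
    PROVIDERS.sort(Comparator.comparingInt((Provider p) -> p.priority));
  }

  /** Hypothetical test hook: drops everything registered after class initialization. */
  static synchronized void resetToDefaults() {
    PROVIDERS.clear();
    PROVIDERS.addAll(DEFAULTS);
  }

  static synchronized List<String> names() {
    List<String> names = new ArrayList<>();
    for (Provider p : PROVIDERS) {
      names.add(p.name);
    }
    return names;
  }

  public static void main(String[] args) {
    register(new Provider("low-wins", 1));
    try {
      System.out.println(names()); // stub sorts ahead of the default
    } finally {
      resetToDefaults(); // restore shared state even if an assertion fails
    }
    System.out.println(names());
  }
}
```

In a JUnit test the `finally` block would typically move into an `@AfterEach` method so every test in the class leaves the registry in its default state.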
Suggestions up to commit c6c83cc
Category | Suggestion | Impact
General
Remove duplicate fallback suggestion logic

The ExpectedTokensSuggestionProvider is already registered in
SyntaxErrorSuggestionRegistry as a fallback (with Integer.MAX_VALUE priority), so
the else if (e != null) branch in SyntaxAnalysisErrorListener duplicates that logic.
This means the expected-tokens suggestion can be generated twice or the registry's
fallback is bypassed. Remove the duplicate else if block and rely solely on the
registry for all suggestion generation.

common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java [79-95]

 // Use the suggestion registry to find pattern-based suggestions
 SyntaxErrorContext context =
     new SyntaxErrorContext(recognizer, offendingToken, tokens, query, e);
 List<String> customSuggestions = SyntaxErrorSuggestionRegistry.findSuggestions(context);
 
 if (!customSuggestions.isEmpty()) {
-  // Use the first suggestion from the registry
   reportBuilder.suggestion(customSuggestions.get(0));
-} else if (e != null) {
-  // Fall back to expected tokens as suggestion if no pattern matches
-  IntervalSet possibleContinuations = e.getExpectedTokens();
-  List<String> suggestions = topSuggestions(recognizer, possibleContinuations);
-  ...
 }
Suggestion importance[1-10]: 7

__

Why: The ExpectedTokensSuggestionProvider is already registered in the registry with Integer.MAX_VALUE priority as a fallback, so the else if (e != null) block in SyntaxAnalysisErrorListener duplicates that logic. This could lead to inconsistent behavior where the registry's fallback is bypassed. Removing the duplicate block would simplify the code and ensure all suggestion logic flows through the registry.

Medium
Verify and clarify redundant exception type check

Since ErrorReport is now thrown instead of SyntaxCheckException from the parser (as
shown in the PR), the SyntaxCheckException check here may be redundant for the
syntax error case. However, more critically, if ErrorReport wraps a
SyntaxCheckException as its cause, only ErrorReport needs to be checked. Verify that
SyntaxCheckException can still be thrown independently in other code paths; if not,
remove it to avoid dead code and confusion.

legacy/src/main/java/org/opensearch/sql/legacy/plugin/RestSQLQueryAction.java [135-138]

-if (e instanceof SyntaxCheckException
-    || e instanceof ErrorReport
+if (e instanceof ErrorReport
+    || e instanceof SyntaxCheckException
     || e instanceof UnsupportedCursorRequestException) {
   fallBackHandler.accept(channel, e);
 }
Suggestion importance[1-10]: 2

__

Why: The suggestion asks to verify whether SyntaxCheckException is still independently thrown, but the improved_code is essentially identical to the existing_code (just reordered), making this a low-value observation. The PR intentionally keeps SyntaxCheckException for backward compatibility with other code paths.

Low
Possible issue
Prevent global state pollution between tests

This test mutates the global static SyntaxErrorSuggestionRegistry.PROVIDERS list by
registering stub providers, which will persist across tests and can cause
interference with other tests (e.g., UnquotedTableNameSuggestionProviderTest or
ExpectedTokensSuggestionProviderTest). The test should either reset the registry
after the test or use a local registry instance to avoid polluting the shared state.

sql/src/test/java/org/opensearch/sql/sql/antlr/suggestion/SyntaxErrorSuggestionRegistryTest.java [20-29]

 void lowerPriorityProviderWinsOverHigherPriorityProvider() {
   StubProvider lowPrioritySuggestion = new StubProvider("low-wins", 1);
   StubProvider highPrioritySuggestion = new StubProvider("high-loses", Integer.MAX_VALUE - 1);
   SyntaxErrorSuggestionRegistry.register(highPrioritySuggestion, lowPrioritySuggestion);
+  try {
+    SyntaxErrorContext ctx = ContextFactory.contextFor("SELECT FROM t");
+    List<String> suggestions = SyntaxErrorSuggestionRegistry.findSuggestions(ctx);
+    assertEquals("low-wins", suggestions.get(0));
+  } finally {
+    // Reset registry to avoid polluting other tests
+    SyntaxErrorSuggestionRegistry.reset();
+  }
+}
Suggestion importance[1-10]: 6

__

Why: The test registers stub providers into the global static SyntaxErrorSuggestionRegistry.PROVIDERS list without cleanup, which can pollute state for other tests. However, the improved_code references a SyntaxErrorSuggestionRegistry.reset() method that doesn't exist in the PR, making the suggestion partially invalid as-is.

Low
Fix thread-safety of provider registration and sorting

CopyOnWriteArrayList.sort() is not atomic with respect to addAll(), so concurrent
calls to register() can produce a partially-sorted or inconsistent list. Since the
static initializer already registers the default providers, consider making
register() synchronized or using a different thread-safe approach to ensure the sort
is always consistent after concurrent modifications.

common/src/main/java/org/opensearch/sql/common/antlr/suggestion/SyntaxErrorSuggestionRegistry.java [25-28]

-public static void register(SyntaxErrorSuggestionProvider... providers) {
+public static synchronized void register(SyntaxErrorSuggestionProvider... providers) {
   PROVIDERS.addAll(Arrays.asList(providers));
   PROVIDERS.sort(Comparator.comparingInt(SyntaxErrorSuggestionProvider::getPriority));
 }
Suggestion importance[1-10]: 5

__

Why: The addAll and sort operations on CopyOnWriteArrayList are not atomic, so concurrent register() calls could produce an inconsistently sorted list. Adding synchronized would fix this race condition, though in practice register() is typically only called during initialization.

Low

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 14475e5

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit c9ad1f2

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit e473f7f

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit f7cbe56

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 575202d

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 8e8ab9e

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 71ad3bb

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit bd46b1e

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 3cd01e4

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

github-actions Bot commented Apr 24, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 79f9b23.

| Path | Line | Severity | Description |
| --- | --- | --- | --- |
| scripts/docs_exporter/__pycache__/export_to_docs_website.cpython-314.pyc | 1 | medium | Binary compiled Python bytecode file committed to the repository. Its contents cannot be inspected in the diff. Python 3.14 is pre-release, making this unusual. Committing .pyc files is not standard practice and the file could contain arbitrary obfuscated logic not visible in code review. |
| test_cluster_output.java | 1 | low | Ad-hoc scratch test file committed to the repository root (not under src/ or test/). Likely an accidentally committed developer artifact. No malicious code found in its content, but its presence is anomalous. |
| test_error_format.java | 1 | low | Ad-hoc scratch test file committed to the repository root (not under src/ or test/). Likely an accidentally committed developer artifact. No malicious code found in its content, but its presence is anomalous. |

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 1 | Low: 2


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions

Persistent review updated to latest commit 078dc07

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 940310d

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 51952fb

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 283ebd9

- Add back ErrorReport handling in SQLQueryUtils.isFlintExtensionQuery()
- Fix API test expectations to use ErrorReport instead of SyntaxCheckException

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 79f9b23

@github-actions

Persistent review updated to latest commit c6c83cc

Remove debug/temporary files that were accidentally committed:
- cluster_demo_data.json
- cluster_demo_with_explanations.md
- playground_urls.md
- test_cluster_output.java
- test_error_format.java
- export_to_docs_website.cpython-314.pyc

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@ritvibhatt force-pushed the syntax-exception-error-message branch from c6c83cc to 1f9b08b (April 28, 2026 16:24)
@github-actions

Persistent review updated to latest commit 1f9b08b

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit c88bd44

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit 2f16563

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

Persistent review updated to latest commit f02c2a9

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 24d398c

