Add PPL query cancellation via Discover cancel button #11593

Merged
mengweieric merged 5 commits into opensearch-project:main from ahkcs:feature/ppl-query-cancellation
Mar 26, 2026

@ahkcs
Contributor

@ahkcs ahkcs commented Mar 25, 2026

Summary

  • Generate a client-side UUID (queryId) for each synchronous PPL query and pass it through the full request pipeline to the OpenSearch /_plugins/_ppl endpoint
  • Add POST /api/enhancements/ppl/cancel server route that queries /_tasks?actions=*ppl*&detailed=true, matches tasks by queryId in the task description, and cancels all matching tasks via /_tasks/{taskId}/_cancel
  • Wire the existing Discover cancel button (abort flow) to call the new cancel route, enabling users to cancel long-running PPL queries from the UI
  • Each PPL search (data query and histogram aggregation) gets its own queryId, and both are cancelled independently when the user clicks cancel
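
The per-search queryId described in the last bullet might look roughly like the following. This is a minimal sketch rather than the PR's code: `buildPplRequestBody` and the `PplRequestBody` shape are hypothetical, the example queries are made up, and Node's `crypto.randomUUID` stands in for whichever UUID generator the interceptor actually uses.

```typescript
import { randomUUID } from 'crypto';

// Hypothetical request body shape; only `queryId` is named in the PR description.
interface PplRequestBody {
  query: string;
  queryId: string;
}

// Attach a fresh UUID to every PPL search so each one can be cancelled on its own.
function buildPplRequestBody(query: string): PplRequestBody {
  return { query, queryId: randomUUID() };
}

// The data query and the histogram aggregation are separate searches,
// so each gets its own queryId.
const dataBody = buildPplRequestBody('source = logs | head 100');
const histogramBody = buildPplRequestBody(
  'source = logs | stats count() by span(timestamp, 1h)'
);
console.log(dataBody.queryId !== histogramBody.queryId);
```

Because the two IDs differ, cancelling one search leaves the other untouched unless the caller cancels both, which is what the abort flow above does.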

Backend dependency

Depends on opensearch-project/sql#5254 (merged) which registers PPL queries as CancellableTask with the OpenSearch task framework.

Files changed

| File | Change |
| --- | --- |
| common/constants.ts | Added PPL_CANCEL API path |
| common/types.ts | Added queryId to EnhancedFetchContext |
| common/utils.ts | Pass queryId in request body; call cancel route on abort |
| public/search/ppl_search_interceptor.ts | Generate UUID per PPL search |
| server/routes/index.ts | Accept queryId in route schema; register cancel route |
| server/routes/ppl_cancel.ts | New cancel route implementation |
| server/routes/ppl_cancel.test.ts | Unit tests (7 cases) |
| server/utils/facet.ts | Forward queryId to OpenSearch PPL API |

Changelog

  • skip

E2E verification logs

Cancel both data + histogram queries (double join)

OSD server log — both tasks found and cancelled:

server log [18:29:34.131] [info][plugins][queryEnhancements] PPL query cancelled: queryId=ccfec0d5-9113-4489-8bd5-77ff64ae3c7f, taskId=cwL2x0XeS-i-g-kClRr0ew:2594
server log [18:29:34.134] [info][plugins][queryEnhancements] PPL query cancelled: queryId=0f863cc9-aab3-4589-9e88-191268f9f375, taskId=cwL2x0XeS-i-g-kClRr0ew:2593

OpenSearch backend log — both queries throw TaskCancelledException from cooperative isCancelled() check in moveNext():

[2026-03-25T11:29:34,131][ERROR][o.o.s.p.r.RestPPLQueryAction] Error happened during query handling
Caused by: org.opensearch.core.tasks.TaskCancelledException: The task is cancelled.
        at ...OpenSearchIndexEnumerator.moveNext(OpenSearchIndexEnumerator.java:122)

[2026-03-25T11:29:34,261][ERROR][o.o.s.p.r.RestPPLQueryAction] Error happened during query handling
Caused by: org.opensearch.core.tasks.TaskCancelledException: The task is cancelled.
        at ...OpenSearchIndexEnumerator.moveNext(OpenSearchIndexEnumerator.java:122)

OSD receives error responses confirming cancellation:

{
  "error": {
    "reason": "Error occurred in OpenSearch engine: The task is cancelled.",
    "type": "TaskCancelledException"
  },
  "status": 500
}

Test plan

  • Unit tests pass (7 tests for cancel route, all existing tests unaffected)
  • E2E tested: submit long-running PPL query, click cancel, verify both data and histogram tasks are cancelled via server logs
  • Both queries cancelled simultaneously — backend confirms TaskCancelledException from cooperative isCancelled() polling in moveNext()
  • Verified compatibility with backend PR sql#5254 (queryId field name, task description format, task action wildcard)

Enable cancellation of in-flight PPL queries from the Discover UI cancel
button by leveraging the OpenSearch task framework.

- Generate a client-side UUID (queryId) for each synchronous PPL query
- Pass queryId through the full request pipeline to OpenSearch PPL API
- Add POST /api/enhancements/ppl/cancel server route that lists PPL tasks
  via _tasks API, matches by queryId in task description, and cancels all
  matching tasks
- Wire the existing Discover abort flow to call the cancel route on abort
- Cancel all matching tasks (data + histogram queries share the same queryId)

Depends on opensearch-project/sql#5254 for backend task registration.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Contributor

❌ Invalid Changelog Heading

The '## Changelog' heading in your PR description is either missing or malformed. Please make sure that your PR description includes a '## Changelog' heading with proper spelling, capitalization, spacing, and Markdown syntax.

@github-actions
Contributor

github-actions Bot commented Mar 25, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 0216993.

| Path | Line | Severity | Description |
| --- | --- | --- | --- |
| src/plugins/query_enhancements/server/routes/ppl_cancel.ts | 33 | medium | The /_tasks endpoint is queried with no ownership check — any authenticated user can trigger this route and receive a list of ALL running PPL tasks across all nodes. Task descriptions may contain sensitive query content from other users, creating an information-disclosure risk. The feature's purpose (cancel your own query) does not require exposing other users' tasks. |
| src/plugins/query_enhancements/server/routes/ppl_cancel.ts | 46 | medium | No authorization check ties a queryId to the requesting user. Any authenticated user who knows (or guesses) another user's queryId can cancel that user's running PPL query. The queryId is a UUID generated client-side and sent over the wire, making it theoretically observable to a network-privileged attacker or via log inspection. |
| src/plugins/query_enhancements/server/routes/ppl_cancel.ts | 44 | low | The user-supplied queryId is interpolated directly into the targetPattern string used for substring matching against task descriptions (targetPattern = `queryId=${queryId}`). A value of empty string or a very short string (e.g. 'a') would match a large number of unrelated tasks and cancel them. The schema only enforces schema.string() with no minimum length or UUID format validation, so crafted inputs could cause unintended mass cancellation. |

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 2 | Low: 1
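
One possible mitigation for the low-severity finding is to reject any queryId that is not a full UUID before it is used for substring matching. The sketch below is illustrative only: `isValidQueryId` is a hypothetical helper, and the thread does not show whether the merged route performs this check.

```typescript
// Textual UUID form: 8-4-4-4-12 hexadecimal digits.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical guard: accept only strings that are complete UUIDs, so an
// empty or one-character queryId can never match unrelated task descriptions.
function isValidQueryId(value: unknown): value is string {
  return typeof value === 'string' && UUID_PATTERN.test(value);
}

console.log(isValidQueryId('ccfec0d5-9113-4489-8bd5-77ff64ae3c7f'));
console.log(isValidQueryId('a'));
```

A format check of this kind narrows the accidental mass-cancellation case only; it does not address the two medium findings about task ownership.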


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Contributor

Failed to generate code suggestions for PR

@github-actions
Contributor

❌ Invalid Prefix For Manual Changeset Creation

Invalid description prefix. Found "feat". Only "skip" entry option is permitted for manual commit of changeset files.

If you were trying to skip the changelog entry, please use the "skip" entry option in the ##Changelog section of your PR description.

@github-actions github-actions Bot added the Skip-Changelog label (PRs that are too trivial to warrant a changelog or release notes entry) and removed the failed changeset label, Mar 25, 2026

} else if (context.body?.queryId) {
  // Cancel synchronous PPL query via task cancellation
  try {
    await http.fetch({

Member

Is it better not to await this request? The frontend can't do much if the cancel fails; its job is to notify the backend to cancel the query, and then it can move on. There is no need to block here in case the PPL cancel API is stuck.

Contributor Author

Good call — updated to fire-and-forget. The cancel request now uses a .catch() chain instead of await, so the AbortError propagates immediately and the UI isn't blocked
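
The fire-and-forget pattern mentioned in this reply can be sketched as follows. This is an illustration under assumptions, not the code from the PR: `CancelFn` is a hypothetical stand-in for the HTTP call to POST /api/enhancements/ppl/cancel, and `cancelInBackground` is a made-up helper name.

```typescript
// Hypothetical stand-in for the HTTP call to the cancel route.
type CancelFn = (queryId: string) => Promise<void>;

// Start the cancel request without awaiting it. A rejection is routed to
// onError (a no-op by default) so it never surfaces as an unhandled rejection.
function cancelInBackground(
  cancel: CancelFn,
  queryId: string,
  onError: (error: unknown) => void = () => {}
): void {
  cancel(queryId).catch(onError);
}

// Usage: even a cancel call that never settles does not block the caller.
const neverSettles: CancelFn = () => new Promise<void>(() => {});
cancelInBackground(neverSettles, 'ccfec0d5-9113-4489-8bd5-77ff64ae3c7f');
console.log('abort handling continues immediately');
```

The trade-off is that the caller never learns whether the cancel succeeded, which matches the reviewer's point that the frontend cannot act on a failure anyway.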

method: 'GET',
path: '/_tasks',
querystring: {
  actions: '*ppl*',

Member

can it be PPL * or something more specific, like searching for queryId?

Contributor Author

@ahkcs ahkcs Mar 25, 2026

The _tasks API doesn't support filtering by description or queryId — it only supports taskId, actions and parentTaskId. The queryId is embedded in the task description string (e.g. PPL [queryId=]: ...), so we have to list PPL tasks with detailed=true and match client-side. This is the only approach available without a backend change to expose a cancel-by-queryId endpoint.
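
The list-then-match approach described in this reply could be sketched like this. It is a minimal illustration, not the route's actual code: `findTaskIdsByQueryId` is a hypothetical helper, the response type covers only the fields used here, and the sample task descriptions are assumed (the thread shows only the `PPL [queryId=...]` prefix, and the task IDs are borrowed from the logs above).

```typescript
// Subset of the GET /_tasks?detailed=true response (grouped by node, the default).
interface TaskInfo {
  action: string;
  description?: string;
}
interface TasksResponse {
  nodes: Record<string, { tasks: Record<string, TaskInfo> }>;
}

// Collect the IDs of tasks whose description contains "queryId=<id>".
function findTaskIdsByQueryId(response: TasksResponse, queryId: string): string[] {
  const target = `queryId=${queryId}`;
  const matches: string[] = [];
  for (const nodeId of Object.keys(response.nodes)) {
    const tasks = response.nodes[nodeId].tasks;
    for (const taskId of Object.keys(tasks)) {
      const description = tasks[taskId].description ?? '';
      if (description.indexOf(target) !== -1) {
        matches.push(taskId);
      }
    }
  }
  return matches;
}

// Sample data: the description text after the prefix is an assumption.
const sample: TasksResponse = {
  nodes: {
    'cwL2x0XeS-i-g-kClRr0ew': {
      tasks: {
        'cwL2x0XeS-i-g-kClRr0ew:2594': {
          action: 'cluster:admin/opensearch/ppl',
          description: 'PPL [queryId=ccfec0d5-9113-4489-8bd5-77ff64ae3c7f]: source = logs',
        },
        'cwL2x0XeS-i-g-kClRr0ew:2593': {
          action: 'cluster:admin/opensearch/ppl',
          description: 'PPL [queryId=0f863cc9-aab3-4589-9e88-191268f9f375]: source = logs',
        },
      },
    },
  },
};

// Each matching task would then be cancelled via POST /_tasks/{taskId}/_cancel.
const cancelPaths = findTaskIdsByQueryId(sample, 'ccfec0d5-9113-4489-8bd5-77ff64ae3c7f')
  .map((taskId) => `/_tasks/${taskId}/_cancel`);
console.log(cancelPaths);
```

Since the match is a plain substring test, it is only as specific as the queryId passed in, which is why a full UUID matters here.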

Remove await from the cancel HTTP call so the AbortError propagates
immediately. The cancel is a best-effort notification to the backend
and should not block the frontend from moving on.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

Use 'cluster:admin/opensearch/ppl' instead of '*ppl*' to avoid
matching unrelated tasks. The _tasks API does not support filtering
by description or queryId, so client-side matching remains necessary.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

The exact action name is less resilient to future changes. The wildcard
is sufficient since we still match by queryId in the task description.

Signed-off-by: Kai Huang <ahkcs@amazon.com>

@github-actions
Contributor

github-actions Bot commented Mar 25, 2026

✅ All unit and integration tests passing

🔗 Workflow run · commit 95a6f6dce32f9b24171943395981a205e33d4c1a

@mengweieric
Collaborator

mengweieric commented Mar 26, 2026

General question:

{
  "error": {
    "reason": "Error occurred in OpenSearch engine: The task is cancelled.",
    "type": "TaskCancelledException"
  },
  "status": 500
}

Should the API return a 4xx error code instead of 500? This is not a backend error at all.

@ahkcs
Contributor Author

ahkcs commented Mar 26, 2026

Should the API return a 4xx error code instead of 500? This is not a backend error at all.

Good call. I will open a follow-up PR on the backend side to change the classification of TaskCancelledException to 4xx

@ahkcs
Contributor Author

ahkcs commented Mar 26, 2026

opensearch-project/sql#5273
Opened PR on the backend side to change the classification of TaskCancelledException
cc @mengweieric

Updated error message:

server log [19:02:02.468] [error][plugins][queryEnhancements] pplSearchStrategy: {
  "error": {
    "reason": "Query cancelled",
    "details": "The task is cancelled.",
    "type": "TaskCancelledException"
  },
  "status": 400
}
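
Both the original 500 response and the updated 400 response carry the same `type`, so a caller that wants to treat a cancellation differently from a real failure could key on that field rather than on the status code. The sketch below assumes exactly that; `PplErrorResponse` and `isQueryCancelled` are hypothetical names, and the thread does not show how Dashboards itself handles the response.

```typescript
// Shape of the error payloads quoted in this PR; `details` appears only in the updated one.
interface PplErrorResponse {
  error: { reason: string; details?: string; type: string };
  status: number;
}

// Hypothetical helper: recognise a cancellation regardless of the HTTP status.
function isQueryCancelled(response: PplErrorResponse): boolean {
  return response.error.type === 'TaskCancelledException';
}

const before: PplErrorResponse = {
  error: {
    reason: 'Error occurred in OpenSearch engine: The task is cancelled.',
    type: 'TaskCancelledException',
  },
  status: 500,
};
const after: PplErrorResponse = {
  error: {
    reason: 'Query cancelled',
    details: 'The task is cancelled.',
    type: 'TaskCancelledException',
  },
  status: 400,
};
console.log(isQueryCancelled(before), isQueryCancelled(after));
```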

@ahkcs
Contributor Author

ahkcs commented Mar 26, 2026

CI failure seems to be unrelated to our changes — this is an Explore app UI issue (autocomplete widget visibility)

Example:

https://github.com/opensearch-project/OpenSearch-Dashboards/actions/runs/23563713788/job/68611665458

@mengweieric mengweieric merged commit 494ecc7 into opensearch-project:main Mar 26, 2026
203 of 209 checks passed
@codecov

codecov Bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (7782a80) to head (95a6f6d).
⚠️ Report is 22 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #11593       +/-   ##
===========================================
- Coverage   61.58%        0   -61.59%     
===========================================
  Files        4995        0     -4995     
  Lines      137542        0   -137542     
  Branches    23901        0    -23901     
===========================================
- Hits        84707        0    -84707     
+ Misses      46692        0    -46692     
+ Partials     6143        0     -6143     
Flag Coverage Δ
Linux_1 ?
Linux_2 ?
Linux_3 ?
Linux_4 ?
Linux_5 ?


@ahkcs ahkcs deleted the feature/ppl-query-cancellation branch March 26, 2026 23:23

Labels

repeat-contributor
Skip-Changelog (PRs that are too trivial to warrant a changelog or release notes entry)

3 participants