-
Notifications
You must be signed in to change notification settings - Fork 995
[KYUUBI #7323] Support timeout at PlanOnlyMode #7324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -17,6 +17,8 @@ | |
|
|
||
| package org.apache.kyuubi.engine.spark | ||
|
|
||
| import java.sql.SQLTimeoutException | ||
|
|
||
| import org.apache.kyuubi.WithKyuubiServer | ||
| import org.apache.kyuubi.config.KyuubiConf | ||
| import org.apache.kyuubi.config.KyuubiConf._ | ||
|
|
@@ -149,5 +151,38 @@ class SparkSqlEngineSuite extends WithKyuubiServer with HiveJDBCTestHelper { | |
| } | ||
| } | ||
|
|
||
| test("KYUUBI-7323: Support timeout at PlanOnlyMode") { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Kyuubi uses GitHub Issues instead of JIRA, use Also, we don't require creating an issue before sending a PR, GitHub Issues and PRs share the sequence number, and the link can jump to each other, so
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, please forgive me for mistakenly copying the template from another test method above. And I removed this test case here, since I find that the added test method actually fails to verify whether the engine has truly cancel operation.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Has added the solution at issue. However, I don't fully understand why interrupting the thread can stop the analysis. There might be thread state detection logic during plan analysis, but I haven't found the specific code. |
||
| val tableName = "t_timeout" | ||
| val sessionConfMap = Map("kyuubi.operation.plan.only.mode=analyze" -> "analyze") | ||
| withSessionConf(sessionConfMap)(Map.empty)(Map.empty) { | ||
|
|
||
| withJdbcStatement(tableName) { statement => | ||
| statement.setQueryTimeout(2) | ||
| val layers = 30 | ||
|
|
||
| val cteBuilder = new StringBuilder() | ||
| cteBuilder.append("WITH p AS (SELECT dt, tid, count(1) as pv, DENSE_RANK() " + | ||
| "over(order by dt) as dt_rank FROM t GROUP BY 1, 2), ") | ||
| cteBuilder.append("test_d1 AS (SELECT dt, tid FROM (SELECT dt, tid, row_number() " + | ||
| "over(order by pv desc) as r FROM p WHERE dt_rank=1) WHERE r<=100)") | ||
|
|
||
| for (i <- 2 to layers) { | ||
| cteBuilder.append(s", test_d$i AS ( " + | ||
| s"SELECT dt, tid FROM (SELECT p.dt, p.tid, row_number() over(order by pv desc) as r " + | ||
| s"FROM p LEFT JOIN test_d${i - 1} prev ON p.tid = prev.tid " + | ||
| s"WHERE p.dt_rank=$i AND prev.tid IS NULL) WHERE r<=100 " + | ||
| s"UNION ALL SELECT dt, tid FROM test_d${i - 1})") | ||
| } | ||
|
|
||
| val complexSql = cteBuilder.toString() + s" SELECT * FROM test_d$layers" | ||
|
|
||
| val e = intercept[SQLTimeoutException] { | ||
| statement.executeQuery(complexSql) | ||
| } | ||
| assert(e.getMessage.equals("Query timed out after 2 seconds")) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| override protected def jdbcUrl: String = getJdbcUrl | ||
| } | ||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this creates a new thread for each operation, it's too expensive. let's follow the idea of #7121 to create a global ThreadScheduledExecutor shared by all operations.
Q: why plan-only stmt was not covered by that? I suppose it should be, though I didn't take a deeper look yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tes, this is expensive. I will add a new method
at
OperationManagerto deal with this.And I had also assumed that plan-only statements could respond to timeouts, but I discovered that they cannot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our production environment, we discovered that in the current 1.9.x version, when setting a timeout in plan-only mode, although the session in the Kyuubi server has timed out and closed, the driver in the Spark engine continues to parse and execute the plan. This causes memory to remain unavailable.