-
Notifications
You must be signed in to change notification settings - Fork 69
【OSCP Dev】SCDB 支持 explain statement Feat:support Explain statement for SCDB(#478) #675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
All contributors have signed the CLA ✍️ ✅ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @zhuyanhuazhuyanhua, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've implemented support for the EXPLAIN SQL statement within SCDB. This feature allows users to inspect the execution plan of their queries, providing insights into how the system processes their requests. The changes involve integrating EXPLAIN into the query execution flow, from parsing to execution, and returning the execution graph information.
Highlights
- EXPLAIN Statement Support: I've added the capability for SCDB to process and respond to EXPLAIN SQL statements, enabling users to understand query execution plans.
- Executor Integration: The executorBuilder now recognizes core.Explain plans and dispatches them to a new buildExplain method, ensuring EXPLAIN queries are correctly routed within the execution engine.
- ExplainExec Implementation: A new ExplainExec executor has been introduced. This executor is responsible for compiling the target query plan using the interpreter and extracting the execution graph in DOT format.
- Server-Side Query Handling: The submitAndGet handler in the SCDB server now identifies EXPLAIN queries at an early stage and directs them to a dedicated handleExplainQuery function, which then uses the interpreter to generate and return the execution plan.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for the EXPLAIN
statement. The implementation introduces a special handling path for EXPLAIN
queries in submit_and_get_handler.go
, which compiles the query and returns the execution graph without going through the full executor flow. While this approach works, I've found a critical issue with a new but apparently unused ExplainExec
executor that is broken and will cause a panic if ever executed. I've also suggested a minor refactoring to reduce code duplication in the server handler. My main recommendation is to remove the dead and broken executor code to improve the codebase's health.
pkg/scdb/executor/explain.go
Outdated
func (e *ExplainExec) Next(ctx context.Context, req *chunk.Chunk) (err error) { | ||
if e.done { | ||
return nil | ||
} | ||
e.done = true | ||
interpreterInstance := interpreter.NewInterpreter() | ||
|
||
// 调用解释器编译,这里需要根据 Compile 方法的实际参数进行调整 | ||
// 假设 Compile 方法需要的参数可以从上下文或者其他地方获取,以下为示例 | ||
// 实际使用时需要根据真实情况补充完整参数 | ||
compiledPlan, err := interpreterInstance.Compile(ctx, nil) // 这里的参数需要根据实际情况修改 | ||
if err != nil { | ||
return err | ||
} | ||
|
||
if compiledPlan.Explain == nil || compiledPlan.Explain.ExeGraphDot == "" { | ||
return nil | ||
} | ||
dotInfo := compiledPlan.Explain.ExeGraphDot | ||
|
||
req.Reset() | ||
|
||
if req.NumCols() == 0 { | ||
return fmt.Errorf("chunk has no columns") | ||
} | ||
|
||
col := req.Column(0) | ||
col.AppendString(dotInfo) | ||
|
||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Next
method for ExplainExec
is currently implemented in a way that will cause a panic. The call interpreterInstance.Compile(ctx, nil)
passes a nil
request, which will lead to a nil pointer dereference inside the Compile
function. The comment // 这里的参数需要根据实际情况修改
also indicates this is incomplete.
Furthermore, due to the special handling of EXPLAIN
queries in pkg/scdb/server/submit_and_get_handler.go
, this ExplainExec
executor appears to be dead code. The query is intercepted and handled by handleExplainQuery
before it would ever reach the executor building stage for this type of plan.
This broken and unused code should be removed to avoid confusion and prevent potential bugs if the query handling logic changes in the future. The related changes in pkg/scdb/executor/builder.go
that add the build logic for Explain
should also be removed along with this file.
func (app *App) handleExplainQuery(ctx context.Context, s *session, req *scql.SCDBQueryRequest) *scql.SCDBQueryResultResponse { | ||
compileReq, err := app.buildCompileRequest(ctx, s) | ||
if err != nil { | ||
return newErrorSCDBQueryResultResponse(scql.Code_INTERNAL, err.Error()) | ||
} | ||
|
||
intrpr := interpreter.NewInterpreter() | ||
compiledPlan, err := intrpr.Compile(ctx, compileReq) | ||
if err != nil { | ||
return newErrorSCDBQueryResultResponse(scql.Code_INTERNAL, err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for building the compile request and compiling the plan in this function is very similar to the beginning of the runDQL
function. To avoid code duplication and improve maintainability, consider extracting this common logic into a separate helper method. This new method could take ctx
and session
as arguments and return the compiledPlan
and an error
. Both handleExplainQuery
and runDQL
could then call this helper method.
For example:
func (app *App) compileQuery(ctx context.Context, s *session) (*pb.CompiledPlan, error) {
compileReq, err := app.buildCompileRequest(ctx, s)
if err != nil {
return nil, err
}
intrpr := interpreter.NewInterpreter()
return intrpr.Compile(ctx, compileReq)
}
Then handleExplainQuery
would become:
func (app *App) handleExplainQuery(ctx context.Context, s *session, req *scql.SCDBQueryRequest) *scql.SCDBQueryResultResponse {
compiledPlan, err := app.compileQuery(ctx, s)
if err != nil {
return newErrorSCDBQueryResultResponse(scql.Code_INTERNAL, err.Error())
}
// ...
}
recheck |
if strings.HasPrefix(strings.ToUpper(req.Query), "EXPLAIN") { | ||
return app.handleExplainQuery(ctx, session, req) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: It is better to handle the EXPLAIN statement within the runSQL function, that will provide a more consistent approach.
pkg/scdb/server/submit_handler.go
Outdated
} | ||
}() | ||
|
||
if strings.HasPrefix(strings.ToUpper(s.request.Query), "EXPLAIN") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
接之前的 comment,通过 runSQL 执行,并不是表示把实现挪到这里,而是最终通过 runSQL 最后一行的 scdbexecutor.Run() 执行,也就是会调用到 pkg/scdb/executor/explain.go 里的 Next 函数执行。这样非 DQL 都通过 scdb executor 执行,实现和其他非 DQL 执行的统一。
方案1:在pkg/scdb/executor中实现 与show语句保持一致的架构 需要在执行器内部调用interpreter进行编译 |
方案2:作为特殊的DQL处理 在runDQL路径中添加explain的特殊处理 |
请问有相关测试代码吗 |
测试函数 package server
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestIsExplainQuery(t *testing.T) {
tests := []struct {
name string
query string
expected bool
hasError bool
}{
{
name: "Valid EXPLAIN query",
query: "EXPLAIN SELECT * FROM users",
expected: true,
hasError: false,
},
{
name: "Valid EXPLAIN query in lowercase",
query: "explain select * from users",
expected: true,
hasError: false,
},
{
name: "Valid EXPLAIN query with whitespace",
query: " EXPLAIN SELECT * FROM users ",
expected: true,
hasError: false,
},
{
name: "Non-EXPLAIN query",
query: "SELECT * FROM users",
expected: false,
hasError: false,
},
{
name: "INSERT query",
query: "INSERT INTO users (name) VALUES ('alice')",
expected: false,
hasError: false,
},
{
name: "Empty query",
query: "",
expected: true,
hasError: true,
},
{
name: "EXPLAIN with complex query",
query: "EXPLAIN SELECT u.name, p.title FROM users u JOIN posts p ON u.id = p.user_id WHERE u.age > 18",
expected: true,
hasError: false,
},
{
name: "EXPLAIN ANALYZE query",
query: "EXPLAIN ANALYZE SELECT * FROM users",
expected: true,
hasError: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result, err := isExplainQuery(tt.query)
if tt.hasError {
assert.Error(t, err)
} else {
assert.NoError(t, err)
assert.Equal(t, tt.expected, result)
}
})
}
} 结果 === RUN TestIsExplainQuery
=== RUN TestIsExplainQuery/Valid_EXPLAIN_query
--- PASS: TestIsExplainQuery/Valid_EXPLAIN_query (0.00s)
=== RUN TestIsExplainQuery/Valid_EXPLAIN_query_in_lowercase
--- PASS: TestIsExplainQuery/Valid_EXPLAIN_query_in_lowercase (0.00s)
=== RUN TestIsExplainQuery/Valid_EXPLAIN_query_with_whitespace
--- PASS: TestIsExplainQuery/Valid_EXPLAIN_query_with_whitespace (0.00s)
=== RUN TestIsExplainQuery/Non-EXPLAIN_query
--- PASS: TestIsExplainQuery/Non-EXPLAIN_query (0.00s)
=== RUN TestIsExplainQuery/INSERT_query
--- PASS: TestIsExplainQuery/INSERT_query (0.00s)
=== RUN TestIsExplainQuery/Empty_query
--- PASS: TestIsExplainQuery/Empty_query (0.00s)
=== RUN TestIsExplainQuery/EXPLAIN_with_complex_query
--- PASS: TestIsExplainQuery/EXPLAIN_with_complex_query (0.00s)
=== RUN TestIsExplainQuery/EXPLAIN_ANALYZE_query
--- PASS: TestIsExplainQuery/EXPLAIN_ANALYZE_query (0.00s)
--- PASS: TestIsExplainQuery (0.00s)
PASS
ok github.com/secretflow/scql/pkg/scdb/server 0.795s |
直接把测试代码也提交上来吧,相关测试也是本次 OSCP 任务要求的一部分~ |
请补充测试代码 |
"mysql_query": "select ta.plain_int_0, percent_rank() over(partition by ta.plain_int_0 order by ta.plain_float_0) as num from alice.tbl_0 as ta;" | ||
}, | ||
{ | ||
"name": "explain percent_rank", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
直接加在 run sql 的回测用例里不合适。
请增加与 TestSCDBWithNormalCCL 同级的测试函数。
No description provided.