-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Add baseline execution stats to WindowAggExec and UnionExec, and fixup CoalescePartitionsExec
#1018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixup tests and implelementation
| ); | ||
| assert_metrics!( | ||
| &formatted, | ||
| "LocalLimitExec: limit=3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by making the query more complicated, it has also introduced a LocalLimitExec for testing 🎉
| if partition < input.output_partitioning().partition_count() { | ||
| return input.execute(partition).await; | ||
| let stream = input.execute(partition).await?; | ||
| drop(timer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this drop needed if the next line contains a return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The drop is needed to satisify the borrow checker: let timer = baseline_metrics.elapsed_compute().timer(); has borrowed from baseline_metrics and rust won't allow baseline metrics to be given to ObserverdStream::new while borrowed.
An alternative is to clone the elapsed compute
let elapsed_compute = baseline_metrics.elapsed_compute().clone();
let timer = elapsed_compute.timer();Which is done in several other parts of this PR. 🤔
Perhaps that would be better to keep the code consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in c0b656c
Dandandan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, looking forward to using those!
|
I think something unrelated to this PR is causing the tests to fail. I will look into it later today or tomorrow if no one else gets around to it |
Looks to me just the test in |
|
Thanks 👍 |
Which issue does this PR close?
Finally 🎉 closes #866 (following the same model as #1004).
There are still some operators like Parquet, CSV, Avro, and Json sources that are not instrumented, but I don't have time to devote to intrumenting them now, #1019 tracks that work
Rationale for this change
We want basic understanding of where a plan's time is spent and in what operators. See #866 for more details
What changes are included in this PR?
WindowAggExecandUnionExec, using the API from Add BaselineMetrics, Timestamp metrics, add forCoalescePartitionsExec, rename output_time -> elapsed_compute #909CoalescePartitionsExecso it reportselapsed_computeAre there any user-facing changes?
More fields in
EXPLAIN ANALYZEare now filled outExample of how explain analyze is looking (dense but packed with good info). I find it quite cool that DataFusion can even plan and execute such queries.