-
Notifications
You must be signed in to change notification settings - Fork 130
Operator Updates #4725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Operator Updates #4725
Conversation
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Benchmarks: Random AccessSummary
|
Benchmarks: TPC-H SF=1 on NVMESummary
Detailed Results Table
|
Benchmarks: TPC-H SF=1 on S3Summary
Detailed Results Table
|
Benchmarks: CompressionSummary
Detailed Results Table
|
Benchmarks: TPC-H SF=10 on NVMESummary
Detailed Results Table
|
Benchmarks: Statistical and Population GeneticsSummary
Detailed Results Table
|
Benchmarks: TPC-H SF=10 on S3Summary
Detailed Results Table
|
|
@claude review |
This comment was marked as resolved.
This comment was marked as resolved.
Benchmarks: Clickbench on NVMESummary
Detailed Results Table
|
joseph-isaacs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤞
Benchmarks: TPC-DS SF=1 on NVMESummary
Detailed Results Table
|
|
I'm finding it a bit hard to grok the API change (Edit: after reading a bit more I think I understand a bit better, but I'm going to write this anyways. Edit Edit: And since this is merged this is more of a discussion question for later). I may have missed this when reading over everything, but what I think I expected to see at a high level was:
So I think I can see all of these things here scattered across the codebase, but I'm wondering if it is possible to structure the code in such a way that the 3 things above are essentially in 1 place. Yesterday I said that it makes sense to allow the plan to be "baked in" to our For example, I'm thinking about if I wanted to link up Vortex to some future query optimizer. I would want to be able to easily translate a logical decompression "plan" from Vortex into whatever IR that optimizer takes as input (could be substrait, for all we know). I think it would be ideal if there is a clear and obvious way to pass these operator trees: is that the case here? (Just to be clear I don't fully understand this change so maybe that is the case already...) And then on the other side, is it possible for that optimizer to then spit out a physical plan where we take everything in that plan that is below a join and a shuffle (pipelines) and translate that into our own execution plan. Is there a clear and obvious way we would do that? TLDR; would it be beneficial to have a clear IR for both the operator tree and the execution plan that would be distinct from each other? My understanding (which could be very wrong) is that this changes somewhat conflates the two (because both concepts seem to be methods dispatched via trait implementations), and I'm not sure if that is the right abstraction. |
This PR changes the API a little for operators as well as creates an executor that performs CSE and pipeline conversion.
It breaks many of the old operator tests and benchmarks, many of which are commented out. But to avoid long-lived feature branch @joseph-isaacs and I will continue to merge onto develop.
Execution via operators is gated on a runtime environment variable for local development until we get this prototype fleshed out and performant!