-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-34852: [C++][Go][Java][FlightRPC] Add support for ordered data #35178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
An integration test is added. |
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a few minor comments.
docs/source/format/Flight.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe to be clear, leave in an "Otherwise, there is no ordering defined on endpoints or the data within"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK!
format/Flight.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared to the wording in Flight.rst, it is not as clear what this is trying to say
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've copied some missing descriptions from Flight.rst.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can put together Java support too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow! Thanks!
Could you push it to this branch when you work on it? Or you can create another branch for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we be asserting that this is true here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it may be a more straightforward approach.
I'll use the approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQLite doesn't tell us whether the result set is sorted or not, so I suppose technically this should always be 'true'? (That said, this server also only ever returns one endpoint so it's moot either way.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use true here because this flight server always returns one endpoint.
But I think that false is better here because false means "unknown order".
A real Flight SQL server will set true here only when it analyzes the given query and detects ORDER BY in a main SELECT.
How about adding a comment something like "TODO: Set ordered to true only when the query has ORDER BY" here?
cpp/src/arrow/flight/types.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with flight so maybe a dumb question: does the user know which order it is? Like ascending/descending and what are the sorted columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, but no, this is just a semantic flag to indicate that (unlike the current status) there is some ordering on the dataset. It's still up to the application to define what that means.
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
|
@github-actions crossbow submit preview-docs |
|
Revision: 4c2b78c Submitted crossbow builds: ursacomputing/crossbow @ actions-0d6f29303c
|
| consumption; this is up to the application to implement. | ||
| consumption; this is up to the application to implement. The client | ||
| can also use ``FlightInfo.ordered``. See the previous item for | ||
| details of ``FlightInfo.ordered``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to update this sentence...
If anyone has a suggestion for wording, please share it.
|
@github-actions crossbow submit preview-docs |
|
Revision: 2cc8304 Submitted crossbow builds: ursacomputing/crossbow @ actions-4c9d228232
|
|
The vote thread: https://lists.apache.org/thread/gnxv7cm5l01zgfos7mw52gr2337qofln The vote result (passed): https://lists.apache.org/thread/v05hxvffc118qtd4z4yqclryb7qslgfv I'll merge this. |
|
Thank you @kou |
|
Benchmark runs are scheduled for baseline = aa8ffbe and contender = b73ddc3. b73ddc3 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
|
['Python', 'R'] benchmarks have high level of regressions. |
…ta (apache#35178) ### Rationale for this change No ordering is unnecessarily limiting. Systems can and do implement distributed sorts, but they can’t reflect this in Flight RPC. ### What changes are included in this PR? These changes add `FlightInfo.ordered`. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. **This PR includes breaking changes to public APIs.** * Closes: apache#34852 * Closes: apache#35085 Lead-authored-by: Sutou Kouhei <kou@clear-code.com> Co-authored-by: David Li <li.davidm96@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ta (apache#35178) ### Rationale for this change No ordering is unnecessarily limiting. Systems can and do implement distributed sorts, but they can’t reflect this in Flight RPC. ### What changes are included in this PR? These changes add `FlightInfo.ordered`. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. **This PR includes breaking changes to public APIs.** * Closes: apache#34852 * Closes: apache#35085 Lead-authored-by: Sutou Kouhei <kou@clear-code.com> Co-authored-by: David Li <li.davidm96@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ta (apache#35178) ### Rationale for this change No ordering is unnecessarily limiting. Systems can and do implement distributed sorts, but they can’t reflect this in Flight RPC. ### What changes are included in this PR? These changes add `FlightInfo.ordered`. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. **This PR includes breaking changes to public APIs.** * Closes: apache#34852 * Closes: apache#35085 Lead-authored-by: Sutou Kouhei <kou@clear-code.com> Co-authored-by: David Li <li.davidm96@gmail.com> Co-authored-by: Sutou Kouhei <kou@cozmixng.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Rationale for this change
No ordering is unnecessarily limiting. Systems can and do implement distributed sorts, but they can’t reflect this in Flight RPC.
What changes are included in this PR?
These changes add
FlightInfo.ordered.Are these changes tested?
Yes.
Are there any user-facing changes?
Yes.
This PR includes breaking changes to public APIs.