-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-35500: [C++][Go][Java][FlightRPC] Add support for result set expiration #36009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
It's not completed yet. See TODO list in the description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can work on Java next week
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
format/Flight.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to remove Action:
| message ActionCancelFlightInfoResult { | |
| message CancelFlightInfoResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 from me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 from me too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I've removed Action.
format/Flight.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to remove Cancel:
| enum CancelResult { | |
| enum Result { |
Or we may want to move this to top-level and return this enum directly instead of wrapping by ActionCancelFlightInfoResult message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least in Protobuf/gRPC itself, you can only return messages and not bare enums
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh...
format/Flight.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CANCEL_RESULT_ prefix may be redundant.
| CANCEL_RESULT_UNSPECIFIED = 0; | |
| UNSPECIFIED = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use prefix.
https://protobuf.dev/programming-guides/style/#enums
Prefer prefixing enum values instead of surrounding them in an enclosing message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the reason (in case you didn't find it) is that enums in Protobuf use C-style namespacing, which is carried over into generated code. So it will end up as ActionCancelFlightInfoResult::[CANCEL_RESULT_]UNSPECIFIED, and so if you ever need a second enum, you need to prefix the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved CancelResult to top-level to reduce ActionCancelFlightInfoResult::... prefix.
Is it OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use CancelStatus or something instead of CancelResult to use Result in CancelResult and ActionCancelFlightInfoResult?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, to avoid confusion with Arrow Result?
I guess Arrow also uses Status, so both are the same to me. If using CancelStatus is clearer to you, then it sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, to avoid confusion with Arrow
Result?
No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, we have message Result in Flight.proto. It may confuse us with Arrow Result...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, we have
message ResultinFlight.proto. It may confuse us with ArrowResult...
Yes, the Flight one actually happened to come first, but it's unfortunate...
No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.
Ok. I think status is also good then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK!
zeroshade
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou would you like me contribute the Go implementation? Or do you have that in the works? I have the time to do so if you need / want the help.
format/Flight.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 from me too
|
@zeroshade Thanks for the offer! I have the Go implementation in local and I just need to implement one more integration test to cover all cases implemented in the C++ implementation. I'll be able to complete it and push to this branch today. But the Go implementation is needed to be cleaned up. Please review and/or push improvements to this branch after I push the Go implementation. |
ec1d2ad to
c2f2f05
Compare
|
The vote carried: https://lists.apache.org/thread/no26s310qn3v0n5x830d50k598fh0pvr I'll merge this. |
|
Conbench analyzed the 6 benchmark runs on commit There were 7 benchmark results indicating a performance regression:
The full Conbench report has more details. |
… expiration (apache#36009) ### Rationale for this change Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever. ### What changes are included in this PR? Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests. This is not a full retry protocol. Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction: * CancelFlightInfo: Asynchronously cancel the execution of a distributed query. (Replaces the equivalent Flight SQL action.) * RefreshFlightEndpoint: Request an extension of the expiration of a FlightEndpoint. This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions. #### Backward compatibility Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely. If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server). A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * Closes: apache#35500 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Rationale for this change
Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever.
What changes are included in this PR?
Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests.
This is not a full retry protocol.
Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction:
This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions.
Backward compatibility
Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely.
If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server).
A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature.
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes.