Skip to content

Conversation

@kou
Copy link
Member

@kou kou commented Jun 9, 2023

Rationale for this change

Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever.

What changes are included in this PR?

Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests.

This is not a full retry protocol.

Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction:

  • CancelFlightInfo: Asynchronously cancel the execution of a distributed query. (Replaces the equivalent Flight SQL action.)
  • RefreshFlightEndpoint: Request an extension of the expiration of a FlightEndpoint.

This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions.

Backward compatibility

Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely.

If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server).

A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature.

Are these changes tested?

Yes.

Are there any user-facing changes?

Yes.

@github-actions
Copy link

github-actions bot commented Jun 9, 2023

⚠️ GitHub issue #35500 has been automatically assigned in GitHub to PR creator.

@kou
Copy link
Member Author

kou commented Jun 9, 2023

It's not completed yet. See TODO list in the description.
The C++ implementation only exists for now.
I'll add the Go implementation and update documents then clean up the current C++ implementation.
I think that we need to consider how to keep backward compatibility in Flight SQL API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can work on Java next week

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting committer review Awaiting committer review awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jun 9, 2023
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to remove Action:

Suggested change
message ActionCancelFlightInfoResult {
message CancelFlightInfoResult {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I've removed Action.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to remove Cancel:

Suggested change
enum CancelResult {
enum Result {

Or we may want to move this to top-level and return this enum directly instead of wrapping by ActionCancelFlightInfoResult message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in Protobuf/gRPC itself, you can only return messages and not bare enums

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CANCEL_RESULT_ prefix may be redundant.

Suggested change
CANCEL_RESULT_UNSPECIFIED = 0;
UNSPECIFIED = 0;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use prefix.

https://protobuf.dev/programming-guides/style/#enums

Prefer prefixing enum values instead of surrounding them in an enclosing message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the reason (in case you didn't find it) is that enums in Protobuf use C-style namespacing, which is carried over into generated code. So it will end up as ActionCancelFlightInfoResult::[CANCEL_RESULT_]UNSPECIFIED, and so if you ever need a second enum, you need to prefix the name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved CancelResult to top-level to reduce ActionCancelFlightInfoResult::... prefix.
Is it OK?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use CancelStatus or something instead of CancelResult to use Result in CancelResult and ActionCancelFlightInfoResult?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, to avoid confusion with Arrow Result?

I guess Arrow also uses Status, so both are the same to me. If using CancelStatus is clearer to you, then it sounds good to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, to avoid confusion with Arrow Result?

No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, we have message Result in Flight.proto. It may confuse us with Arrow Result...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, we have message Result in Flight.proto. It may confuse us with Arrow Result...

Yes, the Flight one actually happened to come first, but it's unfortunate...

No. I just want to avoid Result duplication in CancelFlightInfoResult and CancelResult.

Ok. I think status is also good then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

@github-actions github-actions bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jun 10, 2023
Copy link
Member

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kou would you like me contribute the Go implementation? Or do you have that in the works? I have the time to do so if you need / want the help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me too

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Jun 13, 2023
@kou
Copy link
Member Author

kou commented Jun 14, 2023

@zeroshade Thanks for the offer! I have the Go implementation in local and I just need to implement one more integration test to cover all cases implemented in the C++ implementation. I'll be able to complete it and push to this branch today. But the Go implementation is needed to be cleaned up. Please review and/or push improvements to this branch after I push the Go implementation.

@github-actions github-actions bot added Component: Go awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 14, 2023
@kou kou force-pushed the flight-result-set-expiration branch from ec1d2ad to c2f2f05 Compare June 26, 2023 09:14
@kou
Copy link
Member Author

kou commented Jul 3, 2023

The vote carried: https://lists.apache.org/thread/no26s310qn3v0n5x830d50k598fh0pvr

I'll merge this.

@kou kou merged commit 0b7bd74 into apache:main Jul 3, 2023
@kou kou deleted the flight-result-set-expiration branch July 3, 2023 00:39
@kou kou removed the awaiting change review Awaiting change review label Jul 3, 2023
@conbench-apache-arrow
Copy link

Conbench analyzed the 6 benchmark runs on commit 0b7bd74d.

There were 7 benchmark results indicating a performance regression:

The full Conbench report has more details.

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
… expiration (apache#36009)

### Rationale for this change

Currently, it is undefined whether a client can call DoGet more than once. Clients may want to retry requests, and servers may not want to persist a query result forever.

### What changes are included in this PR?

Add an expiration time to FlightEndpoint. If present, clients may assume they can retry DoGet requests. Otherwise, clients should avoid retrying DoGet requests.

This is not a full retry protocol.

Also, add "pre-defined" actions to Flight RPC for working with result sets. These are pre-defined Protobuf messages with standardized encodings for use with DoAction:

  * CancelFlightInfo: Asynchronously cancel the execution of a distributed query. (Replaces the equivalent Flight SQL action.)
  * RefreshFlightEndpoint: Request an extension of the expiration of a FlightEndpoint.

This lets the ADBC/JDBC/ODBC drivers for Flight SQL explicitly manage result set lifetimes. These can be used with Flight SQL as regular actions.

#### Backward compatibility

Flight SQL's CancelQuery is deprecated by Flight RPC's CancelFlightInfo. But some clients may not be able to migrate to CancelFlightInfo entirely.

If a client can assume that a server requires 13.0.0 or later, the client can always use CancelFlightInfo. Otherwise, the client need to use CancelQuery (for old server) and/or CancelFlightInfo (for newer server).

A server needs to implement only CancelFlightInfo. Because Flight SQL server libraries provide the default CancelQuery implementation that delegates to CancelFlightInfo. Clients can use both of Flight SQL's CancelQuery and Flight RPC's CancelFLightInfo by this feature.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

* Closes: apache#35500

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++][Go][Java][FlightRPC] Add support for result set expiration

3 participants