Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FlightSQL (CLI) & Flight ignore endpoint locations #6487

Open
crepererum opened this issue Oct 1, 2024 · 0 comments
Open

FlightSQL (CLI) & Flight ignore endpoint locations #6487

crepererum opened this issue Oct 1, 2024 · 0 comments
Labels
arrow-flight Changes to the arrow-flight crate bug

Comments

@crepererum
Copy link
Contributor

Describe the bug
The FlightSQL CLI iterates over the endpoints in FlightInfo:

async fn execute_flight(
client: &mut FlightSqlServiceClient<Channel>,
info: FlightInfo,
) -> Result<Vec<RecordBatch>> {
let schema = Arc::new(Schema::try_from(info.clone()).context("valid schema")?);
let mut batches = Vec::with_capacity(info.endpoint.len() + 1);
batches.push(RecordBatch::new_empty(schema));
info!("decoded schema");
for endpoint in info.endpoint {
let Some(ticket) = &endpoint.ticket else {
bail!("did not get ticket");
};
let mut flight_data = client.do_get(ticket.clone()).await.context("do get")?;
log_metadata(flight_data.headers(), "header");
let mut endpoint_batches: Vec<_> = (&mut flight_data)
.try_collect()
.await
.context("collect data stream")?;
batches.append(&mut endpoint_batches);
if let Some(trailers) = flight_data.trailers() {
log_metadata(&trailers, "trailer");
}
}
info!("received data");
Ok(batches)
}

However it only passes the ticket to DoGet, but ignores the location field of the respective endpoint.

To Reproduce
I'm not aware of system that uses different locations for GetFlightInfo and DoGet, but technically we're violating the protocol here. You could create a test server that runs on two ports and serves all but DoGet on one port and DoGet on the other, refusing the wrong method on the wrong port via an HTTP error.

Expected behavior
Use (or at least double-check) the location for the provided endpoints.

Additional context
We could fix that within the CLI code by potentially creating a new FlightSqlServiceClient if the location changed, however I think the FlightSqlServiceClient::do_get and FlightClient::do_get interface are somewhat misleading, because they suggest that you should use the same tonic::transport::Channel (and hence the same URI) for GetFlightInfo and DoGet -- which is not necessarily true. So while the bug manifests in the CLI, I think this is a Rust API bug.

@crepererum crepererum added arrow-flight Changes to the arrow-flight crate bug labels Oct 1, 2024
@crepererum crepererum changed the title FlightSQL (CLI) ignores endpoint locations FlightSQL (CLI) & Flight ignore endpoint locations Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow-flight Changes to the arrow-flight crate bug
Projects
None yet
Development

No branches or pull requests

1 participant