Add with_tokio_runtime to HTTP stores #4040
Conversation
```rust
use tracing::info;

#[derive(Debug)]
```
This module is not public, and so these changes are not breaking
```rust
/// This is unlike the public [`ClientOptions`](crate::ClientOptions) which contains just
/// the properties used to construct [`Client`](reqwest::Client)
#[derive(Debug, Clone, Default)]
pub struct ClientConfig {
```
The separation of ClientOptions and ClientConfig is perhaps a little contrived, but ClientConfig is a crate-private implementation detail, and so I think this is fine.
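For illustration, a minimal sketch of the kind of crate-private config this comment describes, assuming it bundles a pre-built `reqwest::Client` with the optional runtime handle visible elsewhere in the diff (`config.runtime`); the exact fields are assumptions, not the PR's actual definition:

```rust
use tokio::runtime::Handle;

/// Crate-private configuration holding already-constructed state, unlike the
/// public ClientOptions which only holds the properties used to build a Client
#[derive(Debug, Clone, Default)]
pub struct ClientConfig {
    /// Pre-built HTTP client used to issue requests
    pub client: reqwest::Client,
    /// Optional handle to a dedicated tokio runtime on which request futures
    /// are spawned; None means they run on the caller's runtime
    pub runtime: Option<Handle>,
}
```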
```rust
match config.runtime.as_ref() {
    Some(handle) => handle
        .spawn(fut)
```
It is worth highlighting that this only spawns the code that generates the Response; the Response streaming can and will take place in the calling context. This is perfectly acceptable, as the mio reactor registration will have occurred already and the futures plumbing is runtime agnostic.
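As a rough sketch of the dispatch being described, assuming a helper of this shape (the names and error handling are illustrative, not the PR's exact code): only the future that produces the `Response` is spawned onto the optional runtime, and the caller then reads the body from its own context.

```rust
use tokio::runtime::Handle;

/// Issue a request, optionally producing the Response on a dedicated runtime.
/// Body streaming happens wherever the returned Response is subsequently read.
async fn send_request(
    runtime: Option<&Handle>,
    client: &reqwest::Client,
    url: &str,
) -> Result<reqwest::Response, Box<dyn std::error::Error + Send + Sync>> {
    let fut = client.get(url).send();
    match runtime {
        // Connection setup and response headers are produced on the IO runtime
        Some(handle) => Ok(handle.spawn(fut).await??),
        // Without a dedicated runtime everything runs in the calling context
        None => Ok(fut.await?),
    }
}
```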
It might be worth putting your entire comment into the code as a code comment.
I'm not sure I follow your argument here. The underlying socket is registered with the IO runtime, and so is the mio reactor. However, we still cross-poll. So is our assumption that, when polling data from the IO runtime to which mio has just written, mio will never change its mind and "jump" to another runtime?
force-pushed from f2ee3d8 to 0165888
```rust
#[tokio::test]
async fn http_test() {
    /// Deletes any directories left behind from previous tests
    async fn cleanup_directories(integration: &HttpStore) {
```
This is necessary because we now run the test twice, and the directories left behind cause tests of list_with_delimiter to fail.
I have confirmed that this behaviour of returning common prefixes for empty directories is consistent with LocalFileSystem. The reason we don't run into this with LocalFileSystem is that it creates a new temp directory for each test.
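For context, a minimal sketch of what such a helper might look like, assuming the leftover directories show up as common prefixes under the root and that the WebDAV-backed `HttpStore` accepts a delete on a directory path (assumptions for illustration; the PR's actual helper may differ):

```rust
use object_store::{http::HttpStore, ObjectStore};

/// Deletes any directories left behind from previous tests so that
/// list_with_delimiter does not report stale common prefixes
async fn cleanup_directories(integration: &HttpStore) {
    let listing = integration
        .list_with_delimiter(None)
        .await
        .expect("list_with_delimiter failed");
    for prefix in listing.common_prefixes {
        // Remove the stale directory so the next run starts from a clean root
        integration.delete(&prefix).await.expect("delete failed");
    }
}
```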
I think this is OK, but IMHO quite risky since it assumes behavior that I'm not sure counts as a stable interface.
Marking as a draft whilst I think a bit more on this. Another option might be to do something similar to https://docs.rs/async-compat/latest/async_compat/ and return decorated types.
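To make the "decorated types" idea concrete, here is a minimal sketch of a stream wrapper in the spirit of async-compat, assuming it captures a `tokio::runtime::Handle` and enters that runtime's context on every poll; the type and its behaviour are illustrative, not the async-compat API itself.

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;
use tokio::runtime::Handle;

/// Stream decorator that enters a captured tokio runtime context on each poll,
/// so tokio-based resources created while polling register with that runtime
struct RuntimeStream<S> {
    handle: Handle,
    inner: S,
}

impl<S: Stream + Unpin> Stream for RuntimeStream<S> {
    type Item = S::Item;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<S::Item>> {
        let this = self.get_mut();
        // The guard restores the previous context when dropped after the poll
        let _guard = this.handle.enter();
        Pin::new(&mut this.inner).poll_next(cx)
    }
}
```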
Looping back to this, I think this problem is ill-formed. There are two major use-cases for this functionality:

- Using object_store outside of a tokio context
- Isolating IO onto a separate threadpool from compute
The first use-case is better served by integrating tokio at a higher level, e.g. using …

It is unclear how to handle the second use-case at a library level. The use of a second threadpool implies that the primary threadpool may have very high tail latencies. The problem is determining at what point this should result in back pressure on the underlying TCP connection. As written, this PR will not change the way that this backpressure occurs: should the task not get scheduled on the high tail latency threadpool, nothing will drain the TCP socket, and TCP backpressure will occur. The approach in #4015 instead uses a queue with capacity for a single chunk, which will delay this TCP backpressure very slightly. You could increase the queue size, or make a more sophisticated queue that buffers a given number of bytes, but it is unclear how users would control this buffering behaviour.

Taking a step back, this feels like the wrong way to solve this problem; ultimately IO should be segregated from compute at a meaningful application task boundary, rather than at the object_store interface. For example, AsyncFileReader::get_bytes could perform the IO to fetch a given chunk of data on a separate thread pool, as sketched below. This avoids object_store having to make decisions about how much buffering is too much, etc.

I am therefore going to close this PR.
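As an illustration of segregating IO at an application task boundary rather than inside object_store, a minimal sketch that fetches each byte range on a dedicated IO runtime; the struct and method names are hypothetical, and this is not the parquet `AsyncFileReader` trait itself.

```rust
use std::ops::Range;
use std::sync::Arc;

use bytes::Bytes;
use object_store::{path::Path, ObjectStore};
use tokio::runtime::Handle;

/// Reader that performs each range fetch on a dedicated IO runtime, leaving
/// object_store itself unaware of any threadpool separation
struct IsolatedReader {
    store: Arc<dyn ObjectStore>,
    location: Path,
    io_runtime: Handle,
}

impl IsolatedReader {
    /// Fetch a byte range on the IO runtime; the caller (e.g. a compute pool)
    /// only awaits the join handle
    async fn get_bytes(&self, range: Range<usize>) -> object_store::Result<Bytes> {
        let store = Arc::clone(&self.store);
        let location = self.location.clone();
        self.io_runtime
            .spawn(async move { store.get_range(&location, range).await })
            .await
            .expect("IO task panicked")
    }
}
```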
Which issue does this PR close?
Closes #.
Rationale for this change
This allows isolating IO into a separate pool, and allows using object_store outside of a tokio context.
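A minimal usage sketch of the API this PR proposes (it was never merged), assuming `with_tokio_runtime` accepts a `tokio::runtime::Handle`; the exact signature is an assumption.

```rust
use object_store::http::HttpBuilder;
use tokio::runtime::Builder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Dedicated runtime whose threads perform all HTTP IO; it must be kept
    // alive for as long as the store is used
    let io_runtime = Builder::new_multi_thread()
        .worker_threads(2)
        .enable_all()
        .build()?;

    // Hypothetical: hand the store a handle so request futures are spawned on
    // the IO runtime rather than on the caller's runtime (if any)
    let store = HttpBuilder::new()
        .with_url("https://example.com")
        .with_tokio_runtime(io_runtime.handle().clone())
        .build()?;

    let _ = store;
    Ok(())
}
```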
What changes are included in this PR?
Are there any user-facing changes?