Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, streaming uploads are supported by ObjectStore::put_multipart. This returns an AsyncWrite, which provides a push-based interface for writing data.
However, this approach is not without issue:
- No obvious way to return PutResult for parts or the final object - In Object Store, return version & etag on multipart put. #86
- No obvious way to retry uploading of a single part - AsyncRead/AsyncWrite Poisoning Behaviour #87
- Unclear poisoning behaviour - AsyncRead/AsyncWrite Poisoning Behaviour #87
- Cannot support resuming uploads - continue existing multi-part upload #123; APIs for directly managing multi-part uploads and saving potential parquet footers arrow-rs#4608
- No obvious way to support Attributes - Add ObjectStore::put_multipart_opts arrow-rs#5435; Add Attributes API Exposing Broader Set of Object Metadata #94
- AsyncWrite design can easily lead to timeouts - Multipart upload can leave futures unpolled, leading to timeout #93; Async write_all sometimes silently fails to write to file tokio-rs/tokio#4296
- The way we implement poll_flush and poll_shutdown is not entirely in keeping with the AsyncWrite contract, e.g. poll_flush may not flush all buffered data
- The ecosystem hasn't settled on a single IO trait for AsyncWrite (because they all have their own issues) - https://github.com/nrc/portable-interoperable/blob/master/io-traits/README.md
- Data is copied potentially multiple times to/from buffers
- Parallelism is controlled internally by the ObjectStore implementation, with no way for callers to configure it
- AsyncWrite is tricky to integrate with synchronous code, despite the fact that the internal buffering should make it straightforward
- Cannot easily track upload progress - Any way to track the progress when uploading a big file with ObjectStore::put_multipart? arrow-rs#5117
Describe the solution you'd like
apache/arrow-rs#4971 added a MultipartStore abstraction that more closely mirrors the APIs exposed by object stores, avoiding all of the above issues. If we could devise a way to implement this interface for LocalFileSystem, we could then "promote" it into the ObjectStore trait and deprecate put_multipart. This would provide the maximum flexibility to users, whilst being in keeping with the objectives of this crate to closely hew to the APIs of the stores themselves.
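To make the contrast with the push-based AsyncWrite concrete, here is a minimal sketch of what a part-oriented interface allows. It is a hypothetical, synchronous, in-memory stand-in, not the MultipartStore trait from the crate: `InMemoryUpload`, `PartId`, `put_part`, `bytes_uploaded` and `complete` are names invented for illustration. The point is only that the caller picks the part index, gets a result back per part, can retry a single part, and can observe progress.

```rust
use std::collections::BTreeMap;

/// Hypothetical identifier returned for each uploaded part (e.g. an ETag).
#[derive(Debug, Clone, PartialEq)]
struct PartId(String);

/// Hypothetical in-memory stand-in for an in-progress multipart upload.
#[derive(Default)]
struct InMemoryUpload {
    parts: BTreeMap<usize, Vec<u8>>,
}

impl InMemoryUpload {
    /// Upload (or re-upload) a single part. The caller chooses the index,
    /// so parts can be sent in any order and retried individually.
    fn put_part(&mut self, part_idx: usize, data: Vec<u8>) -> PartId {
        let id = PartId(format!("part-{}-len-{}", part_idx, data.len()));
        self.parts.insert(part_idx, data);
        id
    }

    /// Bytes accepted so far, which a caller could use for progress reporting.
    fn bytes_uploaded(&self) -> usize {
        self.parts.values().map(Vec::len).sum()
    }

    /// Finish the upload, concatenating parts in index order.
    /// Fails if the caller's part list does not match what was uploaded.
    fn complete(self, parts: &[PartId]) -> Result<Vec<u8>, String> {
        if parts.len() != self.parts.len() {
            return Err(format!(
                "expected {} parts, got {}",
                self.parts.len(),
                parts.len()
            ));
        }
        Ok(self.parts.into_values().flatten().collect())
    }
}
```

A caller could upload part 1 before part 0, re-send part 0 after a failure, and only then call `complete` with the collected `PartId`s; none of this is expressible through a single AsyncWrite.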
The key observation that makes this possible is that we already recommend MultipartStore be used with fixed-size chunks for compatibility with R2. We could therefore require this for LocalFileSystem, in turn allowing it to support out-of-order / parallel writes, as the file offsets can be determined from the part index.
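The offset calculation can be sketched as follows. This is only an illustration under the assumption that every part except the last is exactly `part_size` bytes; `part_offset` and `write_part` are hypothetical helpers, not part of the crate, and concerns such as staging files, locking and cleanup on abort are ignored.

```rust
use std::fs::OpenOptions;
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Byte offset of a part, assuming all parts but the last are `part_size` bytes.
fn part_offset(part_idx: usize, part_size: u64) -> u64 {
    part_idx as u64 * part_size
}

/// Write one part into the destination file at the offset implied by its index.
/// Parts may arrive in any order because each write is positioned independently.
fn write_part(path: &Path, part_idx: usize, part_size: u64, data: &[u8]) -> io::Result<()> {
    // Only the final part may be shorter than `part_size`; none may be longer.
    if data.len() as u64 > part_size {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "part exceeds the fixed part size",
        ));
    }
    // Open without truncating so previously written parts are preserved.
    let mut file = OpenOptions::new().write(true).create(true).open(path)?;
    file.seek(SeekFrom::Start(part_offset(part_idx, part_size)))?;
    file.write_all(data)
}
```

With a part size of 4, writing part 2, then part 0, then part 1 yields the same file as writing them in order, since each part lands at `part_idx * part_size`.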
apache/arrow-rs#5431 and apache/arrow-rs#4857 added BufWriter and BufReader, and these would be retained to preserve compatibility with the tokio ecosystem and provide a more idiomatic API on top of this.
Describe alternatives you've considered
I briefly considered a put_stream API; however, this doesn't resolve many of the above issues.
We could also just implement MultipartStore for LocalFileSystem, whilst retaining the current put_multipart. This would allow downstreams to opt in to the lower-level API if they so wished.
We could also modify put_multipart to return something other than AsyncWrite, possibly something closer to PutPart.
Additional context
Many of the stores also support composing objects from others; this might be something to consider in this design - #121