Skip to content

Conversation

@crepererum
Copy link
Contributor

Which issue does this PR close?

Closes #116.

Rationale for this change

Change the ObjectStore core trait to have a single, extensible copy operation. This helps #385 and #297.

Also adds extensions similar to
apache/arrow-rs#7170
and
apache/arrow-rs#7213 .

What changes are included in this PR?

Interface change, see section below.

Are there any user-facing changes?

  • new core method copy_opts alongside CopyOptions & CopyMode
  • copy & copy_if_not_exists are moved to ObjectStoreExt

Change the `ObjectStore` core trait to have a single, extensible copy
operation. This helps apache#385 and apache#297.

Also adds extensions similar to 
apache/arrow-rs#7170
and
apache/arrow-rs#7213 .

Closes apache#116.
@crepererum crepererum marked this pull request as ready for review November 20, 2025 10:42
@crepererum crepererum requested a review from alamb November 20, 2025 10:43
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @crepererum

I found reviewing with whitespace blind diff easier to understand: https://github.com/apache/arrow-rs-object-store/pull/548/files?w=1

I have a few suggestions related to API but I don't think they are required

cc @kylebarron

/// These extensions are ignored entirely by backends offered through this crate.
///
/// They are also eclused from [`PartialEq`] and [`Eq`].
/// They are also excluded from [`PartialEq`] and [`Eq`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

#[derive(Debug, Clone, Default)]
pub struct CopyOptions {
/// Configure the [`CopyMode`] for this operation
pub mode: CopyMode,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make all these fields pub I don't think we will be able to add new fields (e.g a different target object store) without a breaking API change

So I would suggest either marking these as non-pub or marking the struct as non_exhaustitve

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding new fields IS a breaking change by design, implementations shouldn't just ignore them. This is also in line with all the other options structs. This is also enforced by fully destructing the options in the respective methods (so you cannot forget to handle new fields).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was that one of the stated rationales for extracting options into CopyOptions is to help enable

However, it seems like if we go with this API we will not be able to add fields to CopyOptions to enable cross bucket copying without another breaking API change

Perhaps I am missing something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to be able to support cross-bucket copies in the future without a breaking change if possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot square the circle here: having a simple API (esp. not having hundreds of specialized methods or traits) + no breaking change is IMHO not gonna work. TBH I don't see what's the reason for avoiding breaking changes, it's a 0.x crate after all. And it's the question of "how much breaking" it is. We're not re-designing the entire interface for that again.

Different example: if we would add "attributes" to the copy operation (which might be actually happening at one point), then a store implementation HAS TO consider them. You cannot just YOLO it and drop that information to the floor. Hence, this would be a breaking change, not only because of the interface change, but purely based on the fact that there's a semantic change/addition to the operation that implementation HAVE TO consider.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possibility would be to setup that style as part of this PR (first for CopyOptions) and then apply it to other options / attributes in a series of follow on PRs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this, but I think that's probably a separate ticket, and we'd want to do it globally.

So we're now waiting for the ideal solution? Should we do that for 0.13.x?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions on the matter, I lack the time to drive such an initiative myself, but if someone else is willing to do so, I think it would be valuable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we have a PR here I vote we proceed with this PR and file a ticket to track a more easily migratable API going forward. I will file such a ticket

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.extensions = extensions;
self
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend also adding accessors and remove the pub from the fields for the reasons listed above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree as per #548 (comment)

@alamb
Copy link
Contributor

alamb commented Nov 25, 2025

Since in my opinion this PR is better than main, I'll merge it in

We can discuss improving the API more on

@alamb alamb merged commit 37ee020 into apache:main Nov 25, 2025
8 checks passed
@alamb
Copy link
Contributor

alamb commented Nov 25, 2025

Thanks @crepererum @kylebarron and @tustvold

crepererum added a commit to crepererum/arrow-rs-object-store that referenced this pull request Nov 26, 2025
Change the `ObjectStore` core trait to have a single, extensible rename
operation. This helps apache#385 and apache#297.

Also adds extensions similar to 
apache/arrow-rs#7170
and
apache/arrow-rs#7213 .

Also see apache#548 -- which did something similar for `copy`.
crepererum added a commit to crepererum/arrow-rs-object-store that referenced this pull request Nov 26, 2025
Change the `ObjectStore` core trait to have a single, extensible rename
operation. This helps apache#385 and apache#297.

Also adds extensions similar to 
apache/arrow-rs#7170
and
apache/arrow-rs#7213 .

Also see apache#548 -- which did something similar for `copy`.
crepererum added a commit that referenced this pull request Dec 4, 2025
Change the `ObjectStore` core trait to have a single, extensible rename
operation. This helps #385 and #297.

Also adds extensions similar to 
apache/arrow-rs#7170
and
apache/arrow-rs#7213 .

Also see #548 -- which did something similar for `copy`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add CopyOptions

4 participants