
pathlib ABCs: yield progress reports from WritablePath._copy_from() #131636


Draft · wants to merge 4 commits into main

Conversation


@barneygale barneygale commented Mar 23, 2025

Make [Writable]Path._copy_from() yield (target, source, part_size) tuples when copying files and directories. A tuple with part_size=0 is emitted for every path encountered, and further tuples with part_size>0 may be emitted when copying regular files.

This should allow anyio.Path to wrap _copy_from() and make it cancelable.

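For illustration, a minimal sketch of the proposed yield protocol, written against plain `os` paths rather than the real `WritablePath._copy_from()` (the `copy_with_progress` name and the chunk size are illustrative, not part of the PR):

```python
import os

CHUNK_SIZE = 64 * 1024  # illustrative; the real implementation may differ

def copy_with_progress(source, target):
    """Copy *source* to *target*, yielding (target, source, part_size) tuples.

    A tuple with part_size=0 is emitted for every path encountered; further
    tuples with part_size > 0 are emitted per chunk of regular-file data.
    """
    if os.path.isdir(source):
        os.makedirs(target, exist_ok=True)
        yield (target, source, 0)
        for name in sorted(os.listdir(source)):
            yield from copy_with_progress(
                os.path.join(source, name), os.path.join(target, name))
    else:
        yield (target, source, 0)
        with open(source, 'rb') as src, open(target, 'wb') as dst:
            while chunk := src.read(CHUNK_SIZE):
                dst.write(chunk)
                yield (target, source, len(chunk))
```

Because the work happens lazily as the generator is advanced, a wrapper such as `anyio.Path` can interleave cancellation checks between steps.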
@barneygale (Contributor, Author)
Sorry it took me ages to get around to this, Thomas. I'm happy to adjust the patch to make your life as easy as possible, so let me know what you'd like!

@graingert (Contributor)

Amazing! I'll try to integrate this into an anyio PR ASAP.

@barneygale (Contributor, Author)

FWIW, the `_copy_from()` method has an underscore prefix because I haven't yet found a good justification to add it to the public `pathlib.Path` interface, and "it's needed in the ABCs" doesn't cut it while the ABCs are still private. Frustratingly, it's the only method in this category.

I wonder if converting it to a generator changes the picture? It might be useful for non-anyio users who want to track the progress of large copies. If so, I guess we'd rename it to `iter_copy_from()` or similar.
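As an illustration of that use case, a caller could consume the tuples to report progress. This `report_progress` helper is hypothetical; it assumes only an iterable of `(target, source, part_size)` tuples as described in the PR:

```python
def report_progress(events, total_bytes):
    """Consume (target, source, part_size) tuples, printing running totals.

    *events* is any iterable yielding the proposed progress tuples;
    *total_bytes* is the expected total, computed ahead of time by the caller.
    Returns the number of bytes actually copied.
    """
    copied = 0
    for target, source, part_size in events:
        copied += part_size
        if part_size:  # part_size == 0 just marks a path being visited
            print(f"{copied}/{total_bytes} bytes -> {target}")
    return copied
```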

```diff
@@ -1105,7 +1105,7 @@ def copy(self, target, **kwargs):
         if not hasattr(target, 'with_segments'):
             target = self.with_segments(target)
         ensure_distinct_paths(self, target)
-        target._copy_from(self, **kwargs)
+        list(target._copy_from(self, **kwargs)) # Consume generator.
```
@graingert (Contributor) commented Mar 26, 2025

Per the recipes in the itertools docs, the best way to consume a generator is `collections.deque`:

```python
from collections import deque
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)
```

It's a neat little shortcut: when `maxlen == 0`, deque's C implementation takes a fast path that just drains the iterator:

```c
if (maxlen == 0)
    return consume_iterator(it);
```

Suggested change:

```diff
-        list(target._copy_from(self, **kwargs)) # Consume generator.
+        collections.deque(target._copy_from(self, **kwargs), maxlen=0) # Consume generator.
```

@graingert (Contributor)

Looking really good! Ideally I'd like to see progress support for move(_into) and rmdir too.

Also, it looks like move will double the storage usage until it finishes copying and starts deleting. Would it be possible to move one file at a time? If I cancel the operation, I'd expect about the same total amount of storage consumed.

e.g. `Path._copy_from(remove_source=True)`

(I get this is a huge ask!)

@barneygale (Contributor, Author)

That feels a tad dangerous to me, because a failed transfer would leave the user with two incomplete directory trees to reconcile.

(`Path.move()` tries `os.replace()` first, which avoids the issue when the source and target are on the same filesystem.)

@graingert (Contributor)

OK, that's fair. But for move I'd still like progress for the delete; perhaps deletes could be tuples of `(path, -1)`?
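Under that suggested convention, a consumer could tell copies from deletes by the sign of the count alone. A hypothetical sketch (the `summarize` helper and the two-item `(path, part_size)` tuples follow the comment above, not anything merged in the PR):

```python
def summarize(events):
    """Split (path, part_size) progress events into copy and delete totals.

    Uses the suggested convention: part_size == -1 marks a deleted path,
    while part_size >= 0 reports bytes copied (0 = path merely visited).
    Returns (copied_bytes, deleted_paths).
    """
    copied_bytes = 0
    deleted_paths = 0
    for path, part_size in events:
        if part_size == -1:
            deleted_paths += 1
        else:
            copied_bytes += part_size
    return copied_bytes, deleted_paths
```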
