-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-33671 / shutil.copyfile: use memoryview() with dynamic size on Windows #7681
Conversation
...as it's considerably slower for smaller files; e.g. with a 8MiB file: --- original ~/svn/cpython {33671-release-memoryview}$ ./python -m timeit -s "import shutil; f1 = open('f1', 'rb'); f2 = open('f2', 'wb')" "shutil.copyfileobj(f1, f2)" 200000 loops, best of 5: 1.64 usec per loop --- _copybinfileobj() ~/svn/cpython {33671-release-memoryview}$ ./python -m timeit -s "import shutil; f1 = open('f1', 'rb'); f2 = open('f2', 'wb')" "shutil.copyfileobj(f1, f2)" 100000 loops, best of 5: 3.3 usec per loop
...and use it from copyfile() only if WINDOWS and file size >= 128 MiB
Lib/shutil.py
Outdated
|
||
if _HAS_FCOPYFILE: | ||
# macOS | ||
elif _HAS_FCOPYFILE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both options are always exclusive? macOS might have sendfile() in the future, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now outdated: I updated code to always use fcopyfile on OSX and leave sendfile() alone. This way we avoid invoking sendfile() on OSX which would fail anyway.
Lib/shutil.py
Outdated
# considerable speedup by using a readinto()/memoryview() | ||
# variant of copyfileobj(), see: | ||
# https://github.com/python/cpython/pull/7160#discussion_r195162475 | ||
elif _WINDOWS and file_size >= 128 * 1024 * 1024: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind to add a top-level private constant rather than an hardcoded value? (Would it make sense to make it somehow public? Or add an optional parameter...?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is outdated, I modified code to always use the readinto() variant regardless of file size, see: #7160 (comment)
As per #7160 (comment) it appears using memoryview() on Windows is faster also for small files if memoryview() length is equal to file size.
@giampaolo: Please replace |
This is a follow up of #7160.
Changes:
memoryview()
with size == file size, see bpo-33671: efficient zero-copy for shutil.copy* functions (Linux, OSX and Win) #7160 (comment)memoryview
immediatelyhttps://bugs.python.org/issue33671