Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-32689: Updates shutil.move to allow for Path objects to be used as source arg #15326

Merged
merged 16 commits into from
Oct 1, 2019

Conversation

maxwellmckinnon
Copy link
Contributor

@maxwellmckinnon maxwellmckinnon commented Aug 18, 2019

Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.

This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.

This adds the outstanding test request from that PR @vstinner (#5393).

Test fails without the change, passes with it, along with every other test in test_shutil.

Some variants were experimented with to make the one line change and the most performant one was picked.

Added Test for PathLike directory destination, the current fail case

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================

After change:

========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================

Running all the tests in test_shutil.py

╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
...
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================

Performance Considerations

Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

real_dst = os.path.join(dst, _basename(src))
becomes
real_dst = Path(dst) / Path(src).name

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step.

In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.

https://bugs.python.org/issue32689

Automerge-Triggered-By: @gvanrossum

emilyemorehouse and others added 3 commits February 13, 2018 14:13
…be used as source argument by casting to string before performing rstrip. Added documentation on why rstrip needs to be used in the helper _basename method.
…ctory in the form of os.PathLike (pathlib.Path(dir)).
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

Copy link
Member

@brandtbucher brandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor nitpicks, but otherwise looks good! dst is already pathlike-safe, correct?

I'd also add a note to the docs for this in Doc/library/shutil.rst:

.. versionchanged:: 3.9
      Accepts a :term:`path-like object` for both *src* and *dst*.

I also vote -1 on the Path slowdown, for the reasons you mentioned.

maxwellmckinnon and others added 7 commits August 28, 2019 02:18
Will this fail the linter again?

Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>
Reverting the whitespace edit as the tests didn't pass
Fixed spacing failing doc building test
@maxwellmckinnon
Copy link
Contributor Author

It was a bit of digging to figure out what the CI failures were (the message to fix only works on linux), but it was a whitespace, fixed by running reindent.py -v ~/dev/cpython/Lib/shutil.py

@brandtbucher Ready to get this in?

@brandtbucher
Copy link
Member

@maxwellmckinnon: I'd still recommend removing the new isolated line of whitespace and cutting down on that docstring, but I think this is otherwise ready for core review.

CC: @giampaolo @emilyemorehouse

Copy link
Member

@ammaraskar ammaraskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just gonna ping Emily on the original PR to see if she wants to finish off the work herself.

maxwellmckinnon and others added 4 commits September 18, 2019 15:30
Co-Authored-By: Ammar Askar <ammar_askar@hotmail.com>
Remove whitespace line after versionchanged:: 3.9
Added test for destination pathlike case as well (currently working but best to enforce the behavior!)
…quash this later -- ran Tools/scripts/reindent.py -v ~/dev/cpython/Lib/test/test_shutil.py
@maxwellmckinnon
Copy link
Contributor Author

Sorry for all the stray commits. I assume they will be squashed by the integrator? Or should I rebase and push as one commit now?

@ammaraskar
Copy link
Member

They'll be squashed by the integrator, no need to worry :)

@maxwellmckinnon
Copy link
Contributor Author

From Emily over email: “I haven't had the time to finish it, feel free to propose yours!”

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let this go into 3.9.

@miss-islington miss-islington merged commit cf57cab into python:master Oct 1, 2019
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
…s source arg (pythonGH-15326)

Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.

This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.

This adds the outstanding test request from that PR @vstinner (python#5393).

Test fails without the change, passes with it, along with every other test in test_shutil.

Some variants were experimented with to make the one line change and the most performant one was picked.


# Added Test for PathLike directory destination, the current fail case

```
Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================
```

After change:

```
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================
```

Running all the tests in test_shutil.py
```
╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
...
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================
```

# Performance Considerations
Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

`real_dst = os.path.join(dst, _basename(src))`
becomes
`real_dst = Path(dst) / Path(src).name`

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step. 

```
In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

```
In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.



https://bugs.python.org/issue32689



Automerge-Triggered-By: @gvanrossum
ericmehl added a commit to ericmehl/gaffer that referenced this pull request Jan 30, 2023
We were leaving behind the temporary extraction directory which got included in the release archive.

Note that the use of `str( p )` in `shutil.move()` is needed until Python 3.9, where a parsing bug is fixed : python/cpython#15326
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants