Skip to content

Add fs.relative_to() function. Closes #2184 #7908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

zeehio
Copy link

@zeehio zeehio commented Oct 28, 2020

The usefulness of a rel_path() function has been discussed on #2184.

I have a personal project where a binary installed at $prefix/$bindir would look for a file on $prefix/$datadir/subdirectory. With rel_path() I can compute /../share/subdirectory or whatever it is the path to $prefix/$datadir/subdirectory from $prefix/$bindir and pass that as a #define variable to the executable(), avoiding hardcoding the prefix.

@zeehio zeehio changed the title Add relpath function. Closes #2184 Add rel_path function. Closes #2184 Oct 28, 2020
@mensinda
Copy link
Member

There is already the FS-Module and this function should be added there instead.

@mensinda
Copy link
Member

Also please consider using pathlib instead of os.path.*. We are (admittedly slowly) trying to replace all os.path.* code with pathlib and new code should thus use pathlib whenever possible.

@zeehio zeehio marked this pull request as draft October 28, 2020 10:28
@zeehio zeehio force-pushed the feature-rel-path branch 2 times, most recently from 8b37f78 to f3715e1 Compare October 29, 2020 22:17
@zeehio zeehio changed the title Add rel_path function. Closes #2184 Add fs.relative_to() function. Closes #2184 Oct 30, 2020
@zeehio
Copy link
Author

zeehio commented Oct 30, 2020

Following your advice I have added a fs.relative_to function using pathlib.

While doing that I found out that the use of os.path.relpath does not have a clear translation to pathlib (see pathlib note about this as well as this stack overflow question). I believe the implementation I came up with is clear enough. While I think there may be room for improvement in pathlib to make it more convenient for our use case, I believe the semantic the pathlib implementation provides for relative_to() is more simple and easy to understand than os.path.relpath().

Python 3.9 introduced PurePath().is_relative_to() but since meson depends on older python versions I resorted to try/except a couple of times.

Python 3.10 has an open pull request to improve relative_to() covering our use case python/cpython#19813. If it is merged, our code could be simplified.

I limited the meson implementation to absolute paths. Relative paths may be supported later if desired.

@zeehio zeehio marked this pull request as ready for review October 30, 2020 05:57
@zeehio zeehio force-pushed the feature-rel-path branch 2 times, most recently from e7cc323 to c31b6bb Compare November 14, 2020 15:18
@zeehio
Copy link
Author

zeehio commented Jan 11, 2021

Ping. If you don't mind I will rebase once I have the ok of this contribution. (I'd rather rebase only once)

# On windows:
fs.relative_to('c:\\proj1\\foo', 'c:\\proj1\\bar') # '..\\foo'
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar') # ERROR, since the relative path does not exist
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar', within: 'd:\\') # 'c:\\proj1\\foo', within allows to get an absolute path
Copy link
Member

@jpakkane jpakkane Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functionality is good, but the UI as it stands is a bit confusing. I had to read the documentation several times in order to understand what it does and why, Now this might just be a question of me being stupid, but it would seem prudent to try to get a nicer experience.

As an example, if the kwarg were called if_within or is_within its meaning would be more clear. It would indicate a condition rather than a general parameter of some sort.

Secondly, this is problematic in the Windows case if you don't directly control the paths yourself. If you get the file strings as arguments from somewhere, with values c:\foo and g:\bar you can't say that an absolute path is fine without extracting the drive letter as a substring. This is unnecessary work for users. Having a kwarg like allow_absolute_if_drive_letters_are_different would be simpler.

Finally, the native approach seems a bit limited as people may need to build different kinds of paths so it might be better to instead have a kwarg path_type with the values build, host, unix and windows with host being the default.

Copy link
Collaborator

@bonzini bonzini Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fs seems to be more obviously related to native paths. It may make sense to have path-related methods on the various machine objects instead (this is not filesystem manipulation and does not have the same pitfalls as other fs functions).

In that case, the function would not be implemented on fs at all since one would use the build machine object for native paths (instead, I don't see the benefit in hard-coding that one is dealing with POSIX or Windows paths).

Regarding the syntax, I'd propose something like

fs.relative_to('c:\\proj1\\foo', 'c:\\proj1\\bar')  # '..\\foo'
fs.relative_to('c:\\proj1\\foo', 'c:\\proj2\\bar', within: 'c:\\proj2)  # ERROR, the relative path is outside the "within" argument
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar')  # ERROR, for Windows the default "within" is the drive
fs.relative_to('c:\\proj1\\foo', 'c:\\proj2\\bar', within: 'c:\\proj2', allow_absolute: true)  # 'c:\\proj1\\foo'
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar', allow_absolute: true)  # 'c:\\proj1\\foo'

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bonzini Do we need the within if we have allow_absolute?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I see what you expect to have with each of those arguments

Copy link
Author

@zeehio zeehio Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, both the within and allow_absolute arguments are a consequence of other design decisions of the meson language:

  • allow_absolute would not be necessary if it was easy to do something like:
try:
    something = fs.relative_to(path1, path2)
except:
    something = path1
  • within would not be necessary if it was easy to do:
if fs.contains(within, path1):
    path1 = path1.relative_to(path2)

Are you sure you want to use arguments in fs.relative_to() to implement those features instead of working towards a try/except and an contains() functionality? It seems to me you are trying to fix a language issue and a missing feature with arguments, but maybe I miss something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the relative_to current behaviour from a functional perspective. I have implemented most of your suggestions:

relative_to

since 0.64.0

Given two paths, returns a version of the first path relative to the second.

Examples:

# On windows:
fs.relative_to('c:\\proj1\\foo', 'c:\\proj1\\bar')  # '..\\foo'
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar')  # ERROR, since the relative path does not exist

# On other systems:
fs.relative_to('/prefix/lib', '/prefix/bin')  # '../lib'
fs.relative_to('/prefix/lib/foo', '/prefix')  # 'lib/foo'

relative_to has some optional keyword arguments:

  • native: Can be set to true or false. Whether the relative paths are computed
    following the rules of the build machine (native: true) or the host (native: false).

  • allow_absolute: If set to true, instead of giving an error, relative_to()
    will return the original path. Useful if an absolute path is a reasonable
    fallback.

  • if_within: A path. relative_to(path1, path2, within: path3) returns
    the relative path of path1 with respect to path2ifpath1is found inpath3, otherwise it returns path1` unchanged.

Examples:

# On windows:
fs.relative_to('c:\\proj1\\foo', 'c:\\proj2\\bar', if_within: 'c:\\proj2')  # ERROR, the relative path is outside the "within" argument
fs.relative_to('c:\\proj1\\foo', 'c:\\proj2\\bar', if_within: 'c:\\proj2', allow_absolute: true)  # 'c:\\proj1\\foo'
fs.relative_to('c:\\proj1\\foo', 'd:\\proj1\\bar', allow_absolute: true)  # 'c:\\proj1\\foo'

# On other systems:
fs.relative_to('/project1/lib/foo', '/usr/bin', if_within: '/usr')  # ERROR: '/project1/lib/foo' not in '/usr'
fs.relative_to('/project1/lib/foo', '/usr/bin', if_within: '/usr', allow_absolute: 'true')  # '/project1/lib/foo'
fs.relative_to('/usr/lib/foo', '/usr/bin', if_within: '/usr')  # '../lib/foo'

@tristan957
Copy link
Member

Would be nice to get this in for 0.59 if possible.

Is there any reason this can't be implemented using os.path.relpath()?

@eli-schwartz
Copy link
Member

eli-schwartz commented Jul 1, 2021

Did you read @jpakkane's review comments explaining an unrelated reason why?

@tristan957
Copy link
Member

If you are referring to the drive issues on Windows, I now see how that would be a problem for os.path.relpath (after testing locally). If that is what you are referring to.

@dcbaker
Copy link
Member

dcbaker commented Jul 1, 2021

It would be nice if this gets rebased if it moved the typed_ args decorators as well.

Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code seems very complicated and confusing to use, and it feels like that is mostly because it erroneously considers pathlib to be a good thing.

@zeehio
Copy link
Author

zeehio commented Oct 31, 2022

After some time upstream, python has merged

import pathlib
p = pathlib.Path("/a").relative_to("/b", walk_up=True)

For python 3.12

I have backported the pull request at:

Which can be installed with:

python -m pip install backports-pathlib-relative-to
from backports.pathlib import pathlib

pathlib.Path("/a").relative_to("/b", walk_up=True)

Backporting this walk_up= argument back to python 3.7.

If you agree to that backported dependency I'd be happy to rebase the pull request one more time.

@tristan957
Copy link
Member

@zeehio Meson chooses not to have any dependencies, so anything that isn't optional from PyPI wouldn't work. You could however copy the code from your repo into this PR.

@eli-schwartz
Copy link
Member

eli-schwartz commented Oct 31, 2022

No backports, Meson needs to run on systems with python 3.7 and before pip is available.

Can we just use:

path_1 = pathlib.Path("/a")
path_2 = pathlib.Path("/b")

os.path.relpath(path_1, path_2)

as I suggested back in April? Use of pathlib.Path to construct the initial arguments is optional, but you can do that if it feels aesthetically pleasing to you.

The documentation for windows paths has been improved
showing examples where it is not possible to compute
a relative path and how to get an absolute path in that
situation.

Since the computation of relative paths is system dependent,
relative_to now accepts a "native" argument to use either
the build (native: true) or the host (native: false, default)
system as reference.

Tests for windows paths are now available and error messages
have been clarified as well.
@zeehio zeehio requested review from eli-schwartz and removed request for mensinda November 2, 2022 17:44

```meson
# On windows:
fs.relative_to('c:\\proj1\\foo', 'c:\\proj2\\bar', if_within: 'c:\\proj2') # ERROR, the relative path is outside the "within" argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that an error? Doc above says " if path1 is found in path3, otherwise it returns path1 unchanged", so shouldn't this return c:\\proj1\\foo?

Copy link
Author

@zeehio zeehio Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example was right, the docs were wrong.

If path1 is not found on path3, one can do two things: either we return path1 or we error.

  • Choosing to always error is the easy path, but it is not convenient to users because as far as I know meson doesn't have a try/catch.
  • Returning path1 as was documented may lead to confusion if both path1 and path2 happen to be relative paths. In that scenario we could not always know if the returned path is the path1 we gave as input (because it wasn't contained in path3) or if the outcome is the result of computing it relative to path2 (because path1 was contained in path3). Therefore the only safe behaviour is to return path1 if it is absolute, or raise an error. But for consistency, absolute paths can only be returned if allow_absolute : true, so that argument would need to be set as well to avoid the error.

I've pushed a commit to clarify the documentation as well

zeehio and others added 2 commits November 2, 2022 20:27
Co-authored-by: Xavier Claessens <xclaesse@gmail.com>
@tristan957
Copy link
Member

My PR has been merged. If you think you need other functionality, rebase it on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants