Description
Statement of the problem
Currently, basename and dirname have this documented, yet odd, behaviour:
julia> basename("/tmp/bar")
"bar"
julia> basename("/tmp/bar/")
""
julia> dirname("/tmp/bar")
"/tmp"
julia> dirname("/tmp/bar/")
"/tmp/bar"
What these functions do when there is a trailing path separator doesn't make any sense. For example, the Single Unix Specification, §3.170, doesn't allow empty file names; a filename is defined as follows:
A sequence of bytes consisting of 1 to {NAME_MAX} bytes used to name a file. The bytes composing the name shall not contain the <NUL> or <slash> characters.
Additionally, this doesn't match what the Unix utilities with the same names do:
% basename /tmp/bar
bar
% basename /tmp/bar/
bar
% dirname /tmp/bar
/tmp
% dirname /tmp/bar/
/tmp
I was told that this behaviour was borrowed from Python's os.path.basename and os.path.dirname, but this doesn't change the fact that it's meaningless.
This also makes it harder than necessary to automatically handle paths received from other functions, which may or may not have a trailing path separator. If you are aware of this oddness in basename and dirname, you have to do checks like the one at Line 235 in 85f4db2, and a similar one is used in BinaryBuilder as well.
Solution
My suggestion is to make basename and dirname ignore trailing path separators.
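To make the suggestion concrete, here is a minimal sketch of the intended semantics, written as wrappers around the current functions. The names strip_trailing_slashes, posix_basename and posix_dirname are hypothetical and not part of Base. The sketch only handles '/' as a separator and is not meant as the actual implementation.

```julia
# Hypothetical helper, not part of Base: drop trailing '/' characters,
# but keep a path made only of slashes as the root "/".
# Assumption: POSIX-style '/' separators only.
function strip_trailing_slashes(path::AbstractString)
    stripped = rstrip(path, '/')
    # A non-empty path that becomes empty consisted only of slashes.
    return isempty(stripped) && !isempty(path) ? "/" : String(stripped)
end

# Hypothetical wrappers sketching the proposed behaviour:
# the trailing separators are ignored before splitting the path.
posix_basename(path::AbstractString) = basename(strip_trailing_slashes(path))
posix_dirname(path::AbstractString) = dirname(strip_trailing_slashes(path))

posix_basename("/tmp/bar/")  # "bar", same as for "/tmp/bar"
posix_dirname("/tmp/bar/")   # "/tmp", same as for "/tmp/bar"
```

With these wrappers, "/tmp/bar" and "/tmp/bar/" give the same results, which is what the Unix utilities shown above do for these two inputs. Other edge cases (for example the root directory or paths without any separator) are not addressed by this sketch.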
This was already discussed in #33000 and resolved by explicitly documenting the current behaviour in #37580. Here I'd like to ask to consider making the breaking change for Julia v2.0, basically what #33021 tried to do. I'd argue that no one in their right mind should rely on basename reporting an empty string: if they want to know whether a path ends with a path separator, that's the job of isdirpath. Therefore, my expectation is that this change will cause very little breakage in practice.
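For illustration, code that currently relies on the empty string could ask the question directly. This is a small sketch using the same paths as in the examples above; the variable names are only for illustration.

```julia
# isdirpath is an existing Base function; it answers the
# "does this path end with a separator?" question directly.
with_sep = isdirpath("/tmp/bar/")    # true: the path ends with a separator
without_sep = isdirpath("/tmp/bar")  # false: no trailing separator

# Current behaviour: the same information is only available
# indirectly, through basename returning an empty string.
indirect = isempty(basename("/tmp/bar/"))  # true today
```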