Windows host handling#25
Conversation
| else | ||
| { | ||
| // Fix windows paths | ||
| if (isset($result['path']) && preg_match("#^(\w:)#", $result['path'])) |
There was a problem hiding this comment.
I worry that this change is a bit too aggressive. Do we really want to add a slash to every single uri that doesn't start with a slash? Probably not.
So if this is the case for file:// uris (I think this is true) then this can be simplified significantly with:
if (isset($result['scheme']) && $result['scheme']==='file' && $result['path'][0]!=='/') {
$result['path'] = '/' + $result['path'];
}Aside from that tweak, I would also say that it can use a better comment explaining exactly why this needs to be done, because I'm worried fix windows paths might not make sense to a developer maintaining this.
|
Hi, thanks for your feedback. Yes, I think it can be simplified, but it doesn't add a slash to every file uri which doesn't start with a slash: It adds only a slash if it matches \w: lile "C:", etc. I would even expand this to match \w:/ ("C:") if ( isset($result['scheme']) && $result['scheme']==='file' &&
isset($result['path']) && preg_match("#^(\w:)\/#", $result['path']) ) {
$result['path'] = '/' + $result['path'];
$result['host'] = '';
}or without regex: if ( isset($result['scheme']) && $result['scheme']==='file' &&
isset($result['path']) && ctype_alpha($result['path'][0]) && $result['path'][1] ===':' && $result['path'][2] === '/' ) {
$result['path'] = '/' + $result['path'];
$result['host'] = '';
}Which version do you think is better? Sorry for the bad comment. I'll be more descriptive when I update the PR. |
12a44c7 to
e661fc2
Compare
|
Hi, I just updated the PR with the discussed changes |
|
Sorry for the slow responses here, but I'm down with merging this once the conflicts are resolved. Sorry for being over a year late, but thank you for making this change. |
|
tried to resolve the merge-conflict... lets see |
| { | ||
| // Add empty host and trailing slash to windows file paths (file:///C:/path) | ||
| if (isset($result['scheme']) && $result['scheme'] === 'file' && isset($result['path']) && | ||
| preg_match('/^(?<windows_path> [a-zA-Z]:(\/(?![\/])|\\\\)[^?]*)$/x', $result['path'])) { |
There was a problem hiding this comment.
The regex also matches something like C:\\abc\def.txt and C:\\abc/def.txt - as well as C:/abc/def.txt
In what situations does C:\\ happen?
@peterpostmann @staabm @evert - I have rebased this in #71 and want to make sure I understand all the combinations and why they need to be handled. (And I will write some comment lines in the code in that PR, to help future devs who have forgotten all the detail of Windows paths)
There was a problem hiding this comment.
I don't really have the context anymore to provide useful input im afraid =(
There was a problem hiding this comment.
https://docs.microsoft.com/en-us/dotnet/standard/io/file-path-formats is a place where I can find C:\\ ;
Console.WriteLine("Setting current directory to 'C:\\'");
But that is C# code, and the \\ is just to put a literal \ in the string. The actual string is:
Setting current directory to 'C:\'
So that does not give me an example of how code can get here and have C:\\ in the string, and be valid representing something in the MS world.
And a tried a few combinations of strings to send to parse_url and the returned value in "path" has exactly whatever \ characters are provided in the input string - there is no "magic escaping" going on in the returned "path" element. So I still don't understand why we need to have the check in the regex to match things like C:\\
There was a problem hiding this comment.
Parsing file:///C:\\abc\\def.txt is something you probably want to do to process path strings with escaped backslashes. But revisiting, I would simplify the logic and remove the restrictions. I think the idea was to only process C:\ (valid path), C:\\ (valid escaped path), C:/ (valid path with forward slashes), but not C:// because this is not a valid Windows path anyways. But since the fallback is parsing C:// nevertheless, we should do the same to avoid inconstancy.
// Add empty host and trailing slash to windows file paths (file:///C:/path)
if (isset($result['scheme']) && $result['scheme'] === 'file' && isset($result['path']) &&
preg_match('/^(?<windows_path> [a-zA-Z]:(\/|\\\\).*)$/x', $result['path'])) {
$result['path'] = '/' . $result['path'];
$result['host'] = '';
}This will just look for a single letter followed by a colon and a slash, which is consistent with the fallback
=== file:///C:\abc\def.txt:
old parse: scheme: file, path: C:\abc\def.txt
fallback: scheme: file, path: /C:\abc\def.txt
new parse: scheme: file, path: /C:\abc\def.txt
=== file:///C:\abc/def.txt:
old parse: scheme: file, path: C:\abc/def.txt
fallback: scheme: file, path: /C:\abc/def.txt
new parse: scheme: file, path: /C:\abc/def.txt
=== file:///C:/abc/def.txt:
old parse: scheme: file, path: C:/abc/def.txt
fallback: scheme: file, path: /C:/abc/def.txt
new parse: scheme: file, path: /C:/abc/def.txt
=== file:///C://abc/def.txt:
old parse: scheme: file, path: C://abc/def.txt
fallback: scheme: file, path: /C://abc/def.txt
new parse: scheme: file, path: /C://abc/def.txt
There was a problem hiding this comment.
@peterpostmann I implemented this kind of thing now in #71 - that seems to work fine.
|
Rebased and adjusted code and merged in PR #71 - thanks @peterpostmann |
Hi,
I think a found a bug on windows. For the uri
'file:///C:/path/file_a.ext'parsereturnsThis causes
resolve('file:///C:/path/file_a.ext', 'file_b.ext')to generate wrong URIs:file://C:/path/file_b.ext(one slash is missing, because the empty host is missing).parse_urlproduces the same result, which is correct with reagards to http://php.net/manual/en/wrappers.file.php, but it cannot be used to reconstruct file uris directly.The behavior of
_parse_fallbackis correct and has this output:parseshould also produce this output.Note: I had to remove the white spaces before and after the regex because I couldn't get it tested otherwise (using PHP 7.1)