Fix Windows Absolute Path Parsing and Remove HTTP Assumption #1837

mre · 2025-09-03T22:34:31Z

This commit fixes issue #972 where Windows absolute paths like C:\path were incorrectly parsed as URLs with scheme C:.

I also took that as an opportunity to finally remove the assumption that non-existent inputs get parsed as HTTP URLs. Previously, an input foo would have been automatically converted to http://foo/, which is a bold and mostly incorrect presumption. You can read more about it in the issue here.

Key changes in this PR:

- Added WindowsPath newtype with proper detection using pattern matching
- Moved Windows path logic to separate submodule for better organization
- Removed automatic HTTP assumption (foo -> http://foo/)
- Added InvalidInput error type with helpful error messages
- Updated all tests to reflect new behavior

Fixes #972 and #1595

katrinafyi · 2025-09-08T00:23:00Z

I have a few quick comments. The first, and easiest, is that lowercase drive letters should be permitted. The second is that I think the Unix and Windows logic should be unified so Windows also gains helpful hints like the full URL suggestion. The third is that (imo) it would be preferable to avoid custom WindowsPath parsing and logic. Windows paths are notoriously complicated. Is there a reason why using PathBuf on Windows is not sufficient? Is there a need to recognise Windows paths on Unix platforms?
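To make the PathBuf question concrete, here is a minimal sketch of how `std::path::Path::is_absolute` behaves. The helper name `is_absolute_fs_path` is made up for illustration and is not part of lychee. The check depends on the target platform: a drive-letter path only counts as absolute when compiled for Windows, where lowercase drive letters are accepted as well.

```rust
use std::path::Path;

/// Hypothetical helper (not lychee's API): defer to the standard library
/// instead of hand-parsing drive letters.
///
/// On Windows targets, `C:\path` and `c:\path` are absolute, while the
/// drive-relative `C:path` is not. On Unix targets, only paths starting
/// with `/` are absolute, so `C:\path` is treated as a relative path.
fn is_absolute_fs_path(input: &str) -> bool {
    Path::new(input).is_absolute()
}
```

If Windows paths had to be recognised on Unix hosts too, this check alone would not be enough, which is the trade-off behind the last question above.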

It would also be nice if there was a CI job to run Windows tests.

mre · 2025-09-08T13:06:14Z

Thanks for the feedback; I'll look into it.

mre · 2025-10-09T08:24:18Z

Sorry for dropping the ball on this. Just a reminder of what needs to be done so that I can quickly jump back in when I find the time:

- Rebase
- Support lowercase Windows paths
- Try to unify Unix and Windows logic
- Try to use PathBuf for Windows instead of WindowsPath


Okay, this is slightly annoying. There are a lot of edge cases in input validation (who would have guessed?). I hope to have reached an agreeable middle ground by splitting validation into an early stage for inputs that are clearly file paths and a deferred late stage for inputs that are not (but might be). Not sure if we can do any better at this point. This also means that one test for a non-existent file now falls into the early-validation category and returns a better (I think) error message. That's lovely.

katrinafyi · 2025-12-17T02:10:40Z

lychee-lib/src/types/input/input.rs

                    }
                    return;
                }
                InputSource::FsPath(ref path) => {


if you tack a guard (`if ...`) onto the match arm, you can avoid indenting the entire arm body.

Suggested change

- InputSource::FsPath(ref path) => {
+ InputSource::FsPath(ref path) if !skip_missing => {
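For readers less familiar with match guards, a small self-contained sketch of the idea follows. The `InputSource` enum here is a simplified stand-in, and the `check` function with its `exists` parameter is invented for the example; none of this is the actual code in `input.rs`.

```rust
/// Simplified stand-in for lychee's input source type (assumption).
enum InputSource {
    FsPath(String),
    Stdin,
}

/// Returns an error message if a file-path input is missing and
/// `skip_missing` is off; every other case falls through to `None`.
fn check(source: &InputSource, skip_missing: bool, exists: bool) -> Option<String> {
    match source {
        // The guard keeps the arm flat: without it, the body would need a
        // nested `if !skip_missing { ... }` and one more level of indentation.
        InputSource::FsPath(path) if !skip_missing && !exists => {
            Some(format!("missing input: {}", path))
        }
        // When the guard is false, matching continues with the arms below.
        _ => None,
    }
}
```

One thing to keep in mind with this style: when the guard fails, the value is handed to the later arms, so a catch-all arm has to do the right thing for the skipped `FsPath` case as well.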

katrinafyi · 2025-12-17T02:31:43Z

lychee-lib/src/types/error.rs

+    /// The given input is neither a valid file path nor a valid URL
+    #[error("{0}")]
+    InvalidInput(String),


the doc comment here already prescribes the function of this error, so the error should be made less flexible. i.e., the message "Input '{input}' not found as file and not a valid URL. Use full URL (e.g., https://example.com) or check file path." should be moved into #[error(...)] and the error's data should be the input rather than an arbitrary error message.

or, the doc comment can be made more general.
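A rough sketch of the first option is shown here. To keep it self-contained it implements `Display` by hand instead of using `thiserror`; with `thiserror`, as in the diff above, the same format string would sit in the `#[error("...")]` attribute with `{0}` as the placeholder. The enum name `ErrorKind` and its single variant are a simplified stand-in for the real type.

```rust
use std::fmt;

/// Simplified stand-in for the error enum in `lychee-lib/src/types/error.rs`.
#[derive(Debug)]
enum ErrorKind {
    /// The given input is neither a valid file path nor a valid URL.
    /// Carries the offending input, not a preformatted message.
    InvalidInput(String),
}

impl fmt::Display for ErrorKind {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            // The wording lives in one place, next to the variant it describes.
            ErrorKind::InvalidInput(input) => write!(
                f,
                "Input '{}' not found as file and not a valid URL. \
                 Use full URL (e.g., https://example.com) or check file path.",
                input
            ),
        }
    }
}

impl std::error::Error for ErrorKind {}
```

Callers then construct the error as `ErrorKind::InvalidInput(input)` and cannot accidentally attach an unrelated message to it.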

katrinafyi · 2025-12-17T02:41:19Z

lychee-lib/src/types/input/source.rs

        }

-        // We use [`reqwest::Url::parse`] because it catches some other edge cases that [`http::Request:builder`] does not
+        // Detect drive-letter paths with `Path::is_absolute()` This handles


I'm really sorry to do this to you but... why is this so complicated? I was hoping that removing the HTTP assumption would make it shorter, not longer.

I think that having a heuristic for things which are unambiguous "enough" to bypass skip-missing is weird and unexpected. The --skip-missing help would have to be changed to reflect this, but it's really hard to explain to the user (what counts as ambiguous? it's completely arbitrary).

It could be simplified a lot by changing --skip-missing slightly:

- If a command-line input is neither a URL nor a glob and doesn't exist, it will always be reported as an error. This avoids needing to decide if it's unambiguous enough.

- skip-missing only applies to "discovered" inputs, covering failures such as permission denied while traversing directories or being unable to read a file.
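As an illustration only, the two rules above could be sketched roughly as follows. All names here (`Origin`, `Outcome`, `handle_unreadable`) are hypothetical and not taken from lychee or from the review; the function is assumed to be called only once an input has already turned out to be missing or unreadable.

```rust
/// Where an input came from (hypothetical; not lychee's types).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Origin {
    /// Typed by the user on the command line.
    CommandLine,
    /// Found while traversing a directory or expanding a glob.
    Discovered,
}

/// What to do with an input that could not be read.
#[derive(Debug, PartialEq)]
enum Outcome {
    Skip,
    Fail,
}

/// Decides how to treat a missing or unreadable input.
fn handle_unreadable(origin: Origin, skip_missing: bool) -> Outcome {
    match origin {
        // Rule 1: explicit command-line inputs always fail,
        // regardless of the flag, so no ambiguity heuristic is needed.
        Origin::CommandLine => Outcome::Fail,
        // Rule 2: the flag only affects inputs found during traversal.
        Origin::Discovered if skip_missing => Outcome::Skip,
        Origin::Discovered => Outcome::Fail,
    }
}
```

The point of the split is that the decision depends only on where the input came from and on the flag, not on how path-like the input string looks.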

With this change, the code would look something like this.

No worries. Thanks for the feedback. Your solution looks very clean. I will look into integrating it into my PR.

.github/workflows/ci.yml


This commit removes the restriction that only HTTP and HTTPS URLs are accepted as input sources. Now any URL with a scheme longer than one character is accepted, following the reviewer's suggestion for better future compatibility.

The previous approach was overly restrictive in the input parsing stage, rejecting schemes like ftp://, file://, and mailto: as invalid file paths. However, the existing filter infrastructure already supports arbitrary schemes through the --scheme flag, which accepts all schemes by default when no specific schemes are configured.

By accepting all URL schemes at the input parsing level, we enable better extensibility. When lychee gains support for new protocols like FTP or enhanced file:// handling, existing URLs with those schemes will automatically work without requiring changes to the input parser.

The change also aligns with the original design where scheme filtering happens at the checking stage rather than the parsing stage. Users can still restrict which schemes to check using the existing --scheme CLI option.

Updated the test suite to reflect the new behavior, replacing the test that expected scheme rejection with one that verifies scheme acceptance for future compatibility.
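The "scheme longer than one character" rule can be illustrated with a small standard-library-only sketch. The helper `url_scheme` is hypothetical; the actual parser is not shown in this thread and most likely relies on a URL parsing crate instead of a hand-written check. The character rules follow the scheme grammar of RFC 3986 (a letter followed by letters, digits, `+`, `-`, or `.`).

```rust
/// Hypothetical helper: returns the scheme if `input` starts with a URL
/// scheme that is longer than one character, and `None` otherwise.
fn url_scheme(input: &str) -> Option<&str> {
    // Everything before the first ':' is the scheme candidate.
    let (scheme, _rest) = input.split_once(':')?;
    let mut chars = scheme.chars();
    let first = chars.next()?;
    let well_formed = first.is_ascii_alphabetic()
        && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'));
    // A single letter followed by ':' is treated as a Windows drive letter
    // (for example `C:\path`), not as a URL scheme.
    if well_formed && scheme.len() > 1 {
        Some(scheme)
    } else {
        None
    }
}
```

Under this sketch, `https://example.com`, `mailto:user@example.com`, and `file:///tmp/a.md` all yield a scheme, while `C:\path` and a bare `foo` do not, so the drive-letter case from #972 never reaches URL handling.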

mre force-pushed the absolute-local-windows-paths branch 5 times, most recently from ac066ce to c1d2eee on September 4, 2025 10:39

This was referenced Sep 6, 2025

Absolute local paths on Windows not recognized #972

Open

Do not assume input to be a URL if the local path doesn't exist #1595

Open

mre added the triage label Sep 23, 2025

thomas-zahner force-pushed the master branch 2 times, most recently from fcdf77c to e0912ab on October 21, 2025 12:53

mre added 6 commits December 16, 2025 10:50

Refactor WindowsPath implementation and move tests

86a9738

- Replace From<WindowsPath> for PathBuf with as_path() method for better API - Move WindowsPath unit tests to the windows_path module for better organization - Fix unused import warning

Format and add link to issue

f7f8c46

Update test for inputs without scheme (example.com)

eed6d90

Fix test_handle_nonexistent_relative_paths_as_input test

6e72c3b

Update test expectation to match new error message format. The error now occurs during file content reading rather than input parsing, which is the correct behavior for relative paths.

Address PR feedback

07b569a

Address feedback by supporting lowercase Windows paths, refactoring to use PathBuf's is_absolute() instead of custom WindowsPath parsing, unifying error messages across platforms, and adding a Windows CI test job

mre force-pushed the absolute-local-windows-paths branch from c1d2eee to 07b569a on December 16, 2025 11:14

mre added 7 commits December 16, 2025 13:26

Fix clippy lint

b715c7e

Try to use vendored OpenSSL

f8377f8

Respect skip-missing to avoid failing on missing files if the flag is set

b0309e1

Ignore failing help text test on Windows

d0c9e53

fix import

a40aa2d

fix relative path handling

38d1240

katrinafyi reviewed Dec 17, 2025


mre added 2 commits December 18, 2025 16:23

Simplify input handling

c8fb0a1

This is based on the awesome feedback by katarinafyi

Add (hopefully helpful) comment

d6f1cc6

mre added 3 commits December 18, 2025 16:52

formatting

ba49924

Fix test

f4520b3
