Fix resolve_pattern for local symlinked files #7973

Open
KOKOSde wants to merge 1 commit into huggingface:main from KOKOSde:fix/symlink-glob-local-files

KOKOSde commented Jan 31, 2026

Fix resolve_pattern for local symlinked files.

Problem: on the local file:// filesystem, fsspec can report symlinks as type=="other" and omit the islink flag, so resolve_pattern skips symlinked files.

Fix: when protocol=="file", treat os.path.islink(filepath) as a link candidate and include it if it resolves to a regular file.
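
A minimal sketch of the check described above, not the actual diff. The helper name `is_matchable_file` and its signature are hypothetical; `info` stands in for one entry of the detail dict that a glob call returns, and the real condition inside resolve_pattern may be structured differently.

```python
import os


def is_matchable_file(filepath: str, info: dict, protocol: str) -> bool:
    """Hypothetical helper: decide whether a globbed entry counts as a file.

    `info` mimics one fsspec detail entry (keys such as "type" and "islink").
    """
    # Regular files are always accepted.
    if info.get("type") == "file":
        return True

    # A link candidate is anything the filesystem flags as a symlink...
    link_candidate = bool(info.get("islink"))

    # ...or, on the local filesystem only, anything the OS reports as a
    # symlink, which covers entries reported as type "other" without "islink".
    if protocol == "file" and os.path.islink(filepath):
        link_candidate = True

    # Accept the link only if it resolves to a regular file, so dangling
    # links and links to directories are still excluded.
    return link_candidate and os.path.isfile(os.path.realpath(filepath))
```

Restricting the `os.path.islink` call to `protocol == "file"` matters because `filepath` is only meaningful to the local OS in that case; for remote filesystems the sketch falls back to whatever the `islink` flag says.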

Includes a regression test in tests/test_data_files.py.

KOKOSde force-pushed the fix/symlink-glob-local-files branch from 62505f8 to c70d132 on February 4, 2026 00:07

Some fsspec versions report local symlinks as type='other' and may omit
the 'islink' flag. This caused symlinked parquet files (common in HF cache
layouts with blob storage) to be excluded from pattern matching.

Changes:
- Explicitly check os.path.islink() for local file protocol
- Add regression test that forces the type='other' scenario

Fixes huggingface#7084

KOKOSde force-pushed the fix/symlink-glob-local-files branch from c70d132 to fefb65d on February 5, 2026 05:49