.glob("**/filename") returns incorrect results #1380

Closed
@mariosasko

Reproducer (requires fsspec>=2023.9.0):

>>> import fsspec
>>> print(fsspec.filesystem("github", org="gpanders", repo="vim-medieval").glob("**/eval[-._ 0-9]*"))
['autoload/medieval.vim', 'doc/medieval.txt', 'ftplugin/markdown/medieval.vim']

**/ is converted to .* in AbstractFileSystem.glob, so the result contains filenames that don't start with eval, which is incorrect.
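To make the effect concrete, here is a minimal sketch of that kind of translation. The function `naive_glob_to_regex` and the two extra paths (`README.md`, `plugin/eval.vim`) are hypothetical and only for illustration; this is not fsspec's actual conversion code. It compares mapping `**/` to `.*` against a mapping that only allows whole leading directories.

```python
import re


def naive_glob_to_regex(pattern: str, recursive_prefix: str) -> str:
    """Hypothetical, simplified glob-to-regex translation (not fsspec's code).

    `recursive_prefix` is the regex substituted for every "**/".
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(recursive_prefix)
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")  # a single "*" never crosses a "/"
            i += 1
        elif pattern[i] == "[":
            j = pattern.index("]", i)
            out.append(pattern[i : j + 1])  # copy the character class as-is
            i = j + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "^" + "".join(out) + "$"


# The first three paths come from the reproducer; the last two are made up.
paths = [
    "autoload/medieval.vim",
    "doc/medieval.txt",
    "ftplugin/markdown/medieval.vim",
    "README.md",
    "plugin/eval.vim",
]
pattern = "**/eval[-._ 0-9]*"

# "**/" -> ".*" lets the wildcard swallow part of a filename ("medi" + "eval").
loose = re.compile(naive_glob_to_regex(pattern, ".*"))
# "**/" -> "(?:.*/)?" only matches zero or more complete directories.
strict = re.compile(naive_glob_to_regex(pattern, "(?:.*/)?"))

loose_matches = [p for p in paths if loose.match(p)]
strict_matches = [p for p in paths if strict.match(p)]
print(loose_matches)
print(strict_matches)
```

With the loose mapping, the three `medieval.*` paths match even though their basenames don't start with `eval`; with the strict mapping only `plugin/eval.vim` does.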

This bug can only be reproduced on filesystems that don't resolve the glob path internally. On the others, the path is prefixed with <base_path>/, so /** takes precedence over **/ in the "glob to regex" conversion, which stops **/ from being replaced by .*.

Maybe the most robust solution would be to align the implementation with glob.glob (which behaves as expected in this instance) to stop such inconsistencies from happening in the future. The non-POSIX behavior is error-prone, so dropping it should be fine, no? I'm happy to submit a PR if this sounds good to you (based on python/cpython#106703).
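For comparison, here is a small sketch of how the standard library's glob.glob treats the same pattern on a local directory tree. The layout is made up for illustration (it mirrors two paths from the reproducer and adds two hypothetical files whose names do start with eval).

```python
import glob
import os
import tempfile

# Hypothetical file layout, created in a throwaway temporary directory.
files = [
    "autoload/medieval.vim",
    "doc/medieval.txt",
    "eval.vim",
    "sub/eval-1.txt",
]

with tempfile.TemporaryDirectory() as tmp:
    for rel in files:
        full = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    # Escape the temp dir so only the part after it is treated as a glob.
    pattern = os.path.join(glob.escape(tmp), "**", "eval[-._ 0-9]*")
    found = glob.glob(pattern, recursive=True)

    # Normalize to sorted, "/"-separated paths relative to the temp dir.
    stdlib_matches = sorted(
        os.path.relpath(p, tmp).replace(os.sep, "/") for p in found
    )

print(stdlib_matches)
```

Here `**` only stands for zero or more whole directories, so the `medieval.*` files are not returned; only `eval.vim` and `sub/eval-1.txt` match.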
