Support cache mapper that is basename plus fixed number of parent directories #1318
I've added type annotations to the lines I have changed so now self._strip_protocol = _strip_protocol line. I've added a |
There is no policy, typing is very limited in fsspec.
So long as it isn't a glob re special, I don't mind. I think re only uses it within |
I've changed it to |
To get this working on Windows I've used an explicit |
Everything is resolved here, except I leave it up to you whether to remove the eq/hash stuff now or not.
OK, let's hold off the eq/hash changes to a separate PR as I need to think about it a bit more.
This adds support for a cache mapper that is a fixed number of parent directories as well as the basename. This will be useful when using multiple directories containing files with the same names but you still want the cached filenames to be clearly identifiable rather than using hashes.
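As a rough illustration of the mapping described above (a sketch, not the code in this PR): the helper name `cached_name`, its signature, and the way it splits paths are assumptions made for this example, and the `_^_` separator is only the placeholder mentioned in the design notes.

```python
def cached_name(path: str, directory_levels: int = 0, separator: str = "_^_") -> str:
    """Hypothetical helper: flatten a path into a single cache filename.

    Keeps the basename plus up to `directory_levels` parent directories,
    joined with `separator` so the result contains no path separators.
    """
    if directory_levels < 0:
        raise ValueError("directory_levels must be non-negative")
    # Treat backslashes as separators too, then drop empty components
    # (for example the one produced by a leading "/").
    parts = [p for p in path.replace("\\", "/").split("/") if p]
    # Basename plus at most `directory_levels` parents.
    kept = parts[-(directory_levels + 1):]
    return separator.join(kept)


# Two files with the same basename in different directories stay distinct
# once one parent directory level is included.
print(cached_name("/bucket/a/data.csv", directory_levels=1))
print(cached_name("/bucket/b/data.csv", directory_levels=1))
```

With `directory_levels=0` this reduces to the plain basename, which is why files from different directories would collide without the extra levels.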
I have chosen to add a new `directory_levels` attribute to `BasenameCacheMapper` rather than create a new class. Example use:
I have made some design decisions for this that are up for discussion:
- Keeping the `same_names` kwarg to `CachingFileSystem.__init__` for backward compatibility as well as adding the new `cache_mapper` kwarg. You cannot specify both. This seemed a better approach than the alternative of allowing values for `same_names` other than boolean, and it allows us to easily support other `CacheMapper` classes in the future.
- `_^_` so far but not tied to this.
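The rule that `same_names` and `cache_mapper` cannot both be specified could be enforced along these lines. This is a minimal sketch under stated assumptions: `resolve_cache_mapper`, `HashMapper` and `BasenameMapper` are hypothetical stand-ins invented for the example, not the names or logic used in fsspec.

```python
class HashMapper:
    """Hypothetical stand-in for a mapper that hashes the full path."""


class BasenameMapper:
    """Hypothetical stand-in for a mapper that keeps readable names."""


def resolve_cache_mapper(same_names=None, cache_mapper=None):
    """Pick a mapper from the legacy boolean or the new explicit kwarg."""
    if same_names is not None and cache_mapper is not None:
        # The two kwargs express the same choice, so reject ambiguity.
        raise ValueError("Cannot specify both same_names and cache_mapper")
    if cache_mapper is not None:
        # New-style: the caller passes any mapper object directly.
        return cache_mapper
    # Legacy: the boolean selects between the two built-in behaviours;
    # leaving it unset behaves like same_names=False in this sketch.
    return BasenameMapper() if same_names else HashMapper()
```

Keeping the boolean strictly boolean means new mapper types only need to be passed through `cache_mapper`, without overloading the meaning of `same_names`.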