Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fs multiget #606

Merged
merged 11 commits into from
Sep 28, 2020
Merged

Fs multiget #606

merged 11 commits into from
Sep 28, 2020

Conversation

martindurant
Copy link
Member

@martindurant martindurant commented Sep 16, 2020

For async IO.

Read-only for now, and needs fsspec master

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@martindurant martindurant marked this pull request as ready for review September 18, 2020 16:37
@martindurant
Copy link
Member Author

Thoughts?

@@ -18,6 +18,7 @@ pytest-cov==2.7.1
pytest-doctestplus==0.4.0
pytest-remotedata==0.3.2
h5py==2.10.0
s3fs==0.5.0; python_version > '3.6'
s3fs==0.5.1; python_version > '3.6'
git+https://github.com/intake/filesystem_spec; python_version > '3.6'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a package on PyPI or conda-forge that we could install?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet, no.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a rough sense when these might be created?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When everything is good here, I can be sure to make a release before merging, so that this line can be reverted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good @martindurant.

@alimanfoo
Copy link
Member

We're just reviewing this now in the zarr v3 dev call. Looks good to me :-)

@Carreau Carreau added this to the v2.5 milestone Sep 23, 2020
@alimanfoo alimanfoo mentioned this pull request Sep 24, 2020
@martindurant
Copy link
Member Author

@alimanfoo , can I get your +1 on this?

@martindurant
Copy link
Member Author

@jakirkham @joshmoore , any more thoughts?
I'd like to merge. Since this only affects fsspec storage, which is not used by anyone in the wold yet, I think this is safe.

@martindurant martindurant merged commit 610db34 into zarr-developers:master Sep 28, 2020
@martindurant martindurant deleted the fs_multiget branch September 28, 2020 14:06
@rabernat
Copy link
Contributor

rabernat commented Oct 2, 2020

Just catching up on this PR. Fantastic work @martindurant!

Are we planning to eventually do multiset as well?

@martindurant
Copy link
Member Author

Yes, of course. I have local code that is partway there.

@jakirkham
Copy link
Member

Sorry if this was already answered and I just missed it, was there an fsspec release? Asking as we are planning to do a Zarr release soonish.

cc @Carreau

@martindurant
Copy link
Member Author

Yes, the (test) requirement that went in with this PR is for the latest release fsspec. I don't know if we want to say anything about this functionality's dependency in the release notes.

@jakirkham
Copy link
Member

Good point. If there is some wording you'd like to see in the release notes, would suggest proposing it in PR ( #616 )

@martindurant
Copy link
Member Author

Hm, not sure whether to say use fsspec>=0.8.0 or to fallback to sequential

@rabernat rabernat mentioned this pull request Oct 3, 2020
@cgohlke
Copy link
Contributor

cgohlke commented Oct 7, 2020

Is the getitems method documented? Sorry if I missed something obvious, but I could not find it. I try to implement getitems for a TIFF store.

@martindurant
Copy link
Member Author

@cgohlke
Copy link
Contributor

cgohlke commented Oct 7, 2020

Thank you. I don't use fsspec but try to implement getitems independently. My understanding is that the following should work? Somehow it returns garbage

    def getitems(self, keys):
        return {key: self.__getitem__(key) for key in keys}

@martindurant
Copy link
Member Author

That looks OK to me, but that's what the current implementation is effectively doing anyway, with the loop outside of the storage class.

@jakirkham
Copy link
Member

I think we would still be open to a PR if someone wants to implement that directly in Zarr.

@martindurant
Copy link
Member Author

It would may be sensible to make a fallback implementation of getitems in the storage class, doing the same loop currently higher up; but that would require a strict class hierarchy and not actually make anything simpler overall.

@cgohlke
Copy link
Contributor

cgohlke commented Oct 7, 2020

False alarm, return {key: self.__getitem__(key) for key in keys} works as expected. I had forgotten to comment out a yield statement in the function that was left from previous tries to implement getitems as a generator...

@jakirkham
Copy link
Member

It would may be sensible to make a fallback implementation of getitems in the storage class, doing the same loop currently higher up; but that would require a strict class hierarchy and not actually make anything simpler overall.

I don't think that is strictly true. We already implement functions that have fallbacks (like rmdir) for storage classes that don't have them. So we could presumably do the same thing with getitems. The main benefit would be other functions in Group and Array could rely on getitems instead of getitem more generally.

That said, I think having a class hierarchy would be a nice improvement as long as we still have a way to coerce generic MutableMapping-based objects into something this hierarchy can work with (perhaps through a wrapper class). This mostly just reiterates what has come up in a few meeting now though.

@rabernat rabernat mentioned this pull request Nov 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants