Reads multiple zarr files #80
…`xarray.open_mfdataset` is implemented
…for reading multiple files
…nly one file to open, untouched
…en opening zarr file
…eading multiple files in parallel
…open_dataset will now open zarr files
@Mikejmnez: have you seen pydata/xarray#4461? It solved the same sort of problem, but at a higher level and with less code. We can definitely include this, at least until the xarray path is concrete. Datashape is no longer referenced in the intake main package, since it was always …
@martindurant I haven't seen it, thanks for pointing it out to me. Just to understand: with the new (high-level) implementations, should the URL be passed as is into …
Exactly, xarray should handle all the cases via `open_dataset` or `open_mfdataset`. I'm not sure that there is a plan to detect that a URL is glob-like, though.
Closes #70
There were a couple of permutations with the newly added capabilities in xarray. For now, this added code makes it possible for intake to:

When a single zarr file needs to be read, it still uses `xr.open_zarr`, as before, with the `get_mapper` option for the `url_path`. Another option is to use `xr.open_dataset` with `fsspec.open_local`, as is done in the `netcdf` case. I don't know enough to justify choosing one over the other (`xr.open_zarr` is no longer being deprecated), so I decided to keep using `xr.open_zarr` for simplicity (no code change there).

When there are multiple zarr files, a glob path like `/directoryA/subdir*/subsub*/*` can be passed directly to `xr.open_mfdataset`. In that case, the code makes sure that `engine='zarr'` is passed as an argument.

In both cases, `chunks='auto'` is set as the default.
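The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `plan_zarr_open` and `open_zarr_source`, the `"*" in urlpath` glob check, and the example path `/data/single.zarr` are all assumptions made for the sketch. It also assumes an xarray version whose `open_mfdataset` accepts `engine='zarr'`.

```python
def plan_zarr_open(urlpath, **kwargs):
    """Hypothetical helper: pick the xarray opener and keyword arguments."""
    kwargs = dict(kwargs)
    # Default chunking in both the single-file and multi-file case.
    kwargs.setdefault("chunks", "auto")
    # Assumption: a list of paths or a "*" in the path means multiple stores.
    is_multi = isinstance(urlpath, (list, tuple)) or "*" in urlpath
    if is_multi:
        # Multiple zarr stores: make sure the zarr engine is requested.
        kwargs["engine"] = "zarr"
        return "open_mfdataset", kwargs
    # Single zarr store: keep using xr.open_zarr, as before.
    return "open_zarr", kwargs


def open_zarr_source(urlpath, storage_options=None, **kwargs):
    """Hypothetical wrapper that opens the dataset according to the plan."""
    # Imported lazily so the planning helper can be used without xarray.
    import fsspec
    import xarray as xr

    opener, kwargs = plan_zarr_open(urlpath, **kwargs)
    if opener == "open_mfdataset":
        # The glob path is handed to xarray unchanged.
        return xr.open_mfdataset(urlpath, **kwargs)
    # Single store: wrap the URL in an fsspec mapper first.
    mapper = fsspec.get_mapper(urlpath, **(storage_options or {}))
    return xr.open_zarr(mapper, **kwargs)
```

With this sketch, `plan_zarr_open("/directoryA/subdir*/subsub*/*")` selects `open_mfdataset` with `engine='zarr'`, while a plain path such as `/data/single.zarr` selects `open_zarr`. A user-supplied `chunks` value overrides the `'auto'` default in either case.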