dataset.from_files always returns empty #1896

@Peter9192

Describe the bug
I tried out the new dataset facet search functionality following the example in https://github.com/ESMValGroup/ESMValCore/blob/main/notebooks/discovering-data.ipynb. However, I never get any results, neither with the same query as in the notebook nor with CMIP5 instead of CMIP6.

Upon further investigation, it looks like the problem, for CMIP5 at least, lies in the filtering out of identical facet sets. In my case, it finds 2 local files on my laptop, for which

facets = dict(file.facets)

returns an empty dict. There seem to be some files on ESGF that do not have a complete facetset either.

Therefore, the same() checker:

def same(facets_a, facets_b):
    """Define when two sets of facets are the same."""
    return facets_a.issubset(facets_b) or facets_b.issubset(facets_a)

will always see the empty set as a subset of every other set, and an incomplete set as a subset of any set that extends it. As a result, all files are filtered out.
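
To illustrate, here is a minimal standalone sketch of the behaviour. The facet values are made up for the example, and I am assuming the facets are compared as sets of (key, value) pairs; the actual representation in ESMValCore may differ.

```python
def same(facets_a, facets_b):
    """Define when two sets of facets are the same."""
    return facets_a.issubset(facets_b) or facets_b.issubset(facets_a)


# Hypothetical facet sets, stored as sets of (key, value) pairs.
complete = frozenset(
    {("project", "CMIP5"), ("dataset", "CanESM2"), ("ensemble", "r1i1p1")}
)
other = frozenset(
    {("project", "CMIP5"), ("dataset", "HadGEM2-ES"), ("ensemble", "r1i1p1")}
)
empty = frozenset()  # what a file without facets would contribute

# Two different, complete facet sets are correctly seen as distinct.
print(same(complete, other))

# The empty set is a subset of everything, so it "matches" both of them.
print(same(empty, complete))
print(same(empty, other))
```

So as soon as a single file with empty facets is in the list, every other facet set is considered a duplicate of it.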

Changing the body of same() above to just facets_a.issubset(facets_b) fixes the issue for me, but now it also returns incomplete facet sets that still contain wildcards. Can we somehow require that a facet set be complete?
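
One possible way to express such a requirement, as a rough sketch only: the helper is_complete and the required_keys argument are hypothetical names of mine, not part of ESMValCore, and I am assuming that facets are available as a plain dict and that wildcards show up as "*" in the values.

```python
def is_complete(facets, required_keys):
    """Return True if every required key is present with a wildcard-free value.

    Hypothetical helper for illustration; not an ESMValCore function.
    """
    for key in required_keys:
        if key not in facets:
            return False
        value = facets[key]
        # Treat single values and lists of values the same way.
        values = value if isinstance(value, (list, tuple)) else [value]
        if not values:
            return False
        if any("*" in str(item) for item in values):
            return False
    return True


# Made-up candidates: one empty, one with a wildcard left, one complete.
candidates = [
    {},
    {"project": "CMIP5", "dataset": "*", "short_name": "tas"},
    {"project": "CMIP5", "dataset": "CanESM2", "short_name": "tas"},
]
required = ("project", "dataset", "short_name")

# Drop empty and incomplete facet sets before any duplicate filtering.
kept = [facets for facets in candidates if is_complete(facets, required)]
print(kept)
```

Filtering like this before the duplicate check would keep the empty and wildcard facet sets from ever reaching same(), but which keys count as required per project is something I do not know.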
