-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
With DataQuery loading, in the face of ambiguity, refuse the temptation to guess #1721
Comments
This is difficult for me. While I've always disliked "magic" decisions in software, this one felt like a good thing when we first started doing it. You ask for MODIS Band 2 and you'll always get the highest spatial resolution version of band 2. Depending on what files you provide that could be 500m, could be 1km, could be 2km. With wavelength it is difficult, especially for SEVIRI, where HRV is the higher resolution but maybe has its own complications with geolocation and how users can use it. For example, using HRV instead of VIS008 when loading a composite that depends on 0.8um means you probably have to do some real resampling (not I like the idea of people being able to configure the behavior, but fear that it will end up with issues just like this one. We intend for behavior configuration to make things easier but now people are getting the "wrong" composite combination of products because the behavior picked a "worse" band when the composite author assumed the "better" band would be chosen. With that in mind, it kind of makes me think we should get rid of generic ...at least that's how I feel this instant. |
The complication with HRV isn't the resampling, but the completely different spectral behaviour so the products will not be even similar to each other if VIS008 is replaced with HRV. To lesser extent the different geographical coverage can also cause surprises.
In operations I've always specified the composites in instrument-specific YAML files using the channel names to have as little ambiguity as possible. For a non-operational user this might not be the easiest route as they would need to map the wavelengths to the channel names. But if it saves them from these surprises, I'd say it's a good tradeoff, even if the composites would be effectively duplicated for each sensor in use. |
I agree that ambiguity is not good, and also that all the builtin composites should use band names to avoid confusion for the known instruments. Is this niceness sufficient to introduce a possibly confusing guessing mechanism? I don't know. To be continued... |
If an instrument has a channel with a higher spatial resolution and a broader spectral band, and another channel with a lower spatial resolution but a narrower spectral band, then I don't know what users prefer or what should be used in an RGB that specifies only wavelength. An example is HRV, which is what triggered this discussion. There may be other instruments sharing this property. |
I agree. But I was also thinking about MODIS with multiple resolutions for the same channel. |
Feature Request
Is your feature request related to a problem? Please describe.
When a DataQuery matches multiple DataIDs, and I use the DataQuery to load a dataset, only one dataset is loaded. When multiple matching datasets are loaded and the DataQuery matches multiple matching datasets, selecting a key will guess which one the user means. For example:
Running this gives:
Satpy is guessing what I want. This violates the Zen of Python:
Describe the solution you'd like
I would like to force users to explicitly describe rules on what to do in face of ambiguity. There should be a global configuration that will be referred to whenever a user request is ambiguous. In this configuration, the user can configure preferences rules for resolution, wavelength, polarisation, or other keys. If the request is still ambiguous with this configuration, satpy should raise an exception.
The design of such a preferences configuration requires careful thought.
Describe any changes to existing user workflow
Perhaps the global configuration could default to the current behaviour, but considering recent discussions, I'm not sure if users even agree on what they would expect if loading 0.8 µm in SEVIRI, so maybe we should really force the user to be explicit (either in configuration or in call). The latter would break backward compatibility.
Additional context
Inspired by a discussion on slack.
The text was updated successfully, but these errors were encountered: