Add comment on compat="override" needed for sea ice variable in Intake Tutorial #460

Open
adele-morrison opened this issue Sep 19, 2024 · 10 comments

Comments

@adele-morrison
Collaborator

Apparently compat="override" is needed when opening sea ice data using Intake to deal with CICE coordinates better. It would be useful to add a comment into ACCESS-NRI_Intake_Catalog.ipynb to explain this.
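
As a rough sketch of the kind of call such a comment would accompany, assuming the catalog entry is an intake-esm datastore whose `to_dask` forwards `xarray_combine_by_coords_kwargs` to xarray. The helper name `open_sea_ice_variable` is hypothetical, and pairing `compat="override"` with `data_vars="minimal"` and `coords="minimal"` is an assumption, not something stated in this issue:

```python
# Hypothetical helper: only compat="override" comes from this issue.
# The other two settings are an assumed, commonly used pairing.
SEA_ICE_COMBINE_KWARGS = {
    "compat": "override",
    "data_vars": "minimal",
    "coords": "minimal",
}


def open_sea_ice_variable(datastore, variable, extra_combine_kwargs=None):
    """Search an intake-esm style datastore and open one sea ice variable.

    `datastore` is assumed to expose `.search(...)` and `.to_dask(...)`.
    """
    combine_kwargs = dict(SEA_ICE_COMBINE_KWARGS)
    # Let the caller add to or replace the assumed defaults.
    combine_kwargs.update(extra_combine_kwargs or {})
    return datastore.search(variable=variable).to_dask(
        xarray_combine_by_coords_kwargs=combine_kwargs
    )
```

With a real catalog this might be called as `open_sea_ice_variable(catalog["<experiment>"], "<variable>")`, where both names are placeholders.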

@navidcy
Collaborator

navidcy commented Sep 19, 2024

@anton-seaice could you help with that?

@anton-seaice
Collaborator

@adele-morrison
Collaborator Author

Yeah I guess I was thinking under that section (or elsewhere), we could explicitly mention that this should (always?) be used for sea ice variables.

@Thomas-Moore-Creative
Collaborator

@adele-morrison , @anton-seaice , @navidcy et al.

Collectively we're understanding the importance of problem-specific or dataset-specific xarray_kwarg settings. Can we crowdsource best practice by adding a single "kitchen sink" json file for all tweaks and settings?

Is a "good enough" next step:

  • including and documenting all the kwarg options inside recipe code examples, with defaults of None
  • having some master COSIMA settings json file that stores recommended kwarg settings based on keys like model name, etc?
  • including querying this json file in all example code workflows?

Nothing fancy but could be maintained by the community?

Thoughts on this?
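
A minimal sketch of what that lookup might look like, assuming a JSON file keyed by a short name such as a model component. The file contents, the `"cice"` and `"default"` keys, and the function `recommended_kwargs` are all hypothetical; the `null` entries stand in for the "defaults of None" idea:

```python
import json

# Hypothetical contents of a community-maintained settings file.
SETTINGS_JSON = """
{
  "default": {
    "xarray_open_kwargs": null,
    "xarray_combine_by_coords_kwargs": null
  },
  "cice": {
    "xarray_open_kwargs": null,
    "xarray_combine_by_coords_kwargs": {"compat": "override"}
  }
}
"""


def recommended_kwargs(settings, key):
    """Return a copy of the kwargs recommended for `key`, else the defaults."""
    entry = settings.get(key, settings["default"])
    # Copy nested dicts so callers cannot mutate the shared settings.
    return {
        name: (dict(value) if value is not None else None)
        for name, value in entry.items()
    }


settings = json.loads(SETTINGS_JSON)
print(recommended_kwargs(settings, "cice"))
print(recommended_kwargs(settings, "some-other-model"))
```

An example workflow could then pass the returned dictionary on to whatever opens the data, so the notebook code stays the same while the recommendations live in one file.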

@navidcy
Collaborator

navidcy commented Oct 17, 2024

It sounds good, but tbh I don't understand what it involves, how it would work, how fragile it would be, and what it would require from the users' side.

@navidcy
Collaborator

navidcy commented Oct 17, 2024

There is some coverage at

https://cosima-recipes.readthedocs.io/en/latest/Tutorials/ACCESS-NRI_Intake_Catalog.html#1.-Speeding-up-opening-your-datasets

Do we want it more explicit?

@anton-seaice at the moment this is under the section "Speeding up" which doesn't sound imperative for users to do. But @adele-morrison is implying that this has to be done. If that's so, then let's write it explicitly somewhere else also?

Have I understood correctly?

(Even if adding compat="override" is not something that "has to be done" but does help 99.99% of the time, it's good to state this explicitly to users as a "rule" so that they struggle less.)

@anton-seaice
Collaborator

Yes correct.

It might be good to encourage teaching of principles rather than rules, as we all look at data that isn't ACCESS-OM2. It's not totally risk-free to always use these keywords: if the data you are loading is not well curated, they can stop xarray doing checks that would give a useful warning. Also, hopefully this problem will get handled better in CICE6/OM3 output.
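
To make that trade-off concrete, here is a toy, pure-Python stand-in (not xarray itself) for what skipping the comparison means. In xarray, `compat="override"` skips comparing variables and picks them from the first dataset; the function `combine_coordinate` and the sample `TLON` values below are invented for illustration:

```python
def combine_coordinate(files, name, compat="equals"):
    """Pick one copy of coordinate `name` from a list of per-file dicts."""
    first = files[0][name]
    if compat == "override":
        # No comparison: the first file wins, even if later files disagree.
        return first
    for other in files[1:]:
        if other[name] != first:
            raise ValueError(f"conflicting values for coordinate {name!r}")
    return first


# Invented sample data: the second file of the badly curated set disagrees.
well_curated = [{"TLON": [0.5, 1.5]}, {"TLON": [0.5, 1.5]}]
badly_curated = [{"TLON": [0.5, 1.5]}, {"TLON": [0.5, 9.9]}]

# With "override" the mismatch goes unnoticed.
print(combine_coordinate(badly_curated, "TLON", compat="override"))

# With the comparison enabled the mismatch is reported.
try:
    combine_coordinate(badly_curated, "TLON")
except ValueError as err:
    print("check caught:", err)
```

For well-curated output both modes give the same answer and the override is simply faster; the risk only shows up when the files genuinely disagree.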

@navidcy
Collaborator

navidcy commented Oct 17, 2024

True true!

@Thomas-Moore-Creative
Collaborator

It might be good to encourage teaching of principles rather than rules

Can it be both? "teaching principles" in tutorial notebooks but abstracting the details out of the way in well documented functions that take "rule" based settings from curated, community-built config files?

some_random_config.yaml (which does not directly address the compat="override" kwarg):

catalog_search_query_dict:
  ACCESS_ESM15:
    all_ocean:
      realm: ['ocean','ocnBgchem']
      source_id: 'ACCESS-ESM1-5'
    MY_PROJECT:
      experiment_id: ['historical','piControl','ssp126','ssp370','ssp585']
      source_id: 'ACCESS-ESM1-5'
      variable_id: ['intpp','thetao']
      realm: ['ocean','ocnBgchem']
      frequency: 'mon'
      file_type: 'l'
chunking:
  ACCESS_ESM15_2D: #{'chunks':{'member':1,'time':220,'j':300,'i':360}}
    chunks:
      member: 1
      time: 220
      i: 360
      j: 300
  ACCESS_ESM15_3D: #{'chunks':{'member':?,'time':?,'lev':-1,'j':-1,'i':-1}}
    chunks:
      member: 1
      time: 12
      lev: -1
      i: -1
      j: -1
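
A small sketch of how a workflow might read such a config. To stay free of a YAML parser, it uses a trimmed copy of the structure above written as a Python dict; the helper `lookup` is hypothetical:

```python
# Trimmed copy of the config above, as it would look once parsed.
config = {
    "catalog_search_query_dict": {
        "ACCESS_ESM15": {
            "MY_PROJECT": {
                "experiment_id": ["historical", "piControl", "ssp126", "ssp370", "ssp585"],
                "source_id": "ACCESS-ESM1-5",
                "variable_id": ["intpp", "thetao"],
                "realm": ["ocean", "ocnBgchem"],
                "frequency": "mon",
                "file_type": "l",
            },
        },
    },
    "chunking": {
        "ACCESS_ESM15_2D": {"chunks": {"member": 1, "time": 220, "i": 360, "j": 300}},
    },
}


def lookup(config, *keys):
    """Walk nested keys and fail with the full path if one is missing."""
    node = config
    for depth, key in enumerate(keys):
        if key not in node:
            raise KeyError("missing config entry: " + " > ".join(keys[: depth + 1]))
        node = node[key]
    return node


query = lookup(config, "catalog_search_query_dict", "ACCESS_ESM15", "MY_PROJECT")
chunks = lookup(config, "chunking", "ACCESS_ESM15_2D", "chunks")
print(query["variable_id"], chunks)
```

Assuming the catalog has columns matching those keys, `query` could be unpacked into a search as `datastore.search(**query)`, and `chunks` passed along in the keyword arguments used to open the files.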

@Thomas-Moore-Creative
Collaborator

It sounds good, but tbh I don't understand what it involves, how it would work, how fragile it would be, and what it would require from the users' side.

I'm wondering out loud here partly to have others tell me I'm pointed in the wrong direction (or not).

IMO this could involve:

  • tutorial notebooks that teach the principles, expose / explain all the kwargs, and document new utility functions as an appendix.
  • combine these functions in a package (perhaps in a COSIMA or ACCESS-NRI repo?).
  • the functions abstract away the detail and take keys as inputs.
  • keys load settings from a single YAML config file (or a set of them), community curated and managed by more expert folks.
  • option for developing more complex heuristics for automatic settings.

Users could choose to:

  • continue to adapt tutorial code, cell-by-cell and cut & paste.
  • use the functions in their own code and notebooks.
  • get skilled up and join in maintaining the package.
