DOC: from examples to tutorials #3564

rabernat opened this issue Nov 22, 2019 · 14 comments
Comments

@rabernat
Contributor

It's awesome to see the work we did at Scipy2019 finally hit the live docs! Thanks @keewis and @dcherian for pushing it through.

Now that we have these more detailed, realistic examples, let's think about how we can take our documentation to the next level. I think we need TUTORIALS. The examples are a good start. I think we can build on these to create tutorials which walk through most of xarray's core features with domain-specific datasets. We could have different tutorials for different fields. For example.

Each tutorial would cover the same core elements (loading data, indexing, aligning, grouping, computations, plotting, etc.), but using a familiar, real dataset, rather than the generic, made-up ones in our current docs.
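
As a rough illustration of what those core elements look like in code, here is a minimal sketch. It uses a made-up array (the variable name `sst`, the site labels, and the dates are all assumptions for illustration), which is exactly what a real tutorial would swap out for a familiar domain dataset. Loading and plotting are left out so the snippet runs without network access or a plotting backend.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Made-up data standing in for a real domain dataset: daily values at 2 sites.
time = pd.date_range("2000-01-01", periods=60, freq="D")
data = xr.DataArray(
    np.arange(120.0).reshape(60, 2),
    dims=("time", "site"),
    coords={"time": time, "site": ["north", "south"]},
    name="sst",
)

# Indexing: select by label rather than by integer position.
january_north = data.sel(site="north", time=slice("2000-01-01", "2000-01-31"))

# Aligning: arithmetic matches on coordinate labels, so only the
# overlapping time steps survive (inner join by default).
overlap = data.isel(time=slice(0, 40)) + data.isel(time=slice(30, 60))

# Grouping and computing: monthly means via the "time.month" virtual coordinate.
monthly_mean = data.groupby("time.month").mean()
```

A domain tutorial would follow the same sequence of steps, only with data and variable names its audience already recognises.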

Yes, this would be a lot of work, but I think it would have a huge impact. Just raising here for discussion.

xref #2980 #2378 #3131

@choldgraf

In case it's helpful for inspiration, we took a similar approach with the MNE-Python package (neuro electrophysiology package):

https://mne.tools/stable/index.html

Maybe there are at least 3 levels in there, actually:

  • Examples - short vignettes that highlight one very specific piece of functionality; keywords for the example should be ctrl-f-able in the title
  • Tutorials - in-depth guides through a common part of workflow that xarray wishes to enable, with more explanation and detail
  • Domain use-cases - examples of how xarray can facilitate use-cases in particular fields. Probably cover at a high-level many of the steps that multiple tutorials cover in-depth. More for "inspiration and buy-in" than in-depth learning.

Does that make sense?

@TomNicholas
Member

@rabernat I'm going to be making a simple plasma physics-oriented xarray tutorial to give at a workshop next week.

I was wondering - if we're uploading real data for these, how big can/should the files be? It might affect what dataset I use.

@keewis
Collaborator

keewis commented Dec 13, 2019

https://www.divio.com/blog/documentation/ might be a useful reference for this?

@rabernat
Contributor Author

> if we're uploading real data for these, how big can/should the files be? It might affect what dataset I use.

This is a good question. We need the tutorials to be able to run and build within a CI environment. That's the main constraint.

For larger datasets, rather than storing them in github, a good approach is to create an archive on https://zenodo.org/ from which the data can be pulled.

@pydata pydata deleted a comment from rabernat Dec 13, 2019
@TomNicholas
Member

> Maybe there are at least 3 levels in there, actually...

The article linked by @keewis is well worth reading in my opinion - it describes a similar breakdown of different types of documentation:

  • Tutorials - learning-oriented lessons to get newcomers started,
  • How-to guides - goal-oriented series of steps to solve a specific problem,
  • Explanation - understanding-oriented discussion providing background and context,
  • Reference - information-oriented description of technical machinery.

I think for xarray there is another type, like you suggest @choldgraf:

  • Domain use-cases (/inspiration/showing-off) - showcase-oriented examples of groups using xarray in anger to do something cool.

I personally think xarray in general has reference nailed, lots of good explanation, but is generally a bit weaker on tutorials and how-to guides, and doesn't have many examples of domain use-cases.


I have some ideas for how-to's (maybe these should all go in a separate issue?):

  • How to migrate from numpy to xarray - Huge numbers of numpy users need to be shown exactly what code should be replaced with what, and what they can then stop worrying about.
  • How to apply your own analysis functions - i.e. apply_ufunc how-to. The existing documentation on that is more along the lines of an explanation in my opinion, and I've certainly found apply_ufunc to have a steep learning curve.
  • How to organise domain-specific functionality - In-depth guide to various tricks you can pull with accessors, and when you might want to go beyond that. The documentation we have on that only shows a couple of possible approaches.
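
To make these three how-to ideas a bit more concrete, here is a small sketch touching each one in turn. It is only an illustration under assumptions: the data are synthetic, and the names `temps`, `peak_to_peak`, and the `demo` accessor namespace are hypothetical. The xarray calls used (`DataArray.mean`, `xr.apply_ufunc` with `input_core_dims`, and `xr.register_dataarray_accessor`) are part of xarray's public API.

```python
import numpy as np
import xarray as xr

# Synthetic data (an assumption for illustration): 4 time steps x 3 stations.
values = np.arange(12.0).reshape(4, 3)

# --- 1. numpy -> xarray: positional axes become named dimensions ---
numpy_mean = values.mean(axis=0)  # must remember that axis 0 is time

temps = xr.DataArray(
    values,
    dims=("time", "station"),
    coords={"station": ["a", "b", "c"]},
    name="temperature",
)
xarray_mean = temps.mean(dim="time")  # same numbers, no axis bookkeeping


# --- 2. apply_ufunc: wrap a plain numpy function ---
def peak_to_peak(arr):
    """Range (max - min) along the last axis of a numpy array."""
    return arr.max(axis=-1) - arr.min(axis=-1)


# input_core_dims moves "time" to the last axis before the call; the function
# consumes that axis, so the result keeps only the "station" dimension.
spread = xr.apply_ufunc(peak_to_peak, temps, input_core_dims=[["time"]])


# --- 3. accessor: group domain-specific helpers under one namespace ---
@xr.register_dataarray_accessor("demo")  # "demo" is a hypothetical name
class DemoAccessor:
    def __init__(self, data_array):
        self._da = data_array

    def anomaly(self, dim):
        """Deviation from the mean along `dim`."""
        return self._da - self._da.mean(dim=dim)


anomalies = temps.demo.anomaly("time")
```

Each part would need its own full how-to; the point here is only the before/after shape of the code in each case.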

> We need the tutorials to be able to run and build within a CI environment.

So @rabernat for small datasets what might be an appropriate max filesize? I literally have no idea. ~1MB?

> a good approach is to create an archive on https://zenodo.org/

I'll look into that.

@choldgraf

choldgraf commented Dec 13, 2019

> For larger datasets, rather than storing them in github, a good approach is to create an archive on zenodo.org from which the data can be pulled.

Another note from MNE - we have a "datasets" sub-module that knows how to pull a few datasets from various online repositories (and in different structures). These store in a local folder (by default, ~/mne_data I believe) and then they get fast-loaded after the first download. Many of the datasets are then stored in online repositories like OSF (https://osf.io/rxvq7/).

For datasets that aren't gigantic it's a pretty nice system. https://mne.tools/stable/overview/datasets_index.html?highlight=datasets
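
The download-once-then-reuse pattern described above can be sketched with the standard library alone. This is not MNE's implementation; `fetch_dataset`, the `~/tutorial_data` default folder, and the `downloader` parameter are hypothetical names chosen for illustration.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_dataset(name, url, cache_dir=None, downloader=urlretrieve):
    """Return a local path to `name`, downloading it only on first use."""
    # Hypothetical default location, loosely modelled on a per-user data folder.
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / "tutorial_data"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # Download to a temporary name first so an interrupted transfer
        # never leaves a truncated file that looks like a valid cache hit.
        partial = target.with_name(target.name + ".part")
        downloader(url, partial)
        partial.replace(target)
    return target
```

Passing the downloader in as a parameter keeps the caching logic testable without network access, which matters if the tutorials have to build in CI.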

@apkrelling
Contributor

Hello everyone, is this issue still relevant?
I could add a domain use-case for oceanography or meteorology, but it seems like that has already been done under

  • getting started -> examples -> ROMS Ocean Model Example
  • getting started -> examples -> Calculating Seasonal Averages from Time Series of Monthly Means

  1. So there's no need to work on domain use-cases for oceanography or meteorology, is that correct?

  2. Also, I'd be happy to contribute something about how to migrate from numpy to xarray, if that is still needed.

@dcherian
Contributor

dcherian commented Apr 2, 2021

Hi @apkrelling thanks for offering to help!

I think we can still add more domain-specific examples for meteorology and oceanography. @rabernat had some plans for this, maybe he can describe them.

> how to migrate from numpy to xarray, if that is still needed.

This would be totally great!

@dcherian
Contributor

We've started discussing how to reorganize the xarray-tutorial repository here: xarray-contrib/xarray-tutorial#53 . Comments are welcome!

@alimanfoo
Contributor

Hi folks,

Just to mention that we've created a short tutorial on xarray which is meant as a gentle intro for folks coming from the malaria genetics field, who mostly have never heard of xarray before. We illustrate xarray first using outputs from a geostatistical model of how insecticide-treated bednets are used in Africa. We then give a couple of brief examples of how we use xarray for genomic data. There are video walkthroughs in French and English:

https://anopheles-genomic-surveillance.github.io/workshop-5/module-1-xarray.html

Please feel free to link to this in the xarray tutorial site if you'd like to :)

@ddjustina

In case it's helpful for inspiration, we took a similar approach with the MNE-Python package (neuro electrophysiology package):

https://mne.tools/stable/index.html

Maybe there are at least 3 levels in there, actually:

* **Examples** - short vignettes that highlight one very specific piece of functionality, key-words for the example should be `ctrl-f`able in the title

* **Tutorials** - in-depth guides through a common part of workflow that xarray wishes to enable, with more explanation and detail

* **Domain use-cases** - examples of how xarray can facilitate use-cases in particular fields. Probably cover at a high-level many of the steps that multiple tutorials cover in-depth. More for "inspiration and buy-in" than in-depth learning.

Does that make sense?

@choldgraf seems like this page is down (https://predictablynoisy.com/xarray-explore-ieeg). Are these examples available elsewhere?

@choldgraf

Oops, I think the URL just changed:

https://chrisholdgraf.com/blog/2019/2019-10-22-xarray-neuro/
