DOC: from examples to tutorials #3564

rabernat · 2019-11-22T17:30:14Z

It's awesome to see the work we did at Scipy2019 finally hit the live docs! Thanks @keewis and @dcherian for pushing it through.

Now that we have these more detailed, realistic examples, let's think about how we can take our documentation to the next level. I think we need TUTORIALS. The examples are a good start. I think we can build on these to create tutorials that walk through most of xarray's core features with domain-specific datasets. We could have different tutorials for different fields. For example:

Xarray tutorial for meteorology / atmospheric science
Xarray tutorial for oceanography
Xarray tutorial for physics (whatever @fujiisoup and @TomNicholas do! 😉 )
Xarray tutorial for finance (whatever @max-sixty and @crusaderky do! 😉)
Xarray tutorial for neuroscience (see nice example from @choldgraf: https://predictablynoisy.com/xarray-explore-ieeg)

Each tutorial would cover the same core elements (loading data, indexing, aligning, grouping, computations, plotting, etc.), but using a familiar, real dataset, rather than the generic, made-up ones in our current docs.

Yes, this would be a lot of work, but I think it would have a huge impact. Just raising here for discussion.

xref #2980 #2378 #3131

choldgraf · 2019-11-22T18:04:50Z

In case it's helpful for inspiration, we took a similar approach with the MNE-Python package (neuro electrophysiology package):

https://mne.tools/stable/index.html

Maybe there are at least 3 levels in there, actually:

Examples - short vignettes that highlight one very specific piece of functionality, key-words for the example should be ctrl-fable in the title
Tutorials - in-depth guides through a common part of workflow that xarray wishes to enable, with more explanation and detail
Domain use-cases - examples of how xarray can facilitate use-cases in particular fields. Probably cover at a high-level many of the steps that multiple tutorials cover in-depth. More for "inspiration and buy-in" than in-depth learning.

Does that make sense?

TomNicholas · 2019-12-03T15:48:05Z

@rabernat I'm going to be making a simple plasma physics-oriented xarray tutorial to give at a workshop next week.

I was wondering - if we're uploading real data for these, how big can/should the files be? It might affect what dataset I use.

keewis · 2019-12-13T16:21:37Z

https://www.divio.com/blog/documentation/ might be a useful reference for this?

rabernat · 2019-12-13T16:50:45Z

> if we're uploading real data for these, how big can/should the files be? It might affect what dataset I use.

This is a good question. We need the tutorials to be able to run and build within a CI environment. That's the main constraint.

For larger datasets, rather than storing them in github, a good approach is to create an archive on https://zenodo.org/ from which the data can be pulled.

TomNicholas · 2019-12-13T17:50:15Z

> Maybe there are at least 3 levels in there, actually...

The article linked by @keewis is well worth reading in my opinion - it describes a similar breakdown of different types of documentation:

Tutorials - learning-oriented lessons to get newcomers started,
How-to guides - goal-oriented series of steps to solve a specific problem,
Explanation - understanding-oriented discussion providing background and context,
Reference - information-oriented description of technical machinery.

I think for xarray there is another type, like you suggest @choldgraf:

Domain use-cases (/inspiration/showing-off) - showcase-oriented examples of groups using xarray in anger to do something cool.

I personally think xarray in general has reference nailed, lots of good explanation, but is generally a bit weaker on tutorials and how-to guides, and doesn't have many examples of domain use-cases.

I have some ideas for how-to's (maybe these should all go in a separate issue?):

How to migrate from numpy to xarray - Huge numbers of numpy users need to be shown exactly what code should be replaced with what, and what they can then stop worrying about.
How to apply your own analysis functions - i.e. apply_ufunc how-to. The existing documentation on that is more along the lines of an explanation in my opinion, and I've certainly found apply_ufunc to have a steep learning curve.
How to organise domain-specific functionality - In-depth guide to various tricks you can pull with accessors, and when you might want to go beyond that. The documentation we have on that only shows a couple of possible approaches.
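To make the three how-to ideas above a bit more concrete, here is a minimal sketch of what each one might start from. It is only an illustration, not proposed documentation: the data is synthetic, and the names `peak_to_peak`, `demo`, and `DemoAccessor` are made up for this example.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in data: 4 time steps at 3 stations (values are made up).
temps = np.arange(12.0).reshape(4, 3)

# 1. Migrating from numpy: with a bare array the caller has to remember
#    that axis 0 means "time".
numpy_mean = temps.mean(axis=0)

# The same array with named dimensions and coordinate labels.
da = xr.DataArray(
    temps,
    dims=("time", "station"),
    coords={"time": [0, 1, 2, 3], "station": ["a", "b", "c"]},
    name="temperature",
)
xarray_mean = da.mean(dim="time")  # no axis bookkeeping needed


# 2. Applying your own analysis function with apply_ufunc.
def peak_to_peak(values, axis=-1):
    """Plain NumPy function: range of the values along one axis."""
    return values.max(axis=axis) - values.min(axis=axis)


spread = xr.apply_ufunc(
    peak_to_peak,
    da,
    input_core_dims=[["time"]],  # core dims are moved to the last axis
    kwargs={"axis": -1},         # so the function reduces over that axis
)


# 3. Organising domain-specific functionality with an accessor.
@xr.register_dataarray_accessor("demo")  # "demo" is a hypothetical name
class DemoAccessor:
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def anomaly(self, dim):
        """Deviation from the mean along `dim`."""
        return self._obj - self._obj.mean(dim=dim)


anomalies = da.demo.anomaly("time")
```

In this sketch `spread` ends up with only the `station` dimension, because the `time` core dimension is consumed by the function. A real how-to would of course need to cover the harder cases (output core dims, dask, vectorize) that make apply_ufunc hard to learn.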

> We need the tutorials to be able to run and build within a CI environment.

So @rabernat for small datasets what might be an appropriate max filesize? I literally have no idea. ~1MB?

> a good approach is to create an archive on https://zenodo.org/

I'll look into that.

choldgraf · 2019-12-13T20:57:01Z

> For larger datasets, rather than storing them in github, a good approach is to create an archive on zenodo.org from which the data can be pulled.

Another note from MNE - we have a "datasets" sub-module that knows how to pull a few datasets from various online repositories (and in different structures). These are stored in a local folder (by default, ~/mne_data I believe) and then they get fast-loaded after the first download. Many of the datasets are themselves hosted in online repositories like OSF (https://osf.io/rxvq7/).

For datasets that aren't gigantic it's a pretty nice system. https://mne.tools/stable/overview/datasets_index.html?highlight=datasets
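For illustration, the download-once-then-load-locally pattern described above could be sketched roughly like this. This is not MNE's actual implementation; the function name `fetch_cached` and the default folder `~/demo_data` are assumptions made up for the example, and it uses only the Python standard library.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_cached(url, filename, cache_dir=None):
    """Download `url` once and reuse the local copy on later calls.

    Hypothetical helper: the default cache folder (~/demo_data) is an
    assumption for this sketch, not a convention of any real package.
    """
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / "demo_data"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename

    if not target.exists():
        # Download to a temporary name first so that an interrupted
        # transfer never leaves a half-written file in the cache.
        partial = target.with_name(target.name + ".part")
        with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        partial.replace(target)

    return target
```

A tutorial would then call something like `fetch_cached(archive_url, "example.nc")`, where `archive_url` points at a file in a Zenodo or OSF archive, and open the returned path. Only the first call touches the network, which also keeps repeated CI builds cheap if the cache folder is preserved between runs.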

apkrelling · 2021-04-01T22:33:57Z

Hello everyone, is this issue still relevant?
I could add a domain use-case for oceanography or meteorology, but it seems like that has already been done under

getting started -> examples -> ROMS Ocean Model Example
getting started -> examples -> Calculating Seasonal Averages from Time Series of Monthly Means

So there's no need to work on domain use-cases for oceanography or meteorology, is that correct?
Also, I'd be happy to contribute something on how to migrate from numpy to xarray, if that is still needed.

dcherian · 2021-04-02T19:05:59Z

Hi @apkrelling, thanks for offering to help!

I think we can still add more domain-specific examples for meteorology and oceanography. @rabernat had some plans for this, maybe he can describe them.

> how to migrate from numpy to xarray, if that is still needed.

This would be totally great!

dcherian · 2022-04-26T15:39:16Z

We've started discussing how to reorganize the xarray-tutorial repository here: xarray-contrib/xarray-tutorial#53 . Comments are welcome!

alimanfoo · 2022-07-20T09:44:40Z

Hi folks,

Just to mention that we've created a short tutorial on xarray which is meant as a gentle intro for folks coming from the malaria genetics field, most of whom have never heard of xarray before. We illustrate xarray first using outputs from a geostatistical model of how insecticide-treated bednets are used in Africa. We then give a couple of brief examples of how we use xarray for genomic data. There are video walkthroughs in French and English:

https://anopheles-genomic-surveillance.github.io/workshop-5/module-1-xarray.html

Please feel free to link to this in the xarray tutorial site if you'd like to :)

ddjustina · 2023-02-21T19:18:26Z

> In case it's helpful for inspiration, we took a similar approach with the MNE-Python package (neuro electrophysiology package):
>
> https://mne.tools/stable/index.html
>
> Maybe there are at least 3 levels in there, actually:
>
> * **Examples** - short vignettes that highlight one very specific piece of functionality, key-words for the example should be `ctrl-f`able in the title
> * **Tutorials** - in-depth guides through a common part of workflow that xarray wishes to enable, with more explanation and detail
> * **Domain use-cases** - examples of how xarray can facilitate use-cases in particular fields. Probably cover at a high-level many of the steps that multiple tutorials cover in-depth. More for "inspiration and buy-in" than in-depth learning.
>
> Does that make sense?

@choldgraf it seems like this page is down (https://predictablynoisy.com/xarray-explore-ieeg). Are these examples available elsewhere?

choldgraf · 2023-02-21T20:01:04Z

Oops, I think the URL just changed:

https://chrisholdgraf.com/blog/2019/2019-10-22-xarray-neuro/

rabernat added the topic-documentation label Nov 22, 2019

keewis mentioned this issue Dec 8, 2019

'Cookbook' page #1790

Open

scottyhq mentioned this issue Apr 25, 2022

Scipy2022 workshop and repository organization xarray-contrib/xarray-tutorial#53

Closed

dcherian mentioned this issue Jul 11, 2022

Explaining xarray in a single picture #6771

Open
