How to build a documentation site with Jupytext? #247

Closed
mwouts opened this issue Jun 6, 2019 · 38 comments

@mwouts
Owner

mwouts commented Jun 6, 2019

The purpose of this issue is to document a few ways to build a documentation site. Our requirements are:

  • The source documents can be edited as Jupyter notebooks
  • Source files are text files (Markdown or others)
  • Documentation site can show the result of the code being executed, including plots.

We should explore at least

  • Jupyter Book (can we use it with Jupytext notebooks?)
  • Sphinx and Sphinx-gallery
@mwouts
Owner Author

mwouts commented Jun 6, 2019

@psychemedia , you shared some experience with Jupyter Book + Jupytext at #230 and #231 already, maybe we could write a post on that, and a short wrap up at docs/examples.md

@mwouts
Owner Author

mwouts commented Jun 6, 2019

In theory, Jupytext can also be used to edit Sphinx-galleries directly in Jupyter, cf. #80. And turning these galleries into a binder should just require adding jupytext to binder/requirements.txt.

@mwouts
Owner Author

mwouts commented Jun 6, 2019

@emmanuelle, I see that you've created https://github.com/plotly/plotly-sphinx-gallery, would you like to give a try to Jupyter + Jupytext as an editor of sphinx-gallery scripts?

Also, I think that converting the Jupyter notebooks to Markdown files in https://github.com/plotly/documentation could make the repo much lighter. Would you be interested in giving a try to that? It should be as simple as:

jupytext --to md *.ipynb

and then, insert the opposite conversion in the process that builds the HTML files. You can use the --test argument to be sure that notebooks are not modified by the conversion (if required, we can improve the test mode).
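
A minimal sketch of that round trip, assuming the `jupytext` command is installed and on the `PATH`. The helper names `round_trip_commands` and `run` are hypothetical; the flags are the ones mentioned above (`--to`, `--test`). The notebook recreated from Markdown has no outputs, so it would still need to be executed during the build.

```python
import os
import subprocess
from typing import List


def round_trip_commands(notebook: str) -> List[List[str]]:
    """Return the jupytext invocations for one notebook, without running them."""
    markdown = os.path.splitext(notebook)[0] + ".md"
    return [
        # 1. Check that the notebook survives the round trip to Markdown.
        ["jupytext", "--test", "--to", "md", notebook],
        # 2. Write the Markdown version that goes into the repository.
        ["jupytext", "--to", "md", notebook],
        # 3. In the build process: recreate the .ipynb from the Markdown file.
        ["jupytext", "--to", "notebook", markdown],
    ]


def run(commands: List[List[str]]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run(round_trip_commands("example.ipynb"))` would perform all three steps; in practice steps 1 and 2 happen once, and step 3 belongs in the site build.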

Also, can you tell us how the Plotly web documentation is built from the notebooks?

@mwouts
Owner Author

mwouts commented Jun 6, 2019

@choldgraf , you may be interested in this discussion.

I also have a question for you... do you think we could treat the Markdown documents in a Jupyter Book repository as Jupyter notebooks using Jupytext?

That could require an additional operation when the book is built: convert the text notebooks (or only those that have a jupyter metadata?) to .ipynb files with no output, and then execute the notebooks (#231). Can we add such a pre-processing step already?

@phaustin
Contributor

phaustin commented Jun 6, 2019

@mwouts -- this is how nbsphinx handles non-ipynb files:

https://nbsphinx.readthedocs.io/en/0.4.1/custom-formats.html

@phaustin
Contributor

phaustin commented Jun 6, 2019

Note on scaled images: we hit a rough spot with the jupyter -> rst conversion that pandoc uses. Specifically, jupyter notebooks don't render the markdown image tag for scaled images:

![alt text](https://github.com/favicon.ico){width="48"} doesn't work in a notebook. It does work with pandoc's rst converter, but pandoc's converter can't handle:

<img src="https://github.com/favicon.ico" width="48">, which does render correctly in a notebook.

As of last February, the only thing that worked for both notebooks and Sphinx rst files was IPython.display.Image

details at: spatialaudio/nbsphinx#284

hopefully at some point jupyter notebook's markdown parser will support scaled images, which would solve this issue.

@psychemedia

I haven't had a chance to try this yet, but if Jupytext is set up to save .md into a docs/ folder in a git repo, and GitHub is set up to render the contents of docs/ using GitHub Pages, does that provide a simple way of publishing rendered .md pages authored from within a notebook UI? (I'm not sure how things like image paths, etc., are handled, e.g. if the .md is saved into a different path than the source .ipynb.)

@mwouts
Owner Author

mwouts commented Jun 7, 2019

@psychemedia , well I do have some experience with that! Take for example that README.md: it was 100% created in Jupyter, using Jupytext. And you can open it on binder as a notebook. I believe the same would work with .md files in the docs/ folder.

However, the .md files contain none of the outputs. Here we have two possible solutions:

@mwouts
Owner Author

mwouts commented Jun 23, 2019

How to replace every .ipynb file by its .md counterpart - a practical case study with the Python for Data Science Handbook is available at #263.

@mwouts
Owner Author

mwouts commented Jun 24, 2019

Working on a practical case (like in #263) is possibly the best way to address this issue.

Would anybody want to suggest a repo in which a documentation site is made with Jupyter notebooks, and that they'd like to transform to, say, Jupytext Markdown notebooks?

I'd like to test at least

  • one Jupyter Book project
  • one nbsphinx project

Thanks!

@choldgraf
Contributor

I'd be +1 on building in jupytext conversion to the Jupyter Book project somehow, just need to figure out the right pattern to do it. The problem w/ storing things as markdown files is that then you lose all the outputs, which are quite handy to have within the github repositories if you want people to be able to quickly glance at what is inside. Happy to brainstorm on this though.

@psychemedia

One thing I've started exploring here is authoring in md and having dualled notebooks in a hidden .notebooks directory that is still renderable in Github?

I'm not sure if that moves things forwards any, but it can make things a bit tidier?

@choldgraf
Contributor

@psychemedia please write up a post or something about this! I have been thinking of doing the same :-) I like the idea of keeping both a rendered ipynb format and a markdown format and using one or the other on GitHub depending on whether I wanna diff or wanna see the results in a notebook

@psychemedia

Blog has died recently... too many half played with things not quite worked out for a post, and some actual work (late/missed deadline) in the way! But hopefully over w/e, or perhaps next week!

@choldgraf
Contributor

Doesn't have to be a blog post :-) e.g. I've really enjoyed watching the "nbgitpuller + binder" discourse thread evolve over the months

@mwouts
Owner Author

mwouts commented Jun 26, 2019

I like the idea of keeping both a rendered ipynb format and a markdown format and using one or the other on GitHub depending on whether I wanna diff or wanna see the results in a notebook

I completely agree, we should document one way to do that. But I would also like to document how to rebuild locally the full collection of .ipynb (#231) for the users who don't want to push the .ipynb to their repo. And also, if the reference format for the notebook is .md, we need to be sure that the Jupyter book site still works well (can we download the notebook/get interactivity...?)

One thing I've started exploring here is authoring in md and having dualled notebooks in a hidden .notebooks (...) I'm not sure if that moves things forwards any, but it can make things a bit tidier?

Certainly. In such a setting, it can be useful to configure git diff to only show the diffs on the .md file - see #251.

@psychemedia

I completely agree, we should document one way to do that. But I would also like to document how to rebuild locally the full collection of .ipynb (#231) for the users who don't want to push the .ipynb to their repo.

A workflow I've started wondering about is:

  • write in md in markdown folder;
  • dual notebooks in .notebooks;
  • generate book from .notebooks contents.

There is a complication perhaps if the structure of markdown contains subdirs markdown/chapter1, markdown/chapter2 in terms of setting up the dualling if a path is set as per c.ContentsManager.default_jupytext_formats = ".notebooks//ipynb,markdown//md". Also propagating any subdirectory structure into .notebooks?

(On the other hand, if I just set c.ContentsManager.default_jupytext_formats = ".notebooks//ipynb,md" then are all .md files dualled?)
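
To make the desired mapping concrete, here is a small sketch of how the subdirectory structure under markdown/ could be mirrored into .notebooks/. This only illustrates the mapping being asked about; it is not Jupytext's implementation, and whether the prefix syntax above produces it is exactly the open question. The function name `paired_notebook_path` and its defaults are hypothetical.

```python
from pathlib import PurePosixPath


def paired_notebook_path(markdown_file, source_root="markdown", target_root=".notebooks"):
    """Map markdown/<subdirs>/<name>.md to .notebooks/<subdirs>/<name>.ipynb.

    Raises ValueError if markdown_file is not located under source_root.
    """
    relative = PurePosixPath(markdown_file).relative_to(source_root)
    return str(PurePosixPath(target_root) / relative.with_suffix(".ipynb"))
```

With this mapping, markdown/chapter1/intro.md would be dualled with .notebooks/chapter1/intro.ipynb, so the chapter folders carry over unchanged.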

@psychemedia

psychemedia commented Jun 30, 2019

Via the Jupyter discourse site ( https://discourse.jupyter.org/t/binder-template-repositories/1522 ) I learn of the new Github template repositories, which support the definition of repos intended to act as template for other repos.

I wonder if this could be useful as a way of sharing Binderised repos that are pre-configured to support different jupytext mediated workflows? Eg I've started on a simple demo here: https://github.com/ouseful-template-repos/jupytext-md

@mwouts
Owner Author

mwouts commented Jul 4, 2019

We've not discussed that yet: Jupytext can render Sphinx-Galleries on Binder! I think this is an interesting feature for the projects that offer examples in that form (notebook-like Python scripts).

Recently I've seen Jupytext being used at PlasmaPy to render their gallery. I am glad to see that our documentation was a good entry point - see the corresponding commit. Thank you @StanczakDominik!

@StanczakDominik

🙇‍♂️ the pleasure is mine :) I'll have a blog post with instructions on how to do this up soon-ish (by tomorrow) - I'll link it back here.

@mwouts
Owner Author

mwouts commented Jul 6, 2019

Oh that sounds great! We're looking forward to reading your post! Thanks

@StanczakDominik

Good timing, as here it is:

https://stanczakdominik.github.io/posts/simple-binder-usage-with-sphinx-gallery-through-jupytext/

Most of this is simply using jupytext and its documentation, but there's an extension that lets you link those jupytext binder notebooks from sphinx gallery that may come in handy to anyone interested.

@psychemedia

This is really interesting... can jupytext sphinx also be invoked from something like Circle CI to build docs automatically? (CI is yet another of those things I keep not making time to learn about:-(

@mwouts
Owner Author

mwouts commented Jul 6, 2019

Well done @StanczakDominik ! Sure, the most difficult thing here is probably to have all these independent projects interacting well together...

can jupytext sphinx also be invoked from something like Circle CI to build docs automatically?

Well I would say that this is already the case. Have a look at the PlasmaPy gallery: the plots are there, but I don't think that they are stored on GitHub, so they must have been produced when the documentation was built.

@StanczakDominik

can jupytext sphinx also be invoked from something like Circle CI to build docs automatically?

Sure it can! You don't need jupytext to do that, either, unless you were storing your documentation in notebooks directly. That still has issues with keeping blobs of binary and JSON in git, which isn't at all handy.

@mwouts
Owner Author

mwouts commented Jul 12, 2019

Another interesting example is the book Elegant Scipy, written by Juan Nunez-Iglesias (@jni), Harriet Dashnow (@hdashnow), and Stéfan van der Walt (@stefanv).

The book is stored in the form of Markdown files in the markdown folder. These markdown files are converted to notebooks using notedown and then rendered either on Binder, or as an HTML book.

I will have a look at how this would work with Jupytext. To start with, we'll have to implement the support for the .markdown extension (#288).

@mwouts
Owner Author

mwouts commented Jul 17, 2019

As a follow-up on the previous comment, adding Jupytext>=1.2.0rc3 to the requirements of the Elegant Scipy book was enough to render the book chapters as notebooks on binder.

In the process for building the book, we could also substitute notedown with jupytext (with all due respect to notedown, a project that preceded and inspired Jupytext), and replace

notedown --timeout 600 --match python --run ch5.markdown --output ch5.ipynb

with

jupytext --to ipynb --execute ch5.markdown

(jupytext does not have a timeout option - execute separately with nbconvert if a timeout is required).
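
A sketch of that two-step variant, assuming both `jupytext` and `jupyter nbconvert` are installed. The helper name `build_commands` is hypothetical; the 600 second timeout is taken from the notedown command above and is passed to nbconvert through its `ExecutePreprocessor.timeout` option.

```python
import os
import subprocess
from typing import List


def build_commands(source: str, timeout: int = 600) -> List[List[str]]:
    """Return the conversion and execution commands for one chapter."""
    notebook = os.path.splitext(source)[0] + ".ipynb"
    return [
        # Step 1: convert the Markdown chapter to a notebook without outputs.
        ["jupytext", "--to", "ipynb", source],
        # Step 2: execute the notebook in place, with an explicit timeout.
        [
            "jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
            "--ExecutePreprocessor.timeout=%d" % timeout, notebook,
        ],
    ]


def run(commands: List[List[str]]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

`run(build_commands("ch5.markdown"))` would then produce an executed ch5.ipynb.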

@sotte

sotte commented Sep 7, 2019

I'm just toying with the thought of using jupytext to integrate the examples (python script with the percent format) into my sphinx documentation. Maybe this "fresh" view is helpful, maybe not :)

I wish there was a simple directive (maybe via a sphinx extension?) that would allow me to add documents:

.. renderjupytext:: examples/my_example.py

The documents would automatically run during the build process.

@mwouts
Owner Author

mwouts commented Sep 7, 2019

Hi @sotte , well I am not really a Sphinx expert, but I think you have at least three options:

  1. Use nbsphinx to compile these .py notebooks into Sphinx pages. Jupytext files are supported, see https://nbsphinx.readthedocs.io/en/0.4.1/custom-formats.html
  2. Convert your percent:py scripts to sphinx:py scripts (with jupytext in the command line), and then use these scripts with Sphinx gallery (please note first that cell metadata are not supported in the sphinx:py format)
  3. Or, use Jupyter-sphinx (maybe you will have to convert your .py scripts to .ipynb files first).

From what I have seen I think that option 1 is the most popular among Jupytext users. Still, if you decide to try the second or third, please let us know the outcome.
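
For option 1, the Sphinx configuration could look roughly like the fragment below. This is a sketch based on the `nbsphinx_custom_formats` setting described in the nbsphinx documentation; the exact form accepted may differ between nbsphinx versions (the 0.4.1 page linked above may show a different one). The `.pct.py` suffix is an assumption, chosen so that ordinary `.py` files such as conf.py are not treated as notebooks.

```python
# conf.py (fragment)
extensions = ["nbsphinx"]

# Let nbsphinx read percent-format scripts through Jupytext.
# Assumption: the example scripts are named *.pct.py.
nbsphinx_custom_formats = {
    ".pct.py": ["jupytext.reads", {"fmt": "py:percent"}],
}
```

The scripts can then be listed in a toctree like any other source document, and nbsphinx executes them during the build if they have no stored outputs.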

@mwouts mwouts added this to the 1.3.0 milestone Sep 7, 2019
@sotte

sotte commented Sep 8, 2019

@mwouts thanks!

I ended up with a Makefile that turns my python scripts/examples into the notebook format (.ipynb). nbsphinx is then used to integrate the notebook into sphinx.

Just using nbsphinx was not an option because it executed/rendered the file every time (and that might take a while), not just when the file was changed. Therefore the Makefile.

If somebody is interested, here is the relevant part of the Makefile

SRC_DIR := ../examples
DST_DIR := _examples
SRC_FILES := $(wildcard $(SRC_DIR)/*.py)
DST_FILES := $(patsubst $(SRC_DIR)/%.py,$(DST_DIR)/%.ipynb,$(SRC_FILES))


$(DST_DIR)/%.ipynb: $(SRC_DIR)/%.py
	jupytext --execute --to notebook -o $@ $<
	@echo

html: $(DST_FILES)
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
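
For readers less familiar with make, the rule above boils down to the check sketched here in Python: a notebook is regenerated only when it is missing or older than its source script. The function names are hypothetical; the command is the one from the Makefile recipe.

```python
import os


def needs_rebuild(source, target):
    """Return True if target is missing or older than source (make's rule)."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)


def jupytext_command(source, target):
    """Return the command used by the Makefile recipe for one example."""
    return ["jupytext", "--execute", "--to", "notebook", "-o", target, source]
```

A build script would loop over the example scripts and run `jupytext_command(...)` only where `needs_rebuild(...)` is true, which is what make does for each `$(DST_DIR)/%.ipynb` target.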

@phaustin
Contributor

phaustin commented Sep 10, 2019

I ended up with a Makefile that turns my python scripts/examples into the notebook format (.ipynb). nbsphinx is then used to integrate the notebook into sphinx.

Note that you can turn off the default nbsphinx execution:

https://nbsphinx.readthedocs.io/en/0.2.15/never-execute.html
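
In conf.py this amounts to a single setting, sketched below under the assumption that nbsphinx is already enabled as an extension. With it, nbsphinx renders whatever outputs are stored in the notebooks and never runs them itself.

```python
# conf.py (fragment)
extensions = ["nbsphinx"]

# Render stored outputs only; do not execute notebooks during the Sphinx build.
nbsphinx_execute = "never"
```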

@sotte

sotte commented Sep 11, 2019

@phaustin thanks! You're right, I can turn it off, but I still need to transform the python script into the ipynb format. My Makefile only generates the ipynb (and executes it) if the source script changes. So using the Makefile kills two birds with one stone. And sphinx comes with a Makefile anyway.

@mwouts
Owner Author

mwouts commented Sep 20, 2019

@StanczakDominik, @lesteve , the subject of how to best set up Binder to work with Sphinx-Gallery has reached the Jupyter Discourse site (here).

I think you will know better than me what is the issue with the default configuration of Sphinx-Gallery, so maybe you'd like to comment there? Also, @lesteve, would you like to share your experience with the Scikit-learn gallery?

@lesteve
Contributor

lesteve commented Sep 20, 2019

Thanks @mwouts I commented on the Jupyter Discourse thread.

Also a small thing I noticed, not sure whether something can be done to configure JupyterLab to have the "Open With" -> Notebook as the default.

If you point to a specific .py file using urlpath in Binder:
https://mybinder.org/v2/gh/PlasmaPy/PlasmaPy/master?urlpath=lab/tree/plasmapy/examples/plot_magnetic_statics.py

It will use the Editor which is likely not what you want:
image

I am reasonably confident this is JupyterLab-specific: it seems like the default content manager for .py files is still the editor. For example, if you use the PlasmaPy binder link using JupyterLab:
https://mybinder.org/v2/gh/PlasmaPy/PlasmaPy/master?urlpath=lab

you have to right-click on a file (on the right-hand-side panel) and select "Open With" and then Notebook. Double-clicking on a file will open the editor.

@mwouts
Owner Author

mwouts commented Sep 20, 2019

Thanks @lesteve for your comment at the Jupyter Discourse, that's an interesting input.

And thanks for reporting the above. That is correct, we don't know yet how to emulate the right-click on open as notebook in JupyterLab (#271).

The difference between JupyterLab and Jupyter Notebook is that the former looks not only at the document type reported by the contents manager ("notebook" for these documents), but also at the file extension. It's a bit harder to contribute a fix for this, as that has to be done in a JupyterLab extension, to be developed in TypeScript, which I am less familiar with. But sure, we will have to fix that at some point!

@mgeier

mgeier commented Mar 13, 2020

@sotte wrote above (#247 (comment)):

Just using nbsphinx was not an option because it executed/rendered the file every time (and that might take a while), not just when the file was changed.

You should have told me, I wasn't aware of this problem until a few days ago!

I've tried to come up with a solution in spatialaudio/nbsphinx#408, please check it out!

@sotte

sotte commented Mar 13, 2020

@sotte wrote above (#247 (comment)):

Just using nbsphinx was not an option because it executed/rendered the file every time (and that might take a while), not just when the file was changed.

You should have told me, I wasn't aware of this problem until a few days ago!

I've tried to come up with a solution in spatialaudio/nbsphinx#408, please check it out!

I'm actually pretty happy with my Makefile solution. Thank you though!

@mwouts
Owner Author

mwouts commented Dec 9, 2021

I am going to close this old issue.

In the meantime we have seen Jupyter Book emerge, and I think it is a great candidate for building documentation sites.

Regarding Jupytext we have recently solved two related issues:
