Added FAQ #274
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
# Frequently Asked Questions | ||
|
||
## What is Jupytext? | ||
|
||
Jupytext is a Python package that provides _two-way_ conversion between Jupyter Notebooks and several other text-based formats like Markdown documents or scripts. | ||
|
||
## Why would I want to convert my notebooks to text? | ||
|
||
The text representations have much cleaner diffs than the original notebook format. Merging multiple contributions to a notebook in any of these text formats is easier than with the JSON format. Last but not least, acting on a notebook represented as text (spell check, reformat, ...) is sometimes more comfortable than in Jupyter. | ||
Review comment: One of the use cases I have for an md only version (no ipynb save) is a notebook where I am using database calls that display sensitive information as outputs, even if the db queries and code manipulations I do later on returned data are not sensitive. By making sure I don't save the notebook, the fact that the saved md document does not contain code outputs (sensitive data) is a win.
Review comment: Interesting use case! We could add a mention that only the inputs are saved, that they match well what the user has effectively contributed to the notebook, and that as you say that's an effective way to drop the outputs which can be large or private. |
||
|
||
## How do I use Jupytext? | ||
Review comment: IMO here you should just link to the "using jupytext" section of the documentation. If that documentation is too wordy and complex, then I'd add this short example as a quick "getting started" section at the top of the "using jupytext" section, rather than in the FAQ |
||
|
||
Open the notebook that you want to version control. _Pair_ the notebook to a script or a Markdown file using either the [Jupytext Menu](https://github.com/mwouts/jupytext/blob/master/README.md#jupytext-menu-in-jupyter-notebook) in Jupyter Notebook or the [Jupytext Commands](https://github.com/mwouts/jupytext/blob/master/README.md#jupytext-commands-in-jupyterlab) in JupyterLab. | ||
|
||
Save the notebook, and you get two copies of the notebook: the original `*.ipynb` file, together with its paired text representation. | ||
|
||
Read more about how to use Jupytext in the [documentation](using-server.md). | ||
|
||
## Which Jupytext format do you recommend? | ||
|
||
Notebooks that contain more text than code are best represented as Markdown documents. These are conveniently edited in IDEs and are also well rendered on GitHub. | ||
|
||
Saving notebooks as scripts is an appropriate choice when you want to act on the code (refactor the code, import it in another script or notebook, etc.). Use the `percent` format if you prefer explicit cell markers (compatible with VS Code, PyCharm, Spyder, Hydrogen...). If you prefer as few cell markers as possible, go for the `light` format. | ||
Review comment: This is an aside (sorry; not sure when I'll get a chance to comment again) but maybe pertinent to the sentiment of this section: I wonder if Jupytext support for editing py files as notebooks may actually help address the issue of notebooks being an inappropriate medium for editing module files in an ad hoc development process with a judicious extension or two, e.g. to support code execution from NBFormat cells?
Review comment: I've answered your question there, maybe that's a point we'd like to see in this FAQ? |
||
|
||
## Can I see a sample of each format? | ||
|
||
Go to [our demo folder](https://github.com/mwouts/jupytext/tree/master/demo) and see how our sample `World population` notebook is represented in each format. | ||
|
||
## Can I edit the paired text file? | ||
|
||
Yes! When you're done, reload the notebook in Jupyter. There, you will see the updated input cells combined with the matching output cells from the `.ipynb` file. | ||
|
||
## Do I need to close my notebook in Jupyter? | ||
|
||
No, you don't (*). You can edit the paired text file and simply refresh your browser to reload the updated input cells. When you refresh the notebook, the kernel variables are preserved, so you can continue your work where you left off. | ||
|
||
(*) Please read about Jupyter's autosave below. | ||
|
||
## How do paired notebooks work? | ||
|
||
The `.ipynb` file contains the full notebook. The paired text file only contains the input cells and selected metadata. When the notebook is loaded by Jupyter, input cells are loaded from the text file, while the output cells and the filtered metadata are restored using the `.ipynb` file. When the notebook is saved in Jupyter, the two files are updated to match the current content of the notebook. | ||
Review comment: You can click on either file to open it into the notebook editor, edit it and run it there, and when you save it, both files will be updated using the appropriate file format. What are the likely problems if you have both files open at the same time?
Review comment: Exactly the same as if you are editing the same document in two editors. In short, as long as you modify just one of the two documents you're safe. You may find the autosave a bit annoying, but it won't hurt as Jupytext implements timestamp checks. Read more on this in the next Q/A. |
||
|
||
## Can I create a notebook from a text file? | ||
Review comment: Again for content like this, I'd rather make sure there is an explicit section in the "using jupytext" documentation and then link to that from the FAQ... |
||
|
||
Certainly. Open your pre-existing scripts or Markdown files as notebooks with a click in Jupyter Notebook, and with the _Open as Notebook_ menu in JupyterLab. | ||
|
||
The text formats do not store the output cells. If you want to preserve these when you refresh the notebook, be sure to pair the text file to an `.ipynb` file. | ||
|
||
If you want to convert text formats to notebooks programmatically, use one of the following commands: | ||
```bash | ||
jupytext --to ipynb *.md # convert all .md files to notebooks with no outputs | ||
jupytext --to ipynb --execute *.md # convert all .md files to notebooks and execute them | ||
jupytext --set-formats ipynb,md --execute *.md # convert all .md files to paired notebooks and execute them | ||
``` | ||
|
||
## Which files should I version control? | ||
|
||
Unless you want to version the outputs, you should version *only the text representation*. The paired `.ipynb` file can safely be deleted. It will be recreated locally the next time you open the notebook (from the text file) and save it. | ||
Review comment: A lot of people also use notebooks as document previews in e.g. GitHub, where code cells have been executed and the readable document is the complete one. This speaks more to then having a separation of concerns, which different directory paths for paired documents can help with, between a document for version control, and a document for reading / display. |
||
|
||
Note that if you version both the `.md` and `.ipynb` files, you can configure `git diff` to [ignore the diffs on the `.ipynb` files](https://github.com/mwouts/jupytext/issues/251). | ||
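
A minimal sketch of one way to do this, assuming a `.gitattributes` entry rather than whatever configuration the linked issue settles on: unsetting the `diff` attribute for `*.ipynb` makes `git diff` report those files as binary instead of printing their JSON. The scratch repository and the file names below are hypothetical and only there to make the example self-contained.

```shell
# Scratch repository with a hypothetical paired notebook (illustration only).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf '# Title\n' > notebook.md
printf '{"cells": []}\n' > notebook.ipynb
git add notebook.md notebook.ipynb
git commit -q -m "Add paired notebook"

# Assumption: unset the diff attribute so that .ipynb changes are summarized, not printed.
printf '*.ipynb -diff\n' > .gitattributes

# Change both files, as a save in Jupyter would.
printf 'Some new text\n' >> notebook.md
printf '{"cells": [], "metadata": {}}\n' > notebook.ipynb

# The .md change is shown line by line; the .ipynb change is only reported as a binary difference.
diff_output=$(git diff)
printf '%s\n' "$diff_output"
```

The attribute only changes how diffs are displayed; the `.ipynb` file is still tracked and committed as usual.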
|
||
## Jupyter warns me that the file has changed on disk | ||
|
||
By default, Jupyter saves your notebook every 2 minutes. Fortunately, it also detects that the text file has been edited in the meantime, which is why it shows this message. | ||
|
||
You should simply click on _Reload_. | ||
|
||
Note that you can deactivate Jupyter's autosave function with the Jupytext Menu in Jupyter Notebook, and with the _Autosave Document_ setting in JupyterLab. See [this Stack Overflow answer](https://stackoverflow.com/questions/25631344/turn-off-autosave-in-ipython-notebook/56549758#56549758) if you want to permanently deactivate the autosave in Jupyter Notebook. | ||
|
||
## When I reload, Jupyter warns me that my notebook has unsaved changes | ||
|
||
Oh - you have edited both the notebook and the paired text file at the same time? If you know which version you want to keep, save it and reload the other. If you want to compare and merge both versions, back up the text file (with e.g. `git stash`), save the notebook, and merge the updated paired file with the backup (with e.g. `git stash pop`). Then, refresh the notebook in Jupyter. | ||
|
||
If your IDE has the ability to compare the changes in memory versus on disk (like PyCharm), you can simply save the notebook and let your IDE do the merge. | ||
|
||
## Jupyter complains that the `.ipynb` file is more recent than the text representation | ||
|
||
This happens if you have edited the `.ipynb` file outside of Jupyter. It is a safeguard to avoid overwriting the input cells of the notebook with an outdated text file. | ||
|
||
Manual action is required because the paired text representation may be outdated. If the paired `.md` or `.py` file is not outdated, update its timestamp (`touch`). If it is outdated, either delete it or update it with | ||
Review comment: The reference to |
||
```bash | ||
jupytext --sync notebook.ipynb | ||
``` | ||
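
To see which of the two files has the more recent modification time, the files can be compared from the shell. The sketch below is only an illustration: the file names are hypothetical, the `.ipynb` file is an empty stand-in, and `-nt` ("newer than") is a test operator available in bash and most other shells, not something provided by Jupytext.

```shell
# Scratch directory with hypothetical stand-in files (illustration only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '# A title\n' > notebook.md
sleep 1                            # make sure the two timestamps differ
printf '{}\n' > notebook.ipynb     # stands in for a notebook edited outside of Jupyter

# -nt is true when the left file was modified more recently than the right one.
if [ notebook.ipynb -nt notebook.md ]; then newer_before="ipynb"; else newer_before="md"; fi

# If the text file is in fact up to date, refresh its timestamp.
sleep 1
touch notebook.md
if [ notebook.md -nt notebook.ipynb ]; then newer_after="md"; else newer_after="ipynb"; fi

echo "before touch: $newer_before, after touch: $newer_after"
```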
|
||
## Can I use Jupytext with Jupyter Hub, Binder, Nteract, Colab, Saturn or Azure? | ||
|
||
Jupytext is compatible with Jupyter Hub (execute `pip install jupytext --user` to install it in user mode) and with Binder (add `jupytext` to the project requirements and `jupyter lab build` to `postBuild`). | ||
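
For Binder, the two additions could look like the following sketch. It assumes the project lists its dependencies in a `requirements.txt` file (an `environment.yml` would need the equivalent entry instead), and it runs in a scratch directory that stands in for the repository root.

```shell
# Scratch directory standing in for the repository root (illustration only).
project=$(mktemp -d)
cd "$project" || exit 1

# Add jupytext to the project requirements (assumed here to be requirements.txt).
printf 'jupytext\n' >> requirements.txt

# Add the JupyterLab build step to postBuild.
printf '#!/bin/bash\njupyter lab build\n' > postBuild
chmod +x postBuild

# Show the resulting files.
cat requirements.txt postBuild
```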
|
||
If you use an editor other than Jupyter Notebook, Lab or Hub, Jupytext is probably not available there. However, you can still use Jupytext at the command line to manually sync the two representations of the notebook: | ||
|
||
```bash | ||
jupytext --set-formats ipynb,py:light notebook.ipynb # Pair a notebook to a light script | ||
jupytext --sync notebook.ipynb # Sync the two representations | ||
``` | ||
|
||
## Can I re-write my git history to use text files instead of notebooks? | ||
|
||
Indeed! You can substitute every `.ipynb` file in the project history with its Jupytext Markdown representation using e.g.: | ||
Review comment: Isn't this something of a nuclear option, removing the ipynb files? Maybe also handy to provide a way to just move / stash the processed
Review comment: Sure! And also rewriting the history is something that one should not do too often... Maybe I'll just mention that as a fun exercise 😄
Review comment: The rewriting history thing is absolutely brilliant though! :-) |
||
```bash | ||
git filter-branch --tree-filter 'jupytext --to md */*.ipynb && rm -f */*.ipynb' HEAD | ||
``` | ||
|
||
See the result and the cleaner diff history in the case of the [Python Data Science Handbook](https://github.com/mwouts/PythonDataScienceHandbook/tree/jupytext_no_ipynb). |
Review comment: I'd emphasize the two-way integration more prominently. The first sentence of the paragraph makes it sound redundant with nbconvert; I think the exceptional focus of jupytext is two-way conversion, rather than just one-way conversion. Something like "Jupytext is a Python package that provides two-way conversion between Jupyter Notebooks and several other text-based formats."
Review comment: Yes and no to the two way conversion? I've started looking at how I could go 'ipynb free', authoring either in just markdown or py files (md only demo / works in MyBinder) perhaps with a next step later on of saving rendered notebooks as HTML or PDF (an output format) rather than ipynb.
Under this way of working, Jupytext is used to make python/markdown files editable/executable as notebooks, but not saveable as notebooks, which makes it a one way process?
Review comment: Well, I see your point - you don't convert to an ipynb file any more. When you render the notebook, it's only the text to notebook conversion that is involved. Still I think that putting the emphasis on the two-way conversion is good: people know that the way back to the notebook is the least common part.