Added FAQ #274
Codecov Report

| Metric | master | #274 |
|---|---|---|
| Coverage | 99.19% | 99.19% |
| Files | 68 | 68 |
| Lines | 6612 | 6612 |
| Hits | 6559 | 6559 |
| Misses | 53 | 53 |

@choldgraf, @psychemedia, may I ask your thoughts about this FAQ? Tony, does it answer some of your questions?
A few thoughts from me - in general I think this is a nice addition! Most of the suggestions were for clarity and organization stuff
@@ -0,0 +1,89 @@
# Frequently Asked Questions

## What is Jupytext?
I'd emphasize the two-way integration more prominently. The first sentence of the paragraph makes it sound redundant with nbconvert; I think the exceptional focus of Jupytext is two-way conversion, rather than just one-way conversion. Something like: "Jupytext is a Python package that provides two-way conversion between Jupyter Notebooks and several other text-based formats."
Yes and no to the two way conversion? I've started looking at how I could go 'ipynb free', authoring either in just markdown or py files (md only demo / works in MyBinder) perhaps with a next step later on of saving rendered notebooks as HTML or PDF (an output format) rather than ipynb.
Under this way of working, Jupytext is used to make python/markdown files editable/ executable as notebooks, but not saveable as notebooks, which makes it a one way process?
Well, I see your point - you don't convert to an ipynb file any more. When you render the notebook, it's only the text to notebook conversion that is involved. Still I think that putting the emphasis on the two-way conversion is good: people know that the way back to the notebook is the least common part.

The text representation have much cleaner diffs than the original notebook format. Merging multiple contributions to a notebook in any of these text formats is easier than with the JSON format. Last but not least, acting on a notebook represented as text (spell check, reformat, ...) is sometimes more comfortable than in Jupyter.

## How do I use Jupytext?
IMO here you should just link to the "using jupytext" section of the documentation. If that documentation is too wordy and complex, then I'd add this short example as a quick "getting started" section at the top of the "using jupytext" section, rather than in the FAQ
docs/faq.md
Outdated

## Which Jupytext format do you recommend?

I tend to use the Markdown format for notebooks that contain more text than code, as Markdown documents are conveniently edited in IDEs and also well rendered on GitHub.
I'd avoid using the word "I" in package documentation. I usually use "we" in my packages even if I'm largely the one developing the package (the hope is always that one day the package developer community will be a "we" rather than an "I" :-) )
docs/faq.md
Outdated

## Can I edit the paired text file?

Yes! And when you're done, refresh the notebook in Jupyter. Refreshing will bring the latest changes to your notebook.
Will this work even if the text file is edited when a jupyter server isn't running?
e.g., if I have a notebook and paired markdown file, I synchronize them, and push them both to GitHub. Then somebody else updates just the markdown file and pushes the changes to GitHub. I pull in the changes, and turn on JupyterLab...what happens?
> Will this work even if the text file is edited when a jupyter server isn't running?

Indeed! You're free to close the server. There's no magic here: when the notebook is opened or refreshed, Jupytext reads the two files and merges inputs+outputs. The ipynb file is not modified when the notebook is read, but only the next time it is saved.
If you keep the notebook open, the extra bonus is that variables are preserved when you refresh the notebook. I'll add a question about that.

> e.g., if I have a notebook and paired markdown file, I synchronize them, and push them both to GitHub. Then somebody else updates just the markdown file and pushes the changes to GitHub. I pull in the changes, and turn on JupyterLab...what happens?

What happens is what you expect: you get the latest input cells from the markdown file, matched with outputs from the ipynb file. It will work 100% if you don't push the ipynb file to GitHub, and only 99% (*) if you do push the ipynb file.
(*) Let me explain: Jupytext in Jupyter is very strict about the assumption that Jupyter always writes the ipynb file before the md file. If git happens to write the ipynb more than one second after the md file, Jupytext will complain and refuse to open the notebook.
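
To make the timestamp rule above concrete, here is a minimal sketch of such a check using only the Python standard library. It is an illustration of the idea, not Jupytext's actual code: the function name `text_file_is_outdated` and the constant `TOLERANCE_SECONDS` are hypothetical, and the one-second tolerance is simply taken from the comment above.

```python
import os
import tempfile

# Hypothetical names; the one-second tolerance comes from the discussion above.
TOLERANCE_SECONDS = 1.0


def text_file_is_outdated(ipynb_path, text_path, tolerance=TOLERANCE_SECONDS):
    """Return True when the .ipynb file is newer than the text file
    by more than the tolerance."""
    ipynb_mtime = os.path.getmtime(ipynb_path)
    text_mtime = os.path.getmtime(text_path)
    return ipynb_mtime > text_mtime + tolerance


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        ipynb = os.path.join(tmp, "notebook.ipynb")
        md = os.path.join(tmp, "notebook.md")
        for path in (ipynb, md):
            with open(path, "w") as f:
                f.write("")
        # Normal case: the text file is written last (the order Jupyter uses).
        os.utime(ipynb, (1000.0, 1000.0))
        os.utime(md, (1000.5, 1000.5))
        normal = text_file_is_outdated(ipynb, md)
        # Problem case: git wrote the .ipynb file 5 seconds after the text file.
        os.utime(ipynb, (1005.5, 1005.5))
        problem = text_file_is_outdated(ipynb, md)
        return normal, problem
```

In this sketch the first case passes the check and the second one is flagged, which corresponds to the situation where the notebook would be refused.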

The `.ipynb` file contains the full notebook. The paired text file only contains the input cells and selected metadata. When the notebook is loaded by Jupyter, input cells are loaded from the text file, while the output cells and the filtered metadata are restored using the `.ipynb` file.

## Can I create a notebook from a text file?
Again for content like these, I'd rather make sure there is an explicit section in the "using jupytext" documentation and then link to that from the FAQ...
docs/faq.md
Outdated

## When I refresh, Jupyter warns me that my notebook has unsaved changes

Oh - you have edited both the notebook and the paired text file at the same time? Backup the text file (`git stash`), save the notebook, and merge your changes on the text file (`git stash pop`). When you're done, refresh the notebook in Jupyter.
IMO this is going to be quite an advanced technique for most users - is there a simpler way to resolve this?
> is there a simpler way to resolve this?

None other than using a good editor, I am afraid! With PyCharm (mentioned at the next line), you can compare the diffs between memory and disk changes, which is very convenient.
But this is just a corner case - I think people will notice that they are changing the notebook in two different editors at the same time?
docs/faq.md
Outdated
jupytext --sync notebook.ipynb # Sync the two representations
```

## If only I had known of Jupytext before!
I like the personality coming across in this one, but I think it'd still benefit from a clearer idea of what the section covers, e.g., "Can I re-write my git history to use text files instead of notebooks?"
Thank you so much @choldgraf ! These are all very useful remarks. I will update the text accordingly soon...
@choldgraf , I have updated the text following your comments, thanks! I do agree that some of the points discussed here could also be documented in the other sections of the documentation - I suggest that we see that later on (as always, PRs are welcome!)
@@ -2,41 +2,51 @@

## What is Jupytext?

-Jupytext is a Python package that can convert Jupyter notebooks to scripts or Markdown documents. It can also convert these text documents back to Jupyter notebooks.
+Jupytext is a Python package that provides _two-way_ conversion between Jupyter Notebooks and several other text-based formats like Markdown documents or scripts.

## Why would I want to convert my notebooks to text?

The text representation have much cleaner diffs than the original notebook format. Merging multiple contributions to a notebook in any of these text formats is easier than with the JSON format. Last but not least, acting on a notebook represented as text (spell check, reformat, ...) is sometimes more comfortable than in Jupyter.
One of the use cases I have for an md only version (no ipynb save) is a notebook where I am using database calls that display sensitive information as outputs, even if the db queries and code manipulations I do later on returned data are not sensitive.
By making sure I don't save the notebook, the fact that the saved md document does not contain code outputs (sensitive data) is a win.
Interesting use case! We could add a mention that only the inputs are saved, that they match well what the user has effectively contributed to the notebook, and that as you say that's an effective way to drop the outputs which can be large or private.

-Saving notebooks as scripts is a convenient choice when you want to refactor your notebook in an IDE (or import it in another notebook, etc). Use the `percent` format if you prefer to get explicit cell markers (compatible with VScode, PyCharm, Spyder, Hydrogen...). If you prefer to get the minimal amount of cell markers, go for the `light` format.
+Saving notebooks as scripts is an appropriate choice when you want to act on the code (refactor the code, import it in another script or notebook, etc). Use the `percent` format if you prefer to get explicit cell markers (compatible with VScode, PyCharm, Spyder, Hydrogen...). And if you prefer to get the minimal amount of cell markers, go for the `light` format.
This is an aside (sorry; not sure when I'll get a chance to comment again), but maybe pertinent to the sentiment of this section: I wonder if Jupytext support for editing py files as notebooks may actually help address the issue of notebooks being an inappropriate medium for editing module files in an ad hoc development process, with a judicious extension or two, e.g. to support code execution from NBFormat cells?
I've answered your question there, maybe that's a point we'd like to see in this FAQ?

## How do paired notebooks work?

-The `.ipynb` file contains the full notebook. The paired text file only contains the input cells and selected metadata. When the notebook is loaded by Jupyter, input cells are loaded from the text file, while the output cells and the filtered metadata are restored using the `.ipynb` file.
+The `.ipynb` file contains the full notebook. The paired text file only contains the input cells and selected metadata. When the notebook is loaded by Jupyter, input cells are loaded from the text file, while the output cells and the filtered metadata are restored using the `.ipynb` file. When the notebook is saved in Jupyter, the two files are updated to match the current content of the notebook.
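
The loading step described in this paragraph can be pictured with a small sketch. It is only an illustration under simplifying assumptions: cells are plain dictionaries, outputs are plain strings, and outputs are matched to code cells by identical source text. The function name `merge_inputs_and_outputs` is hypothetical, and Jupytext's real matching logic may differ.

```python
def merge_inputs_and_outputs(text_cells, ipynb_cells):
    """Take the inputs from the text file and reuse the outputs saved in the
    .ipynb file for code cells whose source is unchanged."""
    # Index the saved outputs by cell source. A list per source lets
    # repeated cells be consumed in order.
    saved = {}
    for cell in ipynb_cells:
        if cell["cell_type"] == "code":
            saved.setdefault(cell["source"], []).append(cell.get("outputs", []))

    merged = []
    for cell in text_cells:
        new_cell = dict(cell)
        if cell["cell_type"] == "code":
            candidates = saved.get(cell["source"], [])
            # Unchanged cell: reuse its outputs. Edited or new cell: no outputs.
            new_cell["outputs"] = candidates.pop(0) if candidates else []
        merged.append(new_cell)
    return merged


# Cells as saved in the (hypothetical) .ipynb file, with outputs.
ipynb_cells = [
    {"cell_type": "markdown", "source": "# Title"},
    {"cell_type": "code", "source": "1 + 1", "outputs": ["2"]},
    {"cell_type": "code", "source": "x = 3", "outputs": []},
]
# Cells as read from the edited text file: inputs only.
text_cells = [
    {"cell_type": "markdown", "source": "# New title"},
    {"cell_type": "code", "source": "1 + 1"},
    {"cell_type": "code", "source": "x = 4"},
]
merged = merge_inputs_and_outputs(text_cells, ipynb_cells)
```

Here the merged notebook carries the edited inputs from the text file, and only the unchanged `1 + 1` cell keeps its saved output.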
You can click on either file to open it into the notebook editor, edit it and run it there, and when you save it, both files will be updated using the appropriate file format.
What are likely problems if you have both files open at the same time?
Exactly the same as if you are editing the same document in two editors. In short, as long as you modify just one of the two documents you're safe. You may find the autosave a bit annoying, but it won't hurt as Jupytext implements timestamp checks. Read more on this in the next Q/A.
@@ -45,25 +55,29 @@ jupytext --set-formats ipynb,md --execute *.md # convert all .md files to paire

## Which files should I version control?

-Unless you want to version control the output cells, you should version the text file only (and add `*.ipynb` to `.gitignore`). As discussed above, Jupyter will let you open the text representation as a notebook and will re-create the `.ipynb` file when you save the notebook.
+Unless you want to version the outputs, you should version *only the text representation*. The paired `.ipynb` file can safely be deleted. It will be recreated locally the next time you open the notebook (from the text file) and save it.
A lot of people also use notebooks as document previews on e.g. GitHub, where code cells have been executed and the readable document is the complete one. This argues for a separation of concerns between a document for version control and a document for reading / display, which different directory paths for paired documents can help with.
-This happens if you have edited the `.ipynb` file outside of Jupyter. Manual action is requested as the paired text representation may be outdated. Please edit (`touch`) the paired `.md` or `.py` file if it is not outdated, or if it is, delete it, or update it with
+This happens if you have edited the `.ipynb` file outside of Jupyter. It is a safeguard to avoid overwriting the input cells of the notebook with an outdated text file.
+
+Manual action is requested as the paired text representation may be outdated. Please edit (`touch`) the paired `.md` or `.py` file if it is not outdated, or if it is, delete it, or update it with
The reference to `touch` may be confusing. Is this something that an extension might help with, or a Jupytext menu item selection that can force the documents into alignment?
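
For readers unfamiliar with `touch`: it leaves the file content alone and only sets the modification time to "now", which makes the text file the most recent of the pair again. A rough Python equivalent, using only the standard library, could look like the sketch below. The file names and the `touch_demo` helper are hypothetical and serve only to illustrate the effect on timestamps.

```python
import os
import tempfile
from pathlib import Path


def touch_demo():
    with tempfile.TemporaryDirectory() as tmp:
        ipynb = Path(tmp) / "notebook.ipynb"
        md = Path(tmp) / "notebook.md"
        ipynb.write_text("{}")
        md.write_text("# Title\n")
        # Pretend the text file was last written an hour before the .ipynb file.
        ipynb_time = ipynb.stat().st_mtime
        os.utime(md, (ipynb_time - 3600, ipynb_time - 3600))
        older_before = md.stat().st_mtime < ipynb.stat().st_mtime
        content_before = md.read_text()
        # Equivalent of `touch notebook.md`: only the modification time changes.
        md.touch()
        newer_after = md.stat().st_mtime >= ipynb.stat().st_mtime
        content_unchanged = md.read_text() == content_before
        return older_before, newer_after, content_unchanged
```

After the call to `md.touch()`, the text file is no longer older than the `.ipynb` file, while its content is untouched.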

-Do you feel like rewriting the history of your repository and replacing every `.ipynb` file with its Jupytext Markdown representation? Technically that's just a matter of executing:
+Indeed! You can substitute every `.ipynb` file in the project history with its Jupytext Markdown representation using e.g.:
Isn't this something of a nuclear option, removing the ipynb files? Maybe also handy to provide a way to just move / stash the processed `.ipynb` files somewhere?
Sure! And also rewriting the history is something that one should not do too often... Maybe I'll just mention that as a fun exercise 😄
The rewriting history thing is absolutely brilliant though! :-)