Suggestion: --sync as default when script is older than notebook #254

Closed
GeeCastro opened this issue Jun 12, 2019 · 3 comments

@GeeCastro

After building a notebook from a Python script:

jupytext --to notebook script.py # produces script.ipynb
jupyter nbconvert --execute --to notebook --inplace script.ipynb # runs the notebook

the notebook cannot be opened in Jupyter. The Jupytext extension reports this error:

Error loading notebook
script.ipynb (last modified *blabla*) seems more recent than script.py (last modified *blabla*) 
Please either: 
    - open script.py in a text editor, make sure it is up to date, and save it, 
    - or delete script.py if not up to date, 
    - or increase check margin by adding, say, c.ContentsManager.outdated_text_notebook_margin = 5 # in seconds # or float("inf") to your .jupyter/jupyter_notebook_config.py file 

This error is clear (thanks!). However, it only happens when the notebook is more recent than the Python script. If the notebook is older (the other way around), it is synced to the Python script.
So it's necessary to do what the error advises or (my preference):

jupytext --sync script.ipynb

Wouldn't it be better to have the same behavior in both directions? If not, I'd be interested to learn why :-).
Thank you for this tool!!

@mwouts
Owner

mwouts commented Jun 12, 2019

Excellent question... Thanks for asking!

Well... when we initially implemented paired notebooks we said that

  • input cells would be loaded from the Python script
  • and output cells, from the ipynb file.

We preferred that implementation over timestamp priorities because we were not confident that we could trust timestamps. Notice that the check on timestamps is only a safeguard here. It aims at protecting users who have turned Jupytext off for a while, and then on again, from having their notebooks revert to the obsolete version contained in the script...
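
As a rough illustration of that safeguard (a sketch only, not Jupytext's actual code), the check can be thought of as comparing the two modification times and refusing to load when the notebook is newer than the script by more than a margin. The names `text_is_outdated` and `check_pair` are hypothetical; `margin` plays the role of the `outdated_text_notebook_margin` setting quoted in the error message above.

```python
import os


def text_is_outdated(text_mtime, ipynb_mtime, margin):
    """Return True when the .ipynb is newer than the text file
    by more than `margin` seconds (hypothetical helper)."""
    return ipynb_mtime > text_mtime + margin


def check_pair(text_path, ipynb_path, margin):
    """Raise if the script looks older than the notebook.

    Illustrative only: the real contents manager may differ.
    """
    text_mtime = os.path.getmtime(text_path)
    ipynb_mtime = os.path.getmtime(ipynb_path)
    if text_is_outdated(text_mtime, ipynb_mtime, margin):
        raise ValueError(
            f"{ipynb_path} seems more recent than {text_path}"
        )
```

With a margin of `float("inf")` the comparison can never be true, which matches the error message's suggestion for switching the check off.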

As you noticed, the --sync option of jupytext command line works differently. That one is timestamp-based, and will take the input cells from the most recent document in the pair. Let me try to defend why Jupytext CLI should work differently from Jupytext in Jupyter...

  • Jupyter users never need to update the script from the ipynb, because the only way they can update the ipynb file is with Jupyter, which also takes care of updating the script
  • Command line users have their eyes on the console, so if the jupytext command line reads the input cells from a different file, they will see it
  • They are also expected to better know how each file in the pair was last updated.

Does that sound OK? Obviously, if you had a strong argument in the opposite direction, we could consider adding an option for the contents manager to load the input cells from the most recent file and reproduce in Jupyter the same behavior as with the --sync option, but I am not sure that would make the Jupytext experience as safe as I'd prefer it to be.
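
For comparison, the timestamp-based rule described for --sync ("take the input cells from the most recent document in the pair") could be sketched as below. This is an assumption-laden illustration, not the CLI's real implementation, and the helper names `newest_path` and `read_mtimes` are made up for the example.

```python
import os


def newest_path(mtimes):
    """Given a {path: modification_time} mapping, return the path
    that was modified most recently (hypothetical helper)."""
    return max(mtimes, key=mtimes.get)


def read_mtimes(paths):
    """Collect modification times from disk for the paired files."""
    return {path: os.path.getmtime(path) for path in paths}


# Usage with explicit timestamps: the notebook is newer,
# so its input cells would win under a timestamp-based rule.
source = newest_path({"script.py": 100.0, "script.ipynb": 200.0})
```

The contrast with the contents manager is that here the notebook can become the source of inputs, whereas in Jupyter the script always is and the timestamps only act as a guard.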

Last but not least, your example is a particular case of #231. Would you like an --execute option in Jupytext CLI so that you could type

jupytext script.py --set-formats py,ipynb --sync --execute

and get a paired notebook that Jupyter can open? If so, can you tell me if you usually pass any options to jupyter nbconvert (kernel, timeout, ...)?

@GeeCastro
Author

Good explanation, these choices make sense. If --sync were the default it could lead to unexpected changes which the user wouldn't easily understand, and that would be very frustrating!

My opinion about that execute bit is no: jupytext shouldn't handle this part. The use case here is pretty simple because it only involves Jupyter. However, since Jupytext is multi-language, wouldn't it be misleading for the user (for parameters, for instance, as mentioned in #231)? I think it's clearer to call the specific tool (i.e. Jupyter here) and pass all the options there. It's only one line in a script (two if synced), and that keeps it highly readable.

@mwouts
Owner

mwouts commented Jun 13, 2019

Thanks @Chichilele for your comments. It's very interesting for me to debate about these points, and determine what can be useful (or not) for users!

My opinion about that execute bit is no: jupytext shouldn't handle this part.

Interesting. I think that I mostly agree: executing notebooks is a different thing than converting them to alternative formats. Still... I'd appreciate having a simple way to generate a paired notebook with outputs from a text document. Maybe "simple" here could just mean "well documented", using the three commands that you mention. Or maybe we could use pipes to avoid writing the document multiple times on disk... we'll discuss this later on at #231!
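
One way to picture the "well documented" route is a small wrapper that chains the three commands quoted earlier in this thread. The sketch below is only an illustration: `build_commands` and `run_all` are hypothetical names, and it assumes that `jupytext` and `jupyter` are on the PATH and that the notebook is set up as a paired notebook so that --sync applies.

```python
import subprocess
from pathlib import Path


def build_commands(script):
    """Return the three commands from this thread for a given script
    (hypothetical helper; flags are copied from the issue)."""
    notebook = str(Path(script).with_suffix(".ipynb"))
    return [
        # 1. Convert the script to a notebook
        ["jupytext", "--to", "notebook", script],
        # 2. Execute the notebook in place to fill in the outputs
        ["jupyter", "nbconvert", "--execute", "--to", "notebook",
         "--inplace", notebook],
        # 3. Sync the pair so that Jupyter accepts the notebook later
        ["jupytext", "--sync", notebook],
    ]


def run_all(script):
    """Run the commands in order, stopping at the first failure."""
    for command in build_commands(script):
        subprocess.run(command, check=True)
```

Calling `run_all("script.py")` would run the three steps in sequence. Any nbconvert options (kernel, timeout, ...) would have to be added to the second command by hand, which is the trade-off discussed above.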

mwouts added a commit that referenced this issue Sep 21, 2019
To allow paired notebooks to be opened in Jupyter later on.
Closes #254 #335
mwouts added a commit that referenced this issue Oct 12, 2019
To allow paired notebooks to be opened in Jupyter later on.
Closes #254 #335