markdown -> notebook bug: md code blocks with consecutive newlines #188
Hello @rsokl, thanks for reporting this, and for your nice feedback! What happens here is that Jupytext separates consecutive markdown cells with two blank lines. Hence it also cuts markdown cells that happen to contain two consecutive blank lines into multiple cells. I am open to suggestions if you see a better way to mark cell breaks in the markdown format. Maybe, once we have metadata support for Markdown (not the case for now, see #66), we can have explicit cell start/end delimiters. Also, please note that this issue will not occur if you save your notebook as scripts (in either
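The behaviour described in this comment can be reproduced with a toy splitter. This is only a sketch of the idea, not Jupytext's actual parser; the helper name `split_on_two_blank_lines` is hypothetical.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def split_on_two_blank_lines(text):
    """Naively split Markdown text into cells at runs of two or more blank lines.

    Hypothetical helper for illustration only; not part of Jupytext.
    """
    # Two blank lines correspond to three or more consecutive newline characters.
    chunks = re.split(r"\n{3,}", text.strip("\n"))
    return [chunk for chunk in chunks if chunk]


# One markdown cell whose fenced code block contains two consecutive blank lines.
markdown = f"Some text\n\n{FENCE}python\nx = 1\n\n\ny = 2\n{FENCE}\n"
cells = split_on_two_blank_lines(markdown)
print(len(cells))  # the single cell is cut in the middle of the code block
```

Because the splitter knows nothing about fences, the code block ends up spread over two cells, which is the symptom reported in this issue.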
Ah, that makes sense. I can try to take a swing at this. Could you point me to the function(s) that I would want to refactor?
One thought that I had is: don't look for the two blank lines if you are within a markdown code block. You currently track when a markdown code block begins; you could potentially also look for the closing ``` marks. That is, once you see something like ```python, a new cell won't be created until after you see the closing ```. Clearly this is not guaranteed to be true, but I would suspect that far more people include two or more newlines in a markdown code block than open a code block in a cell without closing it.
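A minimal sketch of this proposal, assuming cells are separated by two blank lines and that a line starting with three backticks toggles a "inside code block" state. The helper name `split_cells_fence_aware` is hypothetical and this is not Jupytext's implementation.

```python
FENCE = "`" * 3  # a Markdown code fence


def split_cells_fence_aware(text):
    """Split Markdown into cells on two blank lines, except inside code fences.

    Hypothetical helper sketching the proposal; not part of Jupytext.
    """
    cells, current, blank_run, in_fence = [], [], 0, False
    for line in text.splitlines():
        if line.strip().startswith(FENCE):
            in_fence = not in_fence  # an opening or closing fence toggles the state
        if line.strip() == "" and not in_fence:
            blank_run += 1  # count blank lines that sit outside any code block
            continue
        if blank_run >= 2 and current:
            cells.append("\n".join(current))  # two blank lines: start a new cell
            current = []
        elif blank_run and current:
            current.extend([""] * blank_run)  # a single blank line stays in the cell
        blank_run = 0
        current.append(line)
    if current:
        cells.append("\n".join(current))
    return cells


markdown = f"Intro\n\n{FENCE}python\nx = 1\n\n\ny = 2\n{FENCE}\n\n\nOutro\n"
cells = split_cells_fence_aware(markdown)
print(len(cells))
```

As the comment itself points out, the heuristic is not airtight: a fence that is opened but never closed would swallow every following cell break.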
Sure, it should be possible to do something along these lines. Actually, rather than commenting the triple quotes with
Yeah, that would be a very nice change indeed! As it is, various markdown preview clients are quite confused by that. That being said, how would you deal with a substantial change like that from a versioning point of view? That change would prevent current jupytext-markdown files from being converted back to notebooks. Have you had changes like that in the past?
No problem. Actually, a very useful input for this would be recommendations on how to write markdown comments in a way that is compatible with pandoc and markdown viewers. We've started considering that at #66.
Yes, we have. That's the reason why we have a
Ah! Of course. Great foresight there 😃
Gotcha. I'll mull this over and will let you know what I come up with!
I noticed that, when converting ipynb to py, jupytext is able to preserve the type of a raw cell (e.g. it will note if it is a reST raw cell). However, this information is not preserved when converting to markdown. It seems like we might be able to lump this in with the markdown-delimiter effort, and thus have the markdown format preserve this during round-trip conversions as well. Thoughts?
Comments in Markdown

Markdown has a syntax in its core specification for making comments/invisible text. It looks like the most generic syntax for this is:
Note that the blank line preceding the

There is an incredibly informative thread on Stack Overflow about this. Someone tested the various syntaxes for including comments in markdown through Babelmark2, checking them against 28 markdown implementations. The analysis concluded that this syntax is the most general, and is supported by 23 of those implementations. This seems like a great path forward, imo 😄

Jupytext

Given this discussion, it seems that this is the proper direction for

(edit: it just occurred to me that you probably don't even need to delimit the end of a cell. It just goes until another begins)

E.g. a markdown cell could be delimited by:
and similarly various types of code cells and raw cells could follow suit:
What is really cool is that this will permit jupytext's markdown format to represent both python code cells and markdown cells with python code blocks in such a way that markdown viewers will render them both with syntax highlighting!
Hello @rsokl, this is very interesting! Thanks for the links. Until now we had mostly considered HTML comments for storing metadata, but sure, we can debate this, and yes, I will have a look at this SO thread! Also, I do agree that the updated Markdown format should preserve the raw cells. I am not available this week, but later in the month I will certainly try to improve the Markdown format. I suggest that we iterate over one or more tentative implementations, which you could test and provide feedback on if you'd like? Thanks!
Great! This will be a fantastic update for my use-case of

The analysis provided in the SO thread is basically everything we could ask for, i.e. which comment style is most compatible across markdown implementations. I was floored that someone had already done it. I will add this information over in issue #66.

I am happy to test and provide feedback on this content. And we may want to start small, and just take on:
Taking on generic metadata might be a bit more ambitious for a single patch, since that needs to support basically all of JSON in a markdown comment...
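To give a feel for what "all of JSON in a comment" involves, here is a rough sketch that stores cell metadata as JSON inside an HTML comment, the comment style mentioned earlier in the thread. It is an assumption made for illustration, not the format Jupytext uses; the names `metadata_to_comment` and `comment_to_metadata` are hypothetical.

```python
import json
import re

# Matches a whole line of the form <!-- {...} --> and captures the JSON object.
COMMENT_RE = re.compile(r"<!--\s*(\{.*\})\s*-->", re.DOTALL)


def metadata_to_comment(metadata):
    """Serialise a metadata dict as JSON inside an HTML comment (hypothetical format)."""
    payload = json.dumps(metadata, sort_keys=True)
    # A literal "-->" would close the comment early. In json.dumps output ">"
    # can only occur inside strings, so it is safe to use the \u003e escape.
    payload = payload.replace("-->", "--\\u003e")
    return "<!-- " + payload + " -->"


def comment_to_metadata(line):
    """Return the metadata dict stored in a comment line, or None if there is none."""
    match = COMMENT_RE.fullmatch(line.strip())
    if match is None:
        return None
    return json.loads(match.group(1))


meta = {"tags": ["hide-input"], "note": "a --> b"}
comment = metadata_to_comment(meta)
print(comment)
```

Even this small sketch needs an escaping rule for the comment terminator, which hints at why generic metadata is a larger piece of work than cell delimiters alone.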
This should be OK now with version 1.1. Please let me know otherwise.
Any markdown code-block containing multiple consecutive newlines is parsed incorrectly during conversion to a notebook. The additional newlines will become distinct notebook cells. For example:
(note that the lines of code are separated by two newline characters)
becomes:
P.S. jupytext rocks! Thanks for the awesome work!