-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Markdown renderer #162
Markdown renderer #162
Conversation
@anderskaplan, thank you for another contribution. :) I will try to check it out soon, but of course, I cannot promise anything. Hopefully we'll also get feedback from some other contributors this time. (I think it would be ideal to firstly finish all the current smaller changes (when compared to this one) and finally release version 1.0.0 and this one could go for example into 1.1.0, but we'll see...)
Yeah, I'm not a big fan of the "contrib" folder as well, I would probably vote for getting rid of it ASAP, see #101 for more details & discussion. |
hey, any updates on this? :) |
Hi @karlicoss, this branch is dependent on #172. As soon as that PR has been merged, I'll be able to prepare this one for showtime. |
2eaf420
to
cc22be5
Compare
Hi, I had no deep look into the code but I hope that someone else knows the answer: Is implementing a (live) syntax highlighter based on mistletoe possible after these changes? As far as I have seen, it is not the positions that are saved (which would be best for my specific case), but the syntax elements like And: Really thank you all for this great Markdown parser! |
Hi @geolta , it depends a lot what you mean by "(live) syntax highlighter". I suppose "live" means that it highlights while the user is typing. But what is it you want to highlight -- is it for example headings, emphasized text, and the like? And is this for a user interface written in python? If the answer is yes in both cases, then I think mistletoe should be a good match. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, thanks a lot. I have reviewed the first 2 commits for now, giving you some feedback until I find the time for another review session...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't completed the review yet, getting close slowly :P, here are another few remarks...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round complete. :P :)
c82c82d
to
5445744
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, thank you. Hopefully I will get to the rest of the review soon (this week(end)).
It's been a while since the beginning of this PR. I think that we could finish just the minor cleanups and put this long awaited feature into the very next release version, with a warning that the renderer's API is in a "preview" state and it might change in a future version. What do you think? |
Sorry for the slow responses lately @pbodnar -- It's been a very busy May. I fully agree that it would be very nice to get this into the next major release, and I'll try and resolve the remaining comments as soon as possible. |
This question should also be addressed before merging, imho:
I think moving all optional tokens into a module could be pretty neat. What do you say @pbodnar ? |
Or maybe 3 new tokens currently if I look correctly ( |
Three they are. I think they were four at some point, but the last one turned out not to be necessary. In any case, if you think it's ok that they stay where they are, then I'll just leave them there. |
There, I believe all comments are addressed now. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, thanks a lot. Just 1 or 2 smaller remarks which I think we could look at before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, thanks for another round. :)
Do you think you can prepare a final set of squashed atomic commits, ideally with commit messages as described at https://github.com/miyuchina/mistletoe/blob/master/CONTRIBUTING.md? At minimum, I would like to have all summary lines end with (#162)
, which I believe is pretty practical think. Sure, I can do this myself, but this is still kinda of your "child", so... :)
…e source Markdown in the tokens. Improved docstrings. (miyuchina#162)
1c2d346
to
61cec4a
Compare
Here we go! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, good job, thank you once again. :) If you don't mind, I will give you explicit credits in the release notes.
I have also run python test/benchmark.py mistletoe
to see performance impact of this PR. It looks like there is about 1 to 2 % performance penalty due to the extended parsing. Nothing to worry about then.
…e source Markdown in the tokens. Improved docstrings. (#162)
This is a pull request for a Markdown-to-Markdown renderer (#4).
A few notes on the implementation:
Since the aim is to preserve as much of the formatting as possible, and also steer clear of issues like if
the sequence
_**Hello**_
would be rendered as***Hello***
(which changes its interpretation),I had to save some formatting information from the parsing in the tokens. The formatting info will not appear in the output from the ASTRenderer or from repr().
The Markdown renderer can also do word wrapping. (Because I needed that for a use case.) But that's optional, and when it's not enabled, the goal is still to make the roundtrip Markdown -> parse -> render -> Markdown with as few changes as possible.
The test cases in
TestMarkdownRenderer
give a pretty good idea of what the renderer is capable of.In particular, note that it handles nested blocks: you can put a code block inside a list inside a block quote
if you like. (This is explicitly allowed by the CommonMark spec.)
Three new tokens! Well actually, only one which is completely new and that's the
BlankLine
.The other two represent link reference definitions (a.k.a. footnotes) and inherit from the
Footnote
token.These tokens are used to represent content that is otherwise not included in the AST.
An open question is whether they belong in the same module as the markdown renderer class, or if they should
be moved somewhere else. Maybe create a new module to host all optional tokens?
The markdown renderer handles span tokens and block tokens differently.
Block tokens are rendered into chunks of lines.
Span tokens are first rendered into sequences of "fragments", which are essentially strings with some additional metadata attached, and then into chunks of lines.
Tables are supported.
I put the MarkdownRenderer in the main package, not in contrib, because that's what @miyuchina wrote in a comment on Markdown-to-Markdown renderer #4.
Hope this can be useful and looking forward to some review comments. As it's a fairly large PR, I'd recommend to review it commit by commit. All changes to existing code come first, and then finally the renderer itself.