-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for #108, CommonMark 0.30 spec compliance, part 2 #165
Conversation
Great, thank you for another contribution. :) So it looks like, because spec examples 312 and 313 are already fixed in master now, there will be just 1 failing example left after the fixes provided in here, namely:
... the last challenge. ;) |
Yes, at first I thought it was caused by the strikethrough token, but that can't be. Because the code fences should already be parsed on block level before the strikethrough can kick in on span level. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, good job, I think all the fixes are correct. 👍 I believe there is room for improving the unit tests though (I'm a big fan of having also the test code simple and consistent, to some extent, of course. ;)) - see my suggestions. Thank you.
Yeah, it looks like mistletoe cannot handle the following correctly (stated above the failing Example 146):
So I believe this condition is the culprit of mistletoe's failure: if leader[0] in lang or leader[0] in line[match_obj.end():]:
return False I.e. https://github.com/miyuchina/mistletoe/blob/v0.9.0/mistletoe/block_token.py#L430. |
hey @pbodnar thanks for the review! I'll try and find some time to update the tests during next week. |
fd4a4d8
to
b160a88
Compare
all done! the last failing test case will be in a PR of its own. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anderskaplan, thanks for incorporating all the suggestions, I think it really is all ready for merge now. :)
Yet, would you mind doing a final interactive rebase and squashing?
- I think it is fine to keep the "fix example X" commits separated. Also b160a88 could probably stand on its own.
- I would just remove
#
before the example numbers because of the GitHub automatic issue links and add#108
to the end of the summary line. - The commits on fine-tuning unit tests could be either squashed into their corresponding "feature commits" (too much work though?), or squashed into one?
b160a88
to
488ad8b
Compare
Updated the commit messages and squashed the unit test updates according to review comments. Should be good to go now! |
(note: after looking in the commits history, I have added parentheses around the number) @anderskaplan, maybe you have overlooked this part (which groups the commits together)? Apart from that, the commits seem perfect now. :) |
…nation may contain spaces if it is enclosed in pointy brackets. (miyuchina#108) Solved by allowing spaces in core_tokens.match_link_dest(). Also fixed a problem in core_tokens.is_link_label() where the return value would be malformed if the root element isn't set. This does not happen under normal circumstances, but it caused unit tests to fail.
…tion may contain spaces if it is enclosed in pointy brackets. (miyuchina#108) Solved by allowing spaces in Footnote.match_link_dest().
…sis delimiter lengths being multiples of 3. (miyuchina#108) Solved by adding a missing condition in Delimiter.closed_by(). Also refactored the function to make it more readable.
…blocks may contain blank lines. (miyuchina#108) Solved by adding textarea to the set of HTML tags which allow newlines.
…pace in HTML tags. (miyuchina#108) The problem was that the regex for unquoted attribute values did not break at newlines. Therefore, in this example, the attribute value "baz\nbim!bop" was accepted when it wasn't supposed to. Solved by modifying the regex to break at any whitespace character, not only space.
…tities. (miyuchina#108) The problem was that the length of html entities like � was not checked, leading to the generation of invalid output. The solution was to modify the regex used to match charrefs to include limits on the lengths (as proposed by @pbodnar). The regex workaround was also moved from the HTMLRenderer class to span_token.tokenize() to make it available to all renderers.
…, and inline link parsing. Added positive test for valid html entities.
488ad8b
to
7866aca
Compare
Thank you! |
A set of small fixes which improve the CommonMark spec compliance. See the commit messages for details.