-
Notifications
You must be signed in to change notification settings - Fork 600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fenced code block parsing is not CommonMark compliant #410
Comments
Quick question -- if I were going to do a PR to fix this, should I submit it against v1 or v2? I notice that both v1 and v2 test suites lack coverage for the case of a multi-word opening fence, so in principle at least, it isn't a backward-incompatible change. 😄 |
v2. v1 is only kept for backward compatibility, all new stuff should go to v2. CommonMark compliance is still nowhere near halfway started (I only added some tests on this branch). I was thinking to reap some low-hanging whitespace incompatibility fruits on rendering side, then introduce a flag for CommonMark compatibility and start working on parse-side incompatibilities like this one. So yeah, if you're interested in getting started with it, you're very welcome :-). Submit a PR against v2 and let's hash out the details there. (Or here, if you feel like there's more to discuss before diving into code). |
My main question is whether this should be considered a bug fix -- i.e., safe to introduce with or without CommonMark, since there isn't any sane markdown source I can think of where the current fence rejection is a feature. If that's the case, the fix would be pretty easy and backportable to v1 as well. (Gitea uses v1 AFAICT, and they're my reason for investigating this in the first place.) The specifics of the fix would be to add tests for the desired behavior, and change this chunk of code to just skip to the end of the line instead of failing when non-whitespace is detected. But yeah, what I'm proposing is to make those changes, without putting them behind a CommonMark flag or the test suite, just making it forward compatible. And I'd also ideally like to backport the bug fix to v1 so that Gitea can benefit. Thoughts? |
According to the common mark standard, code fence info strings can be anything, not just single words. Update the tests and parser accordingly. The formatter already expected an info string with a language and HTML classes, so this does not need to change. Update the LaTeX formatter to take the first word of the info string as the language. Fixes russross#410 (in v1).
This is a pretty important feature to me. I'm migrating from Jekyll to Hugo and the code fence plugin for Jekyll accepted longer info strings, so I have to go through my posts and delete lots of information. |
According to common mark, the info string for a fenced code block can be any non-whitespace string, so adjust the code to read a full string instead of just the syntax name. Fixes russross#410 in v2.
According to common mark, the info string for a fenced code block can be any non-whitespace string, so adjust the code to read a full string instead of just the syntax name. Fixes russross#410 in v2.
* Accept info strings in code fences According to the common mark standard, code fence info strings can be anything, not just single words. Update the tests and parser accordingly. The formatter already expected an info string with a language and HTML classes, so this does not need to change. Update the LaTeX formatter to take the first word of the info string as the language. Fixes #410 (in v1). * Don't output whole info string as code classes This follows the common mark specification. * run go fmt
* Add full info string in fenced code blocks According to common mark, the info string for a fenced code block can be any non-whitespace string, so adjust the code to read a full string instead of just the syntax name. Fixes #410 in v2. * run go fmt
* Accept info strings in code fences According to the common mark standard, code fence info strings can be anything, not just single words. Update the tests and parser accordingly. The formatter already expected an info string with a language and HTML classes, so this does not need to change. Update the LaTeX formatter to take the first word of the info string as the language. Fixes russross#410 (in v1). * Don't output whole info string as code classes This follows the common mark specification. * run go fmt
blackfriday rejects opening code fences that contain more than one word, unless the words are enclosed in
{ }
. But CommonMark explicitly allows multiple words in opening fences, as can be seen at e.g. this example in the CommonMark spec. (The only character not allowed in a code fence "info string" is a backquote.)See also this related issue on gitea, and sample renderings via Gitea and Github of a markdown document containing such multi-word opening fences. Gitea's rendering breaks because blackfriday does not recognize them as fences, while Github's rendering uses the first word to select the language for highlighting (as shown in the example from the CommonMark spec).
As far as I can tell, the issue applies to both v1 and v2.
The text was updated successfully, but these errors were encountered: