-
Notifications
You must be signed in to change notification settings - Fork 528
Add CSS classes to fenced code blocks #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ine callback regex seemed to not be working, so I removed the ^ character from before \n.
By the way, all you have to do if you want to update the pull request is to put your changes in the same branch. But there's no harm in creating a new pull request if you want. Thanks for your work. I'll definitely merge this once it's ready. |
What is the expected fenced codeblock going to look like? Will it be something like:
or
... or will it support both? |
@simensen only |
@michelf Depends on what you find to be a good reason. :) GFM actually support both with and without the I know that it won't make sense to mirror the GFM, sundown, redcarpet, etc. parsers feature by feature, but if it is possible to make the leading Are there good reasons that the |
Making the |
I don't really see a good reason. But I like the idea of the code block using whatever you pass it as the class name, instead of other versions where it only allows the language. Because, for example, Google's Prettify uses the class "prettyprint lang-php" to work. So this way you can specify that easily. |
I hit submit too soon. That's a good reason I think. Maybe we could still make it optional and if no |
Great discussion. I can see how it would be nice to be able to support something like If it is possible to make the In any event, I'm excited to see this support coming here. :) |
Just pushed some changes that allows both. You can specify IDs, classes, and any other attributes you like. Examples
|
@coreyworrell very cool. :) |
I would like to point out that the HTML5 spec provides a suggestion (not a requirement) for how language is to be indicated in both inline code and code blocks. I note a few things you are doing that do not follow that suggestion here: You are putting the classes, etc on the pre tag, not the code tag. If you think about it, the language applies to the "code" and not all pre tags necessarily have code in them, so that makes sense. I realize many (most) JavaScript libs expect the language to be set on the pre tag but IMO those libs are wrong and they should change to match the HTML5 spec's suggestion. It seems to me like Markdown should do the right thing here from the start. The second is that the language should be indicated as Thirdly, if you are going to support more than a dot for classes and # for ids, then I would suggest following Maruku's Attribute List feature which is already supported by multiple (I'm aware of three) markdown implementations. That's how I'm intending to implement upcoming improvements to Python-Markdown. |
@waylan good suggestions. I agree with them, but it's the Javascript libraries, like you said, that almost always use classes on the Maruku's attribute list feature looks interesting, and might fit better with Markdown, but using the standard CSS selectors is also very common. We'll have to get opinions on this. |
I'm still on the fence with all this. ;-) Here are some thoughts... Regarding the Github flavor of fenced code blocks incorporating the language name, they aren't using the same syntax at all, and I don't see the point in trying to accommodate its syntax only in half. Is there any other Markdown variant implementing fenced code block using I know that Markdown usually try to let people express the same thing in different ways. This is why there are two emphasis markers and in Extra two code block syntaxes. But duplicating features should be done only where there is a strong justification for doing so. I don't think it's a good idea to make Neither do I think it is a good idea to automatically translate Where to put the attribute. Finally when was the last time anyone here had to put an attribute other than |
Michel, I can't argue much with the points that you make except to say that my personal opinion differs on a few. The better styling hook verses more semantically correct issue is something I've gone back and forth on, but when I saw the suggestion in the HTML5 spec, that convinced me. I realize it won't convince everyone and it seems to have not affected any of the JavaScript lib authors. Personally, I find it appalling the number of different (and sometimes non-validating) ways that JS lib authors have implemented language labeling. If there was any chance that a standard defined by markdown could force their hand, I'd be all for it, but I highly doubt that would work. Regarding other implementations, Python-Markdown has a working implementation (which only currently works on document root) and the dot is optional (not documented), but does nothing to change the behavior. Just the other day I accepted a pull request which also adds Github's syntax as an alternative (still need to document that). While we support Maruku's Attribute List syntax elsewhere, that does not currently work on fenced code blocks although we optionally allow the curly brackets which is consistent with that syntax. Speaking of Github's syntax, I protested when they introduced their new syntax, and they responded by implementing (but not documenting) the existing syntax and saying they didn't care what Markdown's syntax actually was. I don't think they even knew prior art existed before their implementation. |
It is intuitive that Also, Highlight.js is one of the best code highlighters available and uses Perhaps the middle ground is to have an optional setting like: @define( 'CSS_CLASS_PREFIX', 'language-' ); // default is null Also it seems correct to have the class on the @define( 'CSS_CLASS_TAG', 'pre' ); // default is 'code' |
I think @will123195's proposal is a nice compromise. I also prefer Highlight.js and that is what I use personally. In fact to my knowledge it is the only such tool which allows you to pass in a DOM element as a code block, so you can use any type of markup you want. Of course, this same tool is the one that has the most sensible default of all of them which negates the need for that feature in the first place. You do have to use their method of defining the language used, although it is a much less objectionable method than many. I will also note that while I would prefer using Regarding the option to set the language on either the Regarding @michelf's question regarding the need to support other attributes beside |
It seems that Github supports these forms:
If I understand well, this is what is supported in Python Markdown too (although I'm not sure about the id). I think we should standardize on those. Also, if we allow multiple class names and I wouldn't support arbitrary attributes. Supporting arcane highlighting libraries using custom attributes should not be encouraged, so I'd leave those out. As for adding the "language-" prefix, I just wouldn't. I think giving As for
|
@michelf I wasn't sure about the block hashing and what needs to be changed there? Is that only for using fenced code blocks inside HTML elements? |
The HTML block parser hashes everything that looks like a non-indented HTML block so HTML blocks don't get processed by the regexes that follows. Indented HTML blocks are standard Markdown code blocks so they are ignored by the HTML block parser. It also does the same for fenced code blocks, so it must detect where those blocks begin and ends. So there's in reality two fenced code block parsers: the first to run is inside the HTML block parser which detects fenced code blocks and leaves the block and its content intact for the second "real" fenced code block parser. I recognize it's a little silly, but this is how things have evolved. Ideally we'd parse all block-level constructs in one pass similarly to how it is done in the HTML block parser. It'd help fix a few issues like this one where both fenced code block parsers are a little out of sync. Anyway, the HTML block parser since it needs to handle HTML tags, indented code blocks, and fenced code blocks -- and to some extent code spans -- all together works using a tokenizing approach. There's a giant regex for going to the next "token" (the beginning of any of those) and a couple of "ifs" following to check what the token is and act appropriately. What you need is to change the regex for the opening fence (in the giant regex) and the closing fence (in the token handler code). |
Ok thanks, that makes sense. Can you give me some example code and the expected output to test on? I am running the current code on my dev site, and it seems to be working fine right now, even w/out changing those two regexs. Here's the markdown I'm using and the HTML output:
And that correctly prints out:
|
@coreyworrell Sorry for taking so long to answer. Regarding potential problems with the HTML block parser, you should test with this:
... and other similar variants you can come with. I expect it'll give strange results. |
Thanks. I tested against that, and all I had to change was the fenced code block opening marker regex for it to work. But it still fails against the following (which I think should all work?), these also don't work with the current version of php-markdown extra (I removed the class names from the fence block of course before testing):
You can see the test output here: http://dev.coreyworrell.com/first-article |
@coreyworrell You found another bug, but I'd say it's a separate issue. So I think this patch is ready to be merged. I'll do some tests of my own and merge it some time before next release. Good work! |
@michelf After reading all the comments above, I am a tad-bit confused, and would love to have some clarification -- has this syntax been finalized yet? Are there any known bugs? Also, please check if I am using the right syntax in the example below. Syntax for Adding Classes, IDs and Custom Attributes to code blocks
Gives this:
Considering that I am using PHP Markdown Extra 1.2.5 provided here, is that correct, or am I missing something? |
@geekpanth3r This pull request has not been merged yet. I looked at the patch and I'm confident it is good, but I want to test it myself before merging it in the official version, and I hadn't had time for that yet. My goal is to do a new release this summer that'll include those changes. The idea currently is to support id and class, but not arbitrary attributes. |
@michelf I am using SyntaxHighlighter by Alex Gorbatchev, and it requires that I add a class to it (to define the language). As of now, is there a standard way for me to do this? I mean, adding class to a pre tag. If I use direct HTML, I should escape the code, which isn't a good idea (it can break, not standard). |
@geekpanth3r No, not as of now. You can use my fork of the repo, or wait until this pull request is merged. Or use googles prettyprint, which auto-detects code block languages, thus not requiring any classes. |
@coreyworrell Are you aware of any issues with your fork? And even if there are, do you believe it won't break things when @michelf fixes them this summer? I would love to use it!! :D
I have tried that, and am currently using an even better solution (Highlight.js). Both are great, but they are not able to recognize "small" code blocks, and large code blocks with multiple languages, well. I will still continue to use Highlight.js though, but want to define the languages myself. :) |
Everything I tested works fine with the fenced code blocks. There is some interesting behavior when including code blocks inside of |
@coreyworrell and @michelf : I am using Corey's forked version of PHP Markdown Extra, and looks like there's a glitch. When I put this fenced code block in my post, nothing shows up, just a blank container:
Shows up like this in the front-end (yeah, no code, just the empty code container: Am I doing something wrong, or is this a bug? UPDATE: The source code of my page shows this -- clearly, it's being parsed as a comment, no idea why. PS: This appears to happen even with the simple fence (~~~) without the class; but it's fine when I indent it (4 spaces). |
"clearly, it's being parsed as a comment, no idea why" Looks like the HTML block parser doesn't see the code block in this case and treats the content as if it was an HTML block (which is the correct thing to do when outside a code block). @geekpanth3r If you remove the ".php" after the opening fence, does the same thing happen? |
@michelf So yeah, like I said earlier, this happens even without the |
@geekpanth3r would you mind trying the same code (without the |
@coreyworrell Oops! Looks like the issue is NOT with PHP Markdown Extra, but your fork. Yes, looks like you need to fix the HTML block parser. |
Is this going to make it in by chance? |
Ok, so I just merged this. While merging, I did some refactoring:
Now its time to check all the issues that have been reported for that fork to see if they still apply. Thank you for this @coreyworrell. Even though I ended up changing a few things, its pointed me at everything that needed to be changed, it's a very clean patch. |
Awesome. I know the edits I made didn't work quite right, but I'm glad you were able to use them to build off of! |
Not sure about the newline callback, but it seemed to not be adding the necessary
tag.