[Markdown] Refactor fenced code blocks #4430
Merged
deathaxe merged 2 commits into sublimehq:master on Feb 2, 2026
This commit...
1. uses two layers of `embed` to
a) consume fenced code block punctuation in dedicated patterns of a single
embed/escape statement.
b) lazy load embedded syntax definitions on demand, so that the context
sanity limit is not exceeded.
As a result, a single Oniguruma fallback context is injected, and only
by the top-level embed/escape, which should prevent stack overflows
and similar trouble in some circumstances.
It significantly reduces pattern redundancy when adding new syntax support,
as new blocks don't need to deal with punctuation-related details.
Patterns for syntax-highlighted code blocks immediately start matching
language names and don't need to provide individual escape patterns.
2. merges all included `fenced-...` contexts into `fenced-code-block-body`
to reduce syntax cache size and to create a new set of contexts strictly
separated from the previous structure. So if a syntax extends this Markdown
definition to inject fenced code blocks, only that injection will fail,
without breaking the whole syntax definition.
Contexts are merged because language name patterns are not expected to
require overrides. Content can be replaced by overriding the new
`fenced-code-block-...-content` contexts.
3. removes unnecessary capture groups (leading whitespace) from the
`fenced_code_block_start` pattern.
4. fixes patterns that did not allow backticks in info strings of
syntax-highlighted fenced code blocks with tildes.
5. enables arbitrary syntax highlighting in info strings, which is currently
used for pandoc style attributes.
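Item 4 reflects the CommonMark rule that an info string after a backtick fence may not contain backticks, while an info string after a tilde fence may. The following is a minimal Python sketch of that rule only, not the patterns from this commit; `FENCE_OPEN` and `parse_fence` are hypothetical names, and the regex is a simplification that ignores closing fences and container blocks.

```python
import re

# Hypothetical, simplified opening-fence matcher (not the Markdown.sublime-syntax pattern).
# Up to 3 spaces of indentation, then a run of 3+ backticks or 3+ tildes, then the info string.
FENCE_OPEN = re.compile(
    r"""^[ ]{0,3}
        (?:
            (?P<backticks>`{3,})(?P<info_b>[^`\n]*)   # backtick fence: no backticks in info string
          |
            (?P<tildes>~{3,})(?P<info_t>[^\n]*)       # tilde fence: backticks are allowed
        )$""",
    re.VERBOSE,
)


def parse_fence(line):
    """Return (fence, info_string) for an opening fence line, or None if it is not one."""
    m = FENCE_OPEN.match(line)
    if m is None:
        return None
    if m.group("backticks"):
        return m.group("backticks"), m.group("info_b").strip()
    return m.group("tildes"), m.group("info_t").strip()
```

Under these assumptions, a tilde fence followed by an info string containing backticks is accepted, whereas the same info string after a backtick fence is rejected and the line is not treated as an opening fence.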
Note: This is a breaking change for 3rd-party syntax definitions that
extend core Markdown to add more fenced code block syntaxes.
Overall syntax cache size is reduced by about 60 kB.
Parsing performance, benchmarked against the syntax test file, is unchanged.
keith-hall previously approved these changes on Feb 1, 2026
deathaxe added a commit to SublimeText/CoffeeScript that referenced this pull request on Feb 1, 2026
deathaxe added a commit to SublimeText/Astro that referenced this pull request on Feb 1, 2026
deathaxe added a commit to SublimeText/CoffeeScript that referenced this pull request on Feb 1, 2026
... for template languages
keith-hall approved these changes on Feb 2, 2026
michaelblyons approved these changes on Feb 2, 2026
jrappen reviewed on Feb 2, 2026
This PR...
1. uses two layers of `embed` to
a) consume fenced code block punctuation in dedicated patterns of a single embed/escape statement.
b) lazy load embedded syntax definitions on demand, so that the context sanity limit is not exceeded.
As a result, a single Oniguruma fallback context is injected, and only by the top-level embed/escape, which should prevent stack overflows and similar trouble in some circumstances.
It significantly reduces pattern redundancy when adding new syntax support, as new blocks don't need to deal with punctuation-related details.
Patterns for syntax-highlighted code blocks immediately start matching language names and don't need to provide individual escape patterns.
This works around sublime_text#6823 ("opening an .md file having a line with 106 tildes or more causes Sublime Text 4 (build 4200) to quit").
2. merges all included `fenced-...` contexts into `fenced-code-block-body` to reduce syntax cache size and to create a new set of contexts strictly separated from the previous structure. So if a syntax extends this Markdown definition to inject fenced code blocks, only that injection will fail, without breaking the whole syntax definition.
Contexts are merged because language name patterns are not expected to require overrides. Content can be replaced by overriding the new `fenced-code-block-...-content` contexts.
3. removes unnecessary capture groups (leading whitespace) from the `fenced_code_block_start` pattern.
4. fixes patterns that did not allow backticks in info strings of syntax-highlighted fenced code blocks with tildes.
5. enables arbitrary syntax highlighting in info strings, which is currently used for pandoc style attributes.
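To make "pandoc style attributes" in info strings concrete: Pandoc allows a fenced code block to carry an attribute list such as `{#mycode .haskell .numberLines startFrom="100"}`. The sketch below splits such a list into an identifier, classes, and key/value pairs. It is a rough illustration under simplifying assumptions, not Pandoc's actual grammar and not how this PR scopes attributes; `ATTR_TOKEN` and `parse_attributes` are hypothetical names.

```python
import re

# Hypothetical tokenizer for a pandoc-style attribute list (simplified).
ATTR_TOKEN = re.compile(
    r"""\#(?P<id>[^\s#.=}]+)                                  # identifier, written #name
      | \.(?P<cls>[^\s#.=}]+)                                 # class, written .name
      | (?P<key>[^\s#.=}]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>[^\s}]+))   # key=value
    """,
    re.VERBOSE,
)


def parse_attributes(info):
    """Parse an info string of the form {...}; return None if it is not brace-delimited."""
    info = info.strip()
    if not (info.startswith("{") and info.endswith("}")):
        return None
    result = {"id": None, "classes": [], "keyvals": {}}
    for m in ATTR_TOKEN.finditer(info[1:-1]):
        if m.group("id"):
            result["id"] = m.group("id")
        elif m.group("cls"):
            result["classes"].append(m.group("cls"))
        else:
            # Prefer the quoted form; fall back to the unquoted value.
            value = m.group("quoted")
            if value is None:
                value = m.group("bare")
            result["keyvals"][m.group("key")] = value
    return result
```

A plain language name such as `python` is not brace-delimited, so this helper returns `None` for it; only the attribute form is handled here.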
Notes:
This is a breaking change for 3rd-party syntax definitions that extend core Markdown to add more fenced code block syntaxes.
Known packages are:
Overall syntax cache size is reduced by about 60 kB.
Parsing performance, benchmarked against the syntax test file, is unchanged.