Comparing changes
base repository: executablebooks/markdown-it-py
base: v4.0.0
head repository: executablebooks/markdown-it-py
compare: v4.2.0
- 10 commits
- 34 files changed
- 6 contributors
Commits on Dec 24, 2025
Add --stdin option to CLI for reading Markdown from standard input (#379)
Commit 49043e4
Commits on Feb 18, 2026
🔧 Add AGENTS.md and copilot-setup-steps workflow (#380)

* Initial plan

* 📚 DOCS: Add AGENTS.md and copilot-setup-steps.yml workflow

  Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>

* 👌 IMPROVE: Add tox-uv note to AGENTS.md

  Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
Commit 2f6ae10
🔧 Add typing to Scanner (#382)
Changes `state_inline.Scanner` into a `typing.NamedTuple` instead of a `collections.namedtuple`. They should be functionally equivalent except that the former has typing for its members while the latter does not.
Commit 8933147
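The difference described in #382 can be illustrated with a minimal sketch. The class names and the field names (`can_open`, `can_close`, `length`) are assumptions chosen for illustration and are not taken from the commit; the point is only that both forms behave as the same tuple at runtime, while the `typing.NamedTuple` form also carries field annotations.

```python
from collections import namedtuple
from typing import NamedTuple, get_type_hints

# Untyped form: the fields exist but carry no type information.
# (Hypothetical name and fields, for illustration only.)
UntypedScanner = namedtuple("UntypedScanner", ["can_open", "can_close", "length"])


class TypedScanner(NamedTuple):
    """Typed form: same tuple behaviour, plus annotations for type checkers."""

    can_open: bool
    can_close: bool
    length: int


untyped = UntypedScanner(True, False, 2)
typed = TypedScanner(True, False, 2)

# Both are plain tuples at runtime, so comparison and unpacking agree.
same_at_runtime = tuple(untyped) == tuple(typed)
can_open, can_close, length = typed

# Only the typing.NamedTuple version exposes field annotations.
typed_hints = get_type_hints(TypedScanner)
untyped_hints = get_type_hints(UntypedScanner)
```

A static type checker can use the annotations on the typed form to flag, for example, a string passed as `length`; the untyped form gives it nothing to check against.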
Commits on May 6, 2026
👌 Fix quadratic complexity in `fragments_join`/`text_join` (#389)

Optimize adjacent-token joining in both inline cleanup stages by replacing repeated pairwise string concatenation with a single `"".join(...)` over each contiguous run.

## Details

- `fragments_join` merges adjacent `text` tokens left behind after emphasis/strikethrough post-processing and recalculates token levels
- `text_join` converts `text_special` tokens to `text` and performs the final adjacent-text merge in the inline token stream

Both rules previously rebuilt growing strings incrementally, which can become quadratic for long runs.

## Why

Tested on an adversarial ~190 KB document with ~30k intraword underscores on a single line. With `tracemalloc` running:

|        | render time | peak Python alloc |
|--------|-------------|-------------------|
| before | 2.2s        | 4476 MB           |
| after  | 0.6s        | 23 MB             |

It's not just a contrived attack input - this kind of thing also shows up naturally in Markdown produced by OCR pipelines, where tables of identifiers / references can easily contain very long runs of underscores or other delimiter characters.

## Tests

Added focused tests for both rules:

- `fragments_join`: verifies raw adjacent text fragments remain when both join stages are disabled, and that `fragments_join` alone collapses them when `text_join` is disabled
- `text_join`: verifies escaped characters remain as multiple `text_special` tokens when `text_join` is disabled, and are converted and merged into a single `text` token when enabled

## Result

No behavioral change in parser output, with less unnecessary work when joining long runs of adjacent tokens.

---------

Co-authored-by: Chris Sewell <chrisj_sewell@hotmail.com>
Commit d4ea0ca
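The joining strategy described in #389 can be sketched as below. This is not the library's code: `Tok` is a hypothetical stand-in for an inline token and `join_adjacent_text` is an illustrative helper. It shows the general idea of locating each contiguous run of `text` tokens and merging it with one `"".join(...)` call instead of growing a string pairwise.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical stand-in for an inline token; not the library's Token class."""

    type: str
    content: str = ""


def join_adjacent_text(tokens: list[Tok]) -> list[Tok]:
    """Merge each contiguous run of `text` tokens with a single str.join call."""
    out: list[Tok] = []
    i = 0
    while i < len(tokens):
        if tokens[i].type != "text":
            out.append(tokens[i])
            i += 1
            continue
        # Find the end of the run of adjacent text tokens.
        j = i
        while j < len(tokens) and tokens[j].type == "text":
            j += 1
        # One join per run: the work is linear in the run's total length.
        out.append(Tok("text", "".join(t.content for t in tokens[i:j])))
        i = j
    return out


merged = join_adjacent_text(
    [Tok("text", "a"), Tok("text", "_"), Tok("text", "b"), Tok("em_open"), Tok("text", "c")]
)
```

Repeated `s = s + piece` can copy the accumulated string on every step, which is where the quadratic cost for long runs comes from; collecting the pieces and joining once avoids that.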
✨ Allow plugins to register inline terminator characters (#391)

The inline `text` rule used a hardcoded, unexpandable set of terminator characters, forcing plugins that need to trigger on non-terminator characters (e.g. `w` for GFM `www.` autolinks) to resort to core-rule post-processing workarounds.

## Changes

- **`parser_inline.py`**: Moves the terminator set onto `ParserInline` as `_terminator_chars` (a `set[str]` seeded from `_DEFAULT_TERMINATORS`) with a pre-compiled `terminator_re: re.Pattern[str]` attribute. Exposes `add_terminator_char(ch)` to extend the set; the regex is rebuilt eagerly only when a genuinely new character is added, keeping zero per-call overhead in the hot path.
- **`rules_inline/text.py`**: Drops the module-level `_TerminatorChars` set and `@functools.cache`-decorated factory. The `text` rule now reads `state.md.inline.terminator_re` directly.
- **`docs/contributing.md`**: Updates the "Why is my inline rule not executed?" FAQ to document the new API.

## Usage

```python
def gfm_autolink_plugin(md: MarkdownIt) -> None:
    md.inline.add_terminator_char("w")
    md.inline.ruler.push("gfm_autolink_www", _www_rule)
```

Fully backward-compatible; the default terminator set is unchanged.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
Co-authored-by: Chris Sewell <chrisj_sewell@hotmail.com>
Commit df6fd36
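The "rebuild the regex only when a new character is added" behaviour from #391 can be sketched in isolation. Everything here is an assumption made for illustration: `TerminatorSet`, `next_terminator`, and the small `ASSUMED_TERMINATORS` subset are hypothetical and are not the library's `ParserInline` or its real default set. Only the method and attribute names `add_terminator_char` and `terminator_re` mirror the commit message.

```python
import re

# Assumed small subset of terminator characters, for illustration only.
ASSUMED_TERMINATORS = {"\n", "*", "_", "`", "[", "]", "\\"}


class TerminatorSet:
    """Hypothetical sketch: a character set with an eagerly rebuilt regex."""

    def __init__(self) -> None:
        self._chars = set(ASSUMED_TERMINATORS)
        self.terminator_re = self._compile()

    def _compile(self) -> "re.Pattern[str]":
        # Escape every character so "[", "]" and "\\" are safe inside the class.
        body = "".join(re.escape(c) for c in sorted(self._chars))
        return re.compile("[" + body + "]")

    def add_terminator_char(self, ch: str) -> None:
        if len(ch) != 1:
            raise ValueError("expected a single character")
        # Rebuild only for a genuinely new character; otherwise do nothing.
        if ch not in self._chars:
            self._chars.add(ch)
            self.terminator_re = self._compile()


def next_terminator(ts: TerminatorSet, src: str, pos: int) -> int:
    """Index of the next terminator at or after `pos`, or len(src) if none."""
    m = ts.terminator_re.search(src, pos)
    return m.start() if m else len(src)
```

With this shape, a text-scanning rule pays only for one pre-compiled `search` per call, and registering `"w"` makes the scan stop at the first `w` so that a later rule gets a chance to run there.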
✨ Add `gfm-like2` preset with task lists, alerts, and single-tilde strikethrough (#388)

### Summary

Adds a new `gfm-like2` preset that extends `gfm-like` with three GFM features:

- **Task lists**: `- [x] done` / `- [ ] todo` checkbox syntax in list items
- **Alerts**: `> [!NOTE]`, `> [!TIP]`, `> [!WARNING]`, etc. inside blockquotes
- **Single-tilde strikethrough**: `~text~` in addition to `~~text~~`

These are enabled via the `gfm-like2` preset or individually through the `tasklists`, `alerts`, and `strikethrough_single_tilde` options. The existing `gfm-like` preset is unchanged, so as to remain back-compatible.

### Why in markdown-it-py, not mdit-py-plugins?

Task lists and alerts are implemented by integrating detection directly into the existing block-level parsers (list.py and blockquote.py), rather than as post-processing rules:

- Checkbox detection happens during list item parsing, before the sub-parser runs on the item content
- Alert detection happens during blockquote parsing, before the inner content is tokenized

This design is not achievable from a plugin: plugins can only add new rules or post-process the token stream; they cannot modify the internals of `list_block()` or `blockquote()` to inject detection at the right point in the parsing pipeline. Implementing these as post-processing core rules would work functionally, but it means re-walking and mutating the token stream after the fact, which is less clean and less consistent with how the block parsers are designed to work.

Single-tilde strikethrough similarly extends the existing strikethrough rule's matching logic (opener/closer width matching), which is more naturally done inside the rule than bolted on externally.

### Changes

- **`rules_block/list.py`**: Detect `[ ]`/`[x]`/`[X]` at content start during list item parsing; set `token.meta["checked"]`; advance `bMarks` past the checkbox; add CSS classes (`task-list-item`, `contains-task-list`) after the list loop
- **`rules_block/blockquote.py`**: Detect `[!TYPE]` on the first content line; emit `alert_open`/`alert_close` + title tokens instead of `blockquote_open`/`blockquote_close`; skip the marker line during tokenization
- **`rules_inline/strikethrough.py`**: When `strikethrough_single_tilde` is enabled, accept 1 or 2 tildes (reject 3+); enforce opener/closer width matching in `_postProcess`
- **renderer.py**: Add `list_item_open` render method that injects checkbox HTML when `meta["checked"]` is present
- **`presets/__init__.py`**: Add `gfm_like2` preset class
- **`main.py`**: Register `gfm-like2` in `_PRESETS`
- **utils.py**: Add `tasklists`, `alerts`, `strikethrough_single_tilde` keys to `OptionsType`
- **pyproject.toml**: Add `pytest-timeout` to test deps; set 10s default timeout
- **Test fixtures**: 11 tasklist cases, 15 alert cases, 13 single-tilde strikethrough cases

### Usage

```python
from markdown_it import MarkdownIt

md = MarkdownIt("gfm-like2")
md.render("- [x] done\n- [ ] todo")
md.render("> [!NOTE]\n> This is a note.")
md.render("~strikethrough~")
```
Commit 693bb24
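The two detection steps named in #388 (a checkbox at the start of list item content, and a `[!TYPE]` marker on the first blockquote line) can be sketched as standalone helpers. These are not the functions in `list.py` or `blockquote.py`: `detect_checkbox`, `detect_alert`, and both regular expressions are hypothetical, and details such as requiring whitespace after the checkbox or accepting any alphabetic alert type are assumptions.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "[ ]", "[x]" or "[X]" followed by whitespace or end of line.
_CHECKBOX_RE = re.compile(r"\[([ xX])\](?=\s|$)")
# Assumed pattern: "[!TYPE]" alone on the line, TYPE being letters only.
_ALERT_RE = re.compile(r"\[!([A-Za-z]+)\]\s*$")


def detect_checkbox(line: str, content_start: int) -> Tuple[Optional[bool], int]:
    """Return (checked, new_start); checked is None when no checkbox is found.

    `new_start` points just past the checkbox, mirroring the idea of advancing
    the line's start offset so the item content is parsed without it.
    """
    m = _CHECKBOX_RE.match(line, content_start)
    if m is None:
        return None, content_start
    return m.group(1) != " ", m.end()


def detect_alert(first_line: str) -> Optional[str]:
    """Return the alert type from a first content line, or None if absent."""
    m = _ALERT_RE.match(first_line.strip())
    return m.group(1) if m else None


checked, start = detect_checkbox("- [x] done", 2)
alert_type = detect_alert("[!NOTE]")
```

Running detection before the item or blockquote content is tokenized is what lets the marker be skipped cleanly, rather than being removed from already-built tokens afterwards.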
Commit 8951f26
Commit 3b4ff6d
Commits on May 7, 2026
✨ Add `make_fence_rule()` factory for configurable fence markers (#394)

Adds a `make_fence_rule()` factory function that generates fence parsing rules with configurable options, enabling reuse of the fence logic for custom marker characters (e.g. `:` for colon fences) without code duplication.

## Motivation

The `colon_fence` plugin in `mdit-py-plugins` duplicates nearly all of the fence parsing logic, differing only in the marker character (`:` vs `` ` ``/`~`), token type, and info-string restrictions. This makes maintenance harder and prevents features like exact-match closing from being shared.

## Changes

- **New `make_fence_rule()` factory** in fence.py with parameters:
  - `markers`: tuple of valid fence marker characters (default ``("~", "`")``)
  - `token_type`: token type name to emit (default `"fence"`)
  - `exact_match`: when `True`, closing fence must have **exactly** the same marker count as the opener (default `False`). This enables nested fences (e.g. `::::` wrapping `:::`).
  - `disallow_marker_in_info`: marker characters that reject the fence if found in the info string (default ``("`",)`` per CommonMark)
  - `min_markers`: minimum marker count to form a fence (default `3`)
- **`fence` remains a module-level export**: `fence = make_fence_rule()`, fully backwards compatible, with identical behavior to the previous implementation.
- **Exported from `markdown_it.rules_block`**: added `make_fence_rule` to `__all__`.

## Usage

```python
from markdown_it import MarkdownIt
from markdown_it.rules_block.fence import make_fence_rule

# Colon fence (replaces mdit-py-plugins/colon_fence duplicated logic)
md = MarkdownIt()
md.block.ruler.before(
    "fence",
    "colon_fence",
    make_fence_rule(markers=(":",), token_type="colon_fence", disallow_marker_in_info=()),
)

# Override standard fence with exact-match closing (for MyST nested directives)
md.block.ruler.at("fence", make_fence_rule(exact_match=True))

# Combine: all markers with exact matching
md.block.ruler.at("fence", make_fence_rule(markers=("~", "`", ":"), exact_match=True))
```

## Performance

No regression by design: configuration is captured in the closure at rule-creation time (no runtime dict lookups or extra branches in the hot path). The `closing_matcher` callable is resolved once at factory time.

## Tests

21 new tests covering:

- Colon-fence-like marker behavior
- Exact-match closing (nesting patterns)
- `ruler.at()` override of standard fence
- `min_markers` customization
- `disallow_marker_in_info` variations

Full existing test suite (981 tests) passes unchanged.
Commit 96cf077
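The effect of `exact_match` on fence closing, as described in #394, can be sketched with a small standalone check. This is not the code in fence.py: `count_run` and `closes_fence` are hypothetical helpers, and trimming all surrounding whitespace with `strip()` is a simplification of the real indentation rules.

```python
def count_run(line: str, marker: str) -> int:
    """Length of the leading run of `marker` in `line`."""
    n = 0
    while n < len(line) and line[n] == marker:
        n += 1
    return n


def closes_fence(line: str, marker: str, open_len: int, exact_match: bool = False) -> bool:
    """Decide whether `line` closes a fence opened with `open_len` markers.

    Simplified sketch: indentation limits are ignored.
    """
    stripped = line.strip()
    run = count_run(stripped, marker)
    # A closing fence holds nothing but marker characters.
    if run == 0 or run != len(stripped):
        return False
    # Default rule: at least as many markers as the opener.
    # exact_match: exactly as many, so a shorter or longer run is just content.
    return run == open_len if exact_match else run >= open_len


# With exact matching, an inner ":::" does not close an outer "::::" block.
outer_open = 4
inner_closes_outer = closes_fence(":::", ":", outer_open, exact_match=True)
outer_closes_outer = closes_fence("::::", ":", outer_open, exact_match=True)
```

Under the default "at least as many" rule a longer run also closes the fence, which is why nesting needs the exact-match variant: each level is then identified by its own marker count.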
Commit 36c5f54
You can try running this command locally to see the comparison on your machine:
git diff v4.0.0...v4.2.0