Comparing changes

base repository: executablebooks/markdown-it-py
base: v4.0.0
head repository: executablebooks/markdown-it-py
compare: v4.2.0
  • 10 commits
  • 34 files changed
  • 6 contributors

Commits on Dec 24, 2025

  1. Add --stdin option to CLI for reading Markdown from standard input (#379)
    
    - Add --stdin flag to parse_args for reading from stdin
    - Add convert_stdin() function to handle stdin parsing
    - Update main() to call convert_stdin() when --stdin flag is used
    - Add comprehensive tests for stdin functionality and CLI behavior
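
The flow described in the bullets above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `render` parameter, the injectable `stdin`/`stdout` streams, and the exact signatures are hypothetical and are not taken from the commit; only the names `parse_args`, `convert_stdin`, `main`, and the `--stdin` flag come from the message.

```python
import argparse
import io
import sys
from typing import Callable, Optional, Sequence, TextIO


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse CLI arguments; --stdin switches the input source to standard input."""
    parser = argparse.ArgumentParser(prog="markdown-it")
    parser.add_argument("--stdin", action="store_true",
                        help="read Markdown from standard input")
    parser.add_argument("filenames", nargs="*", help="Markdown files to convert")
    return parser.parse_args(argv)


def convert_stdin(render: Callable[[str], str],
                  stdin: Optional[TextIO] = None,
                  stdout: Optional[TextIO] = None) -> None:
    """Read all of stdin, render it, and write the result to stdout."""
    src = sys.stdin if stdin is None else stdin
    dst = sys.stdout if stdout is None else stdout
    dst.write(render(src.read()))


def main(argv: Sequence[str], render: Callable[[str], str],
         stdin: Optional[TextIO] = None,
         stdout: Optional[TextIO] = None) -> int:
    args = parse_args(argv)
    if args.stdin:
        convert_stdin(render, stdin, stdout)
    # File handling is omitted from this sketch.
    return 0


# Demo with a stand-in renderer (str.upper) so the sketch has no dependencies.
demo_out = io.StringIO()
main(["--stdin"], str.upper, io.StringIO("# hi\n"), demo_out)
```

In real use, `render` would presumably be something like `MarkdownIt().render`, and input would be piped into the CLI with the `--stdin` flag.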
    
    Signed-off-by: Matěj Cepl <mcepl@cepl.eu>
    mcepl authored Dec 24, 2025
    Commit 49043e4

Commits on Feb 18, 2026

  1. 🔧 Add AGENTS.md and copilot-setup-steps workflow (#380)

    * Initial plan
    
    * 📚 DOCS: Add AGENTS.md and copilot-setup-steps.yml workflow
    
    Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    
    * 👌 IMPROVE: Add tox-uv note to AGENTS.md
    
    Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    Copilot and chrisjsewell authored Feb 18, 2026
    Commit 2f6ae10
  2. 🔧 Add typing to Scanner (#382)

    Changes `state_inline.Scanner` into a `typing.NamedTuple` instead of a
    `collections.namedtuple`. They should be functionally equivalent except
    that the former has typing for its members while the latter does not.
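
To illustrate the claimed equivalence, here is a small standalone comparison of the two forms. The field names (`can_open`, `can_close`, `length`) are an assumption made for this sketch and are not stated in the commit message.

```python
from collections import namedtuple
from typing import NamedTuple

# Untyped form: the fields exist, but type checkers cannot infer their types.
ScannerUntyped = namedtuple("ScannerUntyped", ["can_open", "can_close", "length"])


class Scanner(NamedTuple):
    """Typed form: same runtime behaviour, with annotated members."""

    can_open: bool
    can_close: bool
    length: int


old = ScannerUntyped(True, False, 2)
new = Scanner(True, False, 2)

# Both are plain tuples at runtime, so unpacking and field access behave alike.
can_open, can_close, length = new
same_values = tuple(old) == tuple(new)
same_fields = old._fields == new._fields
```

The practical difference is visible only to static tooling: a type checker can now see that `new.length` is an `int`.
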
    Alunderin authored Feb 18, 2026
    Commit 8933147

Commits on May 6, 2026

  1. 👌 Fix quadratic complexity in fragments_join / text_join (#389)

    Optimize adjacent-token joining in both inline cleanup stages by
    replacing repeated pairwise string concatenation with a single
    `"".join(...)` over each contiguous run.
    
    ## Details
    
    - `fragments_join` merges adjacent `text` tokens left behind after
    emphasis/strikethrough post-processing and recalculates token levels
    - `text_join` converts `text_special` tokens to `text` and performs the
    final adjacent-text merge in the inline token stream
    
    Both rules previously rebuilt growing strings incrementally, which can
    become quadratic for long runs.
    
    ## Why
    
    Tested on an adversarial ~190 KB document with ~30k intraword
    underscores on a single line. With `tracemalloc` running:
    
    |           | render time | peak Python alloc |
    |-----------|-------------|-------------------|
    | before    | 2.2s        | 4476 MB           |
    | after     | 0.6s        | 23 MB             |
    
    It's not just a contrived attack input - this kind of thing also shows
    up naturally in Markdown produced by OCR pipelines, where tables of
    identifiers / references can easily contain very long runs of
    underscores or other delimiter characters.
    
    ## Tests
    
    Added focused tests for both rules:
    
    - `fragments_join`: verifies raw adjacent text fragments remain when
    both join stages are disabled, and that `fragments_join` alone collapses
    them when `text_join` is disabled
    - `text_join`: verifies escaped characters remain as multiple
    `text_special` tokens when `text_join` is disabled, and are converted
    and merged into a single `text` token when enabled
    
    ## Result
    
    No behavioral change in parser output, with less unnecessary work when
    joining long runs of adjacent tokens.
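
The difference between the two strategies can be shown with a minimal sketch. It is not the library's code: `Tok` is a hypothetical stand-in for the real token class, and both functions are simplified to merge only `text` tokens.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tok:
    """Hypothetical stand-in for an inline token."""

    type: str
    content: str = ""


def join_pairwise(tokens: List[Tok]) -> List[Tok]:
    """Merge adjacent text tokens by repeated concatenation.

    Each merge copies the whole accumulated string, so a run of n
    fragments costs O(n^2) character copies in total.
    """
    out: List[Tok] = []
    for tok in tokens:
        if tok.type == "text" and out and out[-1].type == "text":
            out[-1].content = out[-1].content + tok.content
        else:
            out.append(Tok(tok.type, tok.content))
    return out


def join_runs(tokens: List[Tok]) -> List[Tok]:
    """Merge each contiguous run of text tokens with a single "".join()."""
    out: List[Tok] = []
    i, n = 0, len(tokens)
    while i < n:
        if tokens[i].type != "text":
            out.append(Tok(tokens[i].type, tokens[i].content))
            i += 1
            continue
        j = i
        while j < n and tokens[j].type == "text":
            j += 1  # extend the run to its last adjacent text token
        out.append(Tok("text", "".join(t.content for t in tokens[i:j])))
        i = j
    return out
```

Both functions return the same token list; only the amount of intermediate string copying differs, which matches the "no behavioral change" claim above.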
    
    ---------
    
    Co-authored-by: Chris Sewell <chrisj_sewell@hotmail.com>
    petricevich and chrisjsewell authored May 6, 2026
    Commit d4ea0ca
  2. ✨ Allow plugins to register inline terminator characters (#391)

    The inline `text` rule used a hardcoded, unexpandable set of terminator
    characters, forcing plugins that need to trigger on non-terminator
    characters (e.g. `w` for GFM `www.` autolinks) to resort to core-rule
    post-processing workarounds.
    
    ## Changes
    
    - **`parser_inline.py`**: Moves the terminator set onto `ParserInline`
    as `_terminator_chars` (a `set[str]` seeded from `_DEFAULT_TERMINATORS`)
    with a pre-compiled `terminator_re: re.Pattern[str]` attribute. Exposes
    `add_terminator_char(ch)` to extend the set; the regex is rebuilt
    eagerly only when a genuinely new character is added, keeping zero
    per-call overhead in the hot path.
    
    - **`rules_inline/text.py`**: Drops the module-level `_TerminatorChars`
    set and `@functools.cache`-decorated factory. The `text` rule now reads
    `state.md.inline.terminator_re` directly.
    
    - **`docs/contributing.md`**: Updates the "Why is my inline rule not
    executed?" FAQ to document the new API.
    
    ## Usage
    
    ```python
    def gfm_autolink_plugin(md: MarkdownIt) -> None:
        md.inline.add_terminator_char("w")
        md.inline.ruler.push("gfm_autolink_www", _www_rule)
    ```
    
    Fully backward-compatible — the default terminator set is unchanged.
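
The "rebuild the regex only when a new character is added" idea can be sketched in isolation. This is a simplified model, not the actual `ParserInline` code: the class name `TerminatorSet` and the small demo character set are assumptions made for illustration.

```python
import re
from typing import Iterable


class TerminatorSet:
    """Hypothetical holder for terminator characters and their compiled regex."""

    def __init__(self, defaults: Iterable[str]) -> None:
        self._chars = set(defaults)
        self.terminator_re = self._compile()

    def _compile(self) -> "re.Pattern[str]":
        # One character class covering every registered terminator.
        body = "".join(re.escape(c) for c in sorted(self._chars))
        return re.compile("[" + body + "]")

    def add_terminator_char(self, ch: str) -> None:
        if len(ch) != 1:
            raise ValueError("terminator must be a single character")
        if ch in self._chars:
            return  # already registered: keep the existing compiled pattern
        self._chars.add(ch)
        self.terminator_re = self._compile()


# Demo set only; the library's real default terminators are not reproduced here.
terms = TerminatorSet("*_`[]!<&\\")
text = "see www.example.com *now*"
before = terms.terminator_re.search(text).start()

terms.add_terminator_char("w")
after = terms.terminator_re.search(text).start()
```

Before registration the first terminator found is the `*`; afterwards the scan stops at the first `w`, which is what gives a `www.` rule the chance to run at that position.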
    
    ---------
    
    Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
    Co-authored-by: chrisjsewell <2997570+chrisjsewell@users.noreply.github.com>
    Co-authored-by: Chris Sewell <chrisj_sewell@hotmail.com>
    3 people authored May 6, 2026
    Commit df6fd36
  3. ✨ Add gfm-like2 preset with task lists, alerts, and single-tilde strikethrough (#388)

    
    ### Summary
    
    Adds a new `gfm-like2` preset that extends `gfm-like` with three GFM
    features:
    
    - **Task lists** — `- [x] done` / `- [ ] todo` checkbox syntax in list
    items
    - **Alerts** — `> [!NOTE]`, `> [!TIP]`, `> [!WARNING]`, etc. inside
    blockquotes
    - **Single-tilde strikethrough** — `~text~` in addition to `~~text~~`
    
    These are enabled via the `gfm-like2` preset or individually through the
    `tasklists`, `alerts`, and `strikethrough_single_tilde` options. The
    existing `gfm-like` preset is unchanged, so it remains
    backward compatible.
    
    ### Why in markdown-it-py, not mdit-py-plugins?
    
    Task lists and alerts are implemented by integrating detection directly
    into the existing block-level parsers (list.py and blockquote.py),
    rather than as post-processing rules:
    
    - Checkbox detection happens during list item parsing, before the
    sub-parser runs on the item content
    - Alert detection happens during blockquote parsing, before the inner
    content is tokenized
    
    This design is not achievable from a plugin: plugins can only add new
    rules or post-process the token stream — they cannot modify the
    internals of `list_block()` or `blockquote()` to inject detection at the
    right point in the parsing pipeline. Implementing these as
    post-processing core rules would work functionally, but it means
    re-walking and mutating the token stream after the fact, which is less
    clean and less consistent with how the block parsers are designed to
    work.
    
    Single-tilde strikethrough similarly extends the existing strikethrough
    rule's matching logic (opener/closer width matching), which is more
    naturally done inside the rule than bolted on externally.
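
The width-matching rule can be illustrated with a deliberately simplified model. This sketch ignores flanking rules and the real delimiter stack, and the function name `strike_spans` is hypothetical; it only shows "accept runs of 1 or 2 tildes, reject 3+, and pair an opener with a closer of the same width".

```python
import re
from typing import List, Tuple


def strike_spans(text: str) -> List[str]:
    """Return the text enclosed by matching tilde runs (simplified model)."""
    openers: List[Tuple[int, int]] = []  # (position, run width)
    spans: List[str] = []
    for m in re.finditer(r"~+", text):
        width = len(m.group())
        if width > 2:
            continue  # runs of three or more tildes are treated as literal text
        # Look back for the most recent opener with the same width.
        for k in range(len(openers) - 1, -1, -1):
            pos, opener_width = openers[k]
            if opener_width == width:
                spans.append(text[pos + width:m.start()])
                del openers[k:]  # drop the matched opener and any unmatched ones after it
                break
        else:
            openers.append((m.start(), width))
    return spans
```

Under this model `~a~` and `~~b~~` each produce one span, while `~c~~` and `~~~d~~~` produce none.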
    
    ### Changes
    
    - **`rules_block/list.py`** — Detect `[ ]`/`[x]`/`[X]` at content start
    during list item parsing; set `token.meta["checked"]`; advance `bMarks`
    past the checkbox; add CSS classes (`task-list-item`,
    `contains-task-list`) after the list loop
    - **`rules_block/blockquote.py`** — Detect `[!TYPE]` on the first
    content line; emit `alert_open`/`alert_close` + title tokens instead of
    `blockquote_open`/`blockquote_close`; skip the marker line during
    tokenization
    - **`rules_inline/strikethrough.py`** — When
    `strikethrough_single_tilde` is enabled, accept 1 or 2 tildes (reject
    3+); enforce opener/closer width matching in `_postProcess`
    - **renderer.py** — Add `list_item_open` render method that injects
    checkbox HTML when `meta["checked"]` is present
    - **`presets/__init__.py`** — Add `gfm_like2` preset class
    - **`main.py`** — Register `gfm-like2` in `_PRESETS`
    - **utils.py** — Add `tasklists`, `alerts`, `strikethrough_single_tilde`
    keys to `OptionsType`
    - **pyproject.toml** — Add `pytest-timeout` to test deps; set 10s
    default timeout
    - **Test fixtures** — 11 tasklist cases, 15 alert cases, 13 single-tilde
    strikethrough cases
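
The two block-level detections listed above can be sketched as small helpers. These are hypothetical functions, not the code in `list.py` or `blockquote.py`; in particular, requiring whitespace or end of line after the checkbox and restricting alert types to uppercase letters are assumptions of this sketch.

```python
import re
from typing import Optional, Tuple

# "[ ]", "[x]" or "[X]" at the start of the item content, followed by
# whitespace or end of line (the follow-up condition is an assumption).
_CHECKBOX = re.compile(r"\[([ xX])\](?=\s|$)")

# "[!TYPE]" alone on the first content line of a blockquote.
_ALERT = re.compile(r"\[!([A-Z]+)\]$")


def detect_checkbox(content: str) -> Optional[Tuple[bool, int]]:
    """Return (checked, chars_to_skip) if the content starts with a checkbox."""
    m = _CHECKBOX.match(content)
    if m is None:
        return None
    return (m.group(1) != " ", m.end())


def detect_alert(first_line: str) -> Optional[str]:
    """Return the alert type (e.g. "NOTE") if the line is an alert marker."""
    m = _ALERT.match(first_line.strip())
    return m.group(1) if m else None
```

The `chars_to_skip` value plays the role of advancing past the checkbox before the item content is parsed, and a non-`None` alert type would correspond to emitting `alert_open`/`alert_close` instead of the blockquote tokens.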
    
    ### Usage
    
    ```python
    from markdown_it import MarkdownIt
    
    md = MarkdownIt("gfm-like2")
    md.render("- [x] done\n- [ ] todo")
    md.render("> [!NOTE]\n> This is a note.")
    md.render("~strikethrough~")
    ```
    chrisjsewell authored May 6, 2026
    Commit 693bb24
  4. Commit 8951f26
  5. Commit 3b4ff6d

Commits on May 7, 2026

  1. ✨ Add make_fence_rule() factory for configurable fence markers (#394)

    Adds a `make_fence_rule()` factory function that generates fence parsing
    rules with configurable options, enabling reuse of the fence logic for
    custom marker characters (e.g. `:` for colon fences) without code
    duplication.
    
    ## Motivation
    
    The `colon_fence` plugin in `mdit-py-plugins` duplicates nearly all of
    the fence parsing logic, differing only in the marker character (`:` vs
    `` ` ``/`~`), token type, and info-string restrictions. This makes
    maintenance harder and prevents features like exact-match closing from
    being shared.
    
    ## Changes
    
    - **New `make_fence_rule()` factory** in `fence.py` with parameters:
      - `markers`: tuple of valid fence marker characters (default
        `` ("~", "`") ``)
      - `token_type`: token type name to emit (default `"fence"`)
      - `exact_match`: when `True`, the closing fence must have **exactly** the
        same marker count as the opener (default `False`). This enables nested
        fences (e.g. `::::` wrapping `:::`).
      - `disallow_marker_in_info`: marker characters that reject the fence if
        found in the info string (default `` ("`",) `` per CommonMark)
      - `min_markers`: minimum marker count to form a fence (default `3`)
    
    - **`fence` remains a module-level export**: `fence = make_fence_rule()`
    — fully backwards compatible, identical behavior to the previous
    implementation.
    
    - **Exported from `markdown_it.rules_block`** — added `make_fence_rule`
    to `__all__`.
    
    ## Usage
    
    ```python
    from markdown_it import MarkdownIt
    from markdown_it.rules_block.fence import make_fence_rule
    
    # Colon fence (replaces mdit-py-plugins/colon_fence duplicated logic)
    md = MarkdownIt()
    md.block.ruler.before("fence", "colon_fence",
        make_fence_rule(markers=(":",), token_type="colon_fence", disallow_marker_in_info=()))
    
    # Override standard fence with exact-match closing (for MyST nested directives)
    md.block.ruler.at("fence", make_fence_rule(exact_match=True))
    
    # Combine: all markers with exact matching
    md.block.ruler.at("fence", make_fence_rule(
        markers=("~", "`", ":"), exact_match=True))
    ```
    
    ## Performance
    
    No regression by design — configuration is captured in the closure at
    rule-creation time (no runtime dict lookups or extra branches in the hot
    path). The `closing_matcher` callable is resolved once at factory time.
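
The closure-based design described above can be sketched in a reduced form. This is not the implementation in `fence.py`: `make_fence_matchers` and its returned helpers are hypothetical, whole lines are simply stripped rather than following the CommonMark indentation rules, and `disallow_marker_in_info` is left out.

```python
from typing import Callable, Optional, Sequence, Tuple

Opening = Callable[[str], Optional[Tuple[str, int]]]
Closing = Callable[[str, str, int], bool]


def make_fence_matchers(markers: Sequence[str] = ("~", "`"),
                        exact_match: bool = False,
                        min_markers: int = 3) -> Tuple[Opening, Closing]:
    """Build (opening, closing) line matchers with the options baked in."""
    marker_set = frozenset(markers)

    # Resolved once here, so the per-line code never re-checks exact_match.
    if exact_match:
        def enough(count: int, opened: int) -> bool:
            return count == opened
    else:
        def enough(count: int, opened: int) -> bool:
            return count >= opened

    def opening(line: str) -> Optional[Tuple[str, int]]:
        stripped = line.strip()
        if not stripped or stripped[0] not in marker_set:
            return None
        ch = stripped[0]
        count = len(stripped) - len(stripped.lstrip(ch))
        return (ch, count) if count >= min_markers else None

    def closing(line: str, marker: str, opened: int) -> bool:
        stripped = line.strip()
        # A closing fence consists solely of the opener's marker character.
        return bool(stripped) and set(stripped) == {marker} and enough(len(stripped), opened)

    return opening, closing


colon_open, colon_close = make_fence_matchers(markers=(":",), exact_match=True)
std_open, std_close = make_fence_matchers()
```

With `exact_match=True`, a `::::` line does not close a fence opened with `:::`, which is what lets an outer four-colon fence wrap an inner three-colon one.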
    
    ## Tests
    
    21 new tests covering:
    - Colon-fence-like marker behavior
    - Exact-match closing (nesting patterns)
    - `ruler.at()` override of standard fence
    - `min_markers` customization
    - `disallow_marker_in_info` variations
    
    Full existing test suite (981 tests) passes unchanged.
    chrisjsewell authored May 7, 2026
    Commit 96cf077
  2. Commit 36c5f54