Fix inline break opportunities across boxes #2748

Open
moreaki wants to merge 2 commits into Kozea:main from moreaki:fix-inline-break-opportunities

moreaki commented Apr 22, 2026

What changed

This updates inline line breaking for overflow-wrap: break-word. To decide whether a text fragment is at the start of an otherwise unbreakable sequence, the layout now checks for real break opportunities already present on the line, instead of only checking whether the fragment is the first box on the line. The state is also carried into nested inline boxes, so a break opportunity that occurs before a span is still visible when laying out text inside that span.

That fixes the linked case from #1614, where a single long word is split across a text node and an inline box, for example aaaaaaaa...<span>bbbb...</span>. The text can now wrap inside the continued word instead of overflowing; previously the span boundary was in effect treated like a word boundary.
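
The difference between the two checks can be illustrated with a toy model. This is a sketch only, not WeasyPrint's code: `Fragment`, `old_is_line_start`, and `new_is_line_start` are hypothetical names, and a break opportunity is simplified to "the text contains an ASCII space".

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A run of text already placed on the current line (hypothetical model)."""
    text: str


def old_is_line_start(line: list[Fragment]) -> bool:
    # Old heuristic: only the first box on the line counts as the line start.
    return len(line) == 0


def new_is_line_start(line: list[Fragment]) -> bool:
    # New idea: we are still inside the first unbreakable sequence as long as
    # nothing already on the line offers a break opportunity.
    # Simplification (assumption): a break opportunity is a plain space.
    return not any(" " in fragment.text for fragment in line)


# State of the line just before laying out the span's text.
# "aaaaaaaa<span>bbbb</span>": the span continues the same long word.
continued_word = [Fragment("aaaaaaaa")]
# "aaaa aaaa<span>bbbb</span>": a real break opportunity precedes the span.
after_space = [Fragment("aaaa aaaa")]

print(old_is_line_start(continued_word), new_is_line_start(continued_word))
# prints: False True
print(old_is_line_start(after_space), new_is_line_start(after_space))
# prints: False False
```

In the first case the old check reports "not at line start", so break-word splitting is not considered for the span's text and it overflows. The new check still sees one unbreakable sequence and allows the split.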

The PR also adds regression coverage for related inline-boundary reports from #2102, #2025, and #2308. Current main already appears to handle the concrete #2308 example, but the test pins the behavior so it does not regress.

Historical context

The issue chain is:

There is also the older draft #1840, which explores a much broader line-breaking rewrite using Harfbuzz data. Daniel Fitzpatrick also has a fork commit for the same root problem: danfitz36@f9e68e2. This PR is intentionally much smaller than #1840, and now also carries the previous fragment state into nested inlines as Daniel’s commit does.

What this does not claim

This is not full CSS Text line-breaking compliance. It does not implement the complete CSS Text Level 3 model, nor does it solve every interaction between inline boundaries, shaping, bidi, CJK line-breaking rules, hyphenation, word-break, line-break, and all overflow-wrap cases.

The goal here is narrower: fix the immediate inline-boundary triggers reported in #1614 and #2102, and keep the related #2025/#2308 examples covered by tests, while leaving the broader line-breaking universe for a larger implementation.

Tests

  • venv/bin/python -m ruff check weasyprint/layout/inline.py tests/test_text.py
  • venv/bin/python -m pytest tests/test_text.py
  • venv/bin/python -m pytest tests/layout/test_inline.py
  • venv/bin/python -m pytest -> 4029 passed, 40 xfailed

Addresses #1614 and #2102.
Related to #2025 and #2308.

moreaki commented Apr 22, 2026

Follow-up / prior-art note: I found Daniel Fitzpatrick’s fork commit danfitz36@f9e68e2, dated 2026-04-04, which addresses the same root is_line_start problem for inline span boundaries. I didn’t find an open or closed PR from that branch against this repository, and it has not been merged here.

That commit and this PR use the same underlying idea: overflow-wrap: break-word should decide whether a fragment is still in the first otherwise-unbreakable sequence from actual break opportunities, not from “first box on the line”. Daniel’s patch threads the previous last_letter through split_inline_box / split_inline_level; this PR keeps the existing signatures and computes the state from the already-laid-out line children via line_has_break(), while also sharing the existing “can break between these fragments” logic.
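
The two ways of carrying the state can be contrasted with a toy model. This is only a sketch under simplifying assumptions, not either patch: both function names are hypothetical, fragments are plain strings, and a break opportunity is reduced to "the text contains a space". The real code reasons about the last letter of the previous fragment and the existing "can break between these fragments" logic.

```python
def has_break_by_threading(fragments: list[str]) -> list[bool]:
    """Carry the state forward explicitly, one fragment at a time."""
    seen_break = False
    results = []
    for text in fragments:
        # State visible to the layout of this fragment.
        results.append(seen_break)
        # Update the state handed on to the next fragment.
        seen_break = seen_break or " " in text
    return results


def has_break_by_scanning(fragments: list[str]) -> list[bool]:
    """Recompute the state from whatever is already on the line."""
    return [
        any(" " in earlier for earlier in fragments[:index])
        for index in range(len(fragments))
    ]


fragments = ["aaaa", "bbbb", "cc dd", "ee"]
print(has_break_by_threading(fragments))
# prints: [False, False, False, True]
print(has_break_by_threading(fragments) == has_break_by_scanning(fragments))
# prints: True
```

Under these assumptions both approaches give the same answer for every fragment. The trade-off is between changing function signatures to pass state along and re-deriving the state from the line's existing children.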

Daniel’s commit also references #2102, which is another nested-span word-breaking report related to #1614. If the maintainers prefer the explicit last_letter threading approach, that fork commit is a useful alternative/reference. If this PR stays as-is, I’m happy to add an extra #2102 regression test too.

moreaki commented Apr 22, 2026

I updated the PR with a #2102 regression and extended the implementation to carry previous-fragment state into nested inline boxes. This makes the #2102 span-wrapped long-word case break identically to the same text without the span.

Re-ran validation after the update:

  • venv/bin/python -m ruff check weasyprint/layout/inline.py tests/test_text.py
  • venv/bin/python -m pytest tests/test_text.py
  • venv/bin/python -m pytest tests/layout/test_inline.py
  • venv/bin/python -m pytest -> 4029 passed, 40 xfailed
