xterm.js compatibility: CSI 3 J, DECIC/DECDC, SL/SR, XTREVWRAP, combining marks, behaviour fixes #23
Open
boris-42 wants to merge 13 commits into
The parser already recognized `EdScope::SavedLines` but the terminal handler ignored it. Wire it to a new `Buffer::clear_scrollback` helper that drains the scrollback portion of the buffer while leaving the visible view and the cursor unchanged. Tests cover the case where scrollback exists and the no-op case where it doesn't.
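The drain can be pictured with a toy model. This is only a sketch, assuming scrollback rows and visible rows share one `Vec`; `ToyBuffer` and its fields are hypothetical and are not avt's actual `Buffer`.

```rust
/// Toy model: scrollback rows come first, then the visible rows.
/// Hypothetical type for illustration; not avt's actual `Buffer`.
struct ToyBuffer {
    lines: Vec<String>,
    rows: usize, // height of the visible view
}

impl ToyBuffer {
    /// Number of rows currently held in scrollback.
    fn scrollback_len(&self) -> usize {
        self.lines.len().saturating_sub(self.rows)
    }

    /// Drop every scrollback row; the visible view is left untouched.
    fn clear_scrollback(&mut self) {
        let n = self.scrollback_len();
        self.lines.drain(..n);
    }

    /// The rows that are currently on screen.
    fn view(&self) -> &[String] {
        &self.lines[self.scrollback_len()..]
    }
}
```

Calling `clear_scrollback` a second time drains an empty range, which corresponds to the no-op case the tests cover.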
`Terminal::active_buffer_type` was the only way to tell whether the alternate screen was currently active, but the public `Vt` wrapper didn't surface it. Callers using `Vt` were forced into workarounds like parsing the output of `dump()`. Add `Vt::active_buffer_type` as a thin pass-through and re-export the `BufferType` enum from the crate root. Round-trips DECSET 1049, DECRST 1049, DECSET 47 and DECRST 47.
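A rough sketch of the pass-through shape, using toy stand-ins rather than avt's real types. The `set_private_mode` helper and the struct fields are assumptions made for illustration; only `active_buffer_type` and `BufferType` are names taken from the description above.

```rust
/// Which screen buffer is currently active.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BufferType {
    Primary,
    Alternate,
}

/// Toy stand-in for the inner terminal (hypothetical fields).
struct Terminal {
    active: BufferType,
}

impl Terminal {
    /// Handle DECSET (`on = true`) or DECRST (`on = false`) for one mode.
    /// Only the two alternate-screen modes named above are modelled.
    fn set_private_mode(&mut self, mode: u16, on: bool) {
        if matches!(mode, 47 | 1049) {
            self.active = if on { BufferType::Alternate } else { BufferType::Primary };
        }
    }

    fn active_buffer_type(&self) -> BufferType {
        self.active
    }
}

/// Toy stand-in for the public wrapper.
struct Vt {
    terminal: Terminal,
}

impl Vt {
    fn new() -> Self {
        Vt { terminal: Terminal { active: BufferType::Primary } }
    }

    fn set_private_mode(&mut self, mode: u16, on: bool) {
        self.terminal.set_private_mode(mode, on);
    }

    /// Thin pass-through: no logic of its own.
    fn active_buffer_type(&self) -> BufferType {
        self.terminal.active_buffer_type()
    }
}
```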
After a printable filled the last column with auto-wrap off, the cursor was being clamped to the last visible column. That model loses information: when a subsequent re-enable of auto-wrap (DECSET ?7h) arrives, the next printable can't tell whether it should wrap to a new row or land on the same column. The fix keeps the cursor in the "overshoot" position (col == cols), which is the same state that auto-wrap-on uses when waiting to wrap on the next printable. While the row is full and auto-wrap is off, the existing print path already routes subsequent printables through `print_at_line_end`, which overwrites the last visible column without advancing — so the on-screen content is unchanged. The cursor's new resting position just makes the pending-wrap state observable. The two existing tests that asserted the clamped position have been updated, and a new test covers the overshoot → DECSET ?7h → wrap sequence end-to-end.
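The overshoot model can be sketched on a single row. This is a simplified illustration, not avt's print path: `Row`, its fields, and the `wraps` counter are hypothetical, and the row is assumed to have at least one column.

```rust
/// Toy single-row model of the print path (hypothetical, for illustration).
struct Row {
    cells: Vec<char>,
    col: usize,      // col == cells.len() is the "overshoot" position
    auto_wrap: bool, // DECAWM
    wraps: usize,    // how many times printing moved to a fresh row
}

impl Row {
    fn new(cols: usize) -> Self {
        Row { cells: vec![' '; cols], col: 0, auto_wrap: true, wraps: 0 }
    }

    fn print(&mut self, ch: char) {
        let cols = self.cells.len();
        if self.col == cols {
            if self.auto_wrap {
                // Pending wrap resolves: continue on a fresh row.
                self.wraps += 1;
                self.cells = vec![' '; cols];
                self.col = 0;
            } else {
                // Line-end path: overwrite the last column, do not advance.
                self.cells[cols - 1] = ch;
                return;
            }
        }
        self.cells[self.col] = ch;
        self.col += 1; // may reach `cols`, which is the overshoot position
    }
}
```

Because the cursor rests at `col == cols` in both modes, turning auto-wrap back on needs no extra bookkeeping: the next `print` sees the pending wrap and takes it.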
REP (CSI Pn b) was reading whatever cell happened to live to the left of the cursor and printing that. That works when REP follows a print directly, but produces surprising output when an operation moves the cursor between the print and the REP: the cell that ends up being repeated is unrelated to anything the application emitted. Track the last printable character explicitly. Print sets it; every non-print, non-REP `Function` in `Terminal::execute` clears it; REP prints from this slot, or is a no-op if nothing is set. REP-after-REP still works because REP's loop goes through `print`, which re-sets the slot. The legacy fallback to the cell at cursor-1 is gone, including the case where the cursor sat on the tail of a wide character. Existing tests that asserted the old behavior have been updated to encode the new contract, and two new tests cover the no-op-after-move and chained-REP cases.
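The slot-based contract can be sketched as follows. The `Op` enum and `Term` struct are hypothetical simplifications of `Function` and `Terminal`; output is collected into a string and cursor movement itself is not modelled.

```rust
/// Reduced stand-in for the function set (hypothetical).
enum Op {
    Print(char),
    Rep(usize),
    CursorLeft, // represents any non-print, non-REP function
}

/// Toy terminal that only records what gets printed.
struct Term {
    out: String,
    last_printed: Option<char>,
}

impl Term {
    fn new() -> Self {
        Term { out: String::new(), last_printed: None }
    }

    fn execute(&mut self, op: Op) {
        match op {
            Op::Print(c) => self.print(c),
            Op::Rep(n) => {
                // No-op when the slot is empty.
                if let Some(c) = self.last_printed {
                    for _ in 0..n {
                        self.print(c); // re-sets the slot, so REP can chain
                    }
                }
            }
            // Anything else clears the slot.
            Op::CursorLeft => self.last_printed = None,
        }
    }

    fn print(&mut self, c: char) {
        self.out.push(c);
        self.last_printed = Some(c);
    }
}
```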
`Buffer::scroll_up`, when invoked with the entire row range, extended the underlying line buffer at the bottom rather than rotating the visible window. That has the right shape for natural scroll triggered by line-feed or wrap at the bottom margin — the top lines do belong in the scrollback in that case — but DL (CSI Pn M) and SU (CSI Pn S) are explicit screen edits and the lines they shift off the top are meant to be discarded. Split out `Buffer::delete_lines`, which rotates the line range and clears the trailing rows in place without ever growing the scrollback. Route DL and SU through it; the natural-overflow path via `scroll_up_in_region` continues to call `scroll_up` and the existing history-preserving behavior is unchanged. New tests exercise both DL and SU with a non-empty scrollback limit, asserting that the line count stays at the viewport height.
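The difference between the two paths can be sketched with a toy buffer. This is an illustration only: `ToyBuffer` is hypothetical, the scrollback limit is not modelled, and the whole view is treated as the line range.

```rust
/// Toy model: scrollback rows first, then `rows` visible rows (hypothetical).
struct ToyBuffer {
    lines: Vec<String>,
    rows: usize,
}

impl ToyBuffer {
    /// Natural overflow: append blank rows, so the old top rows become history.
    fn scroll_up(&mut self, n: usize) {
        for _ in 0..n {
            self.lines.push(String::new());
        }
    }

    /// Explicit edit (DL / SU): rotate the visible rows and blank the tail.
    /// The total line count never grows.
    fn delete_lines(&mut self, n: usize) {
        let start = self.lines.len() - self.rows;
        let n = n.min(self.rows);
        self.lines[start..].rotate_left(n);
        let len = self.lines.len();
        for line in &mut self.lines[len - n..] {
            line.clear();
        }
    }
}
```

After `delete_lines` the rows that rotated to the bottom are cleared in place, so nothing that left the top of the screen survives anywhere in the buffer.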
DECALN (ESC # 8) is documented as part of the VT100 alignment self-test: fill the screen with `E` and reset the cursor to the home position. avt was painting the screen but leaving the cursor wherever the previous operation parked it, which conflicts with applications that use DECALN to clear and re-home in one sequence. The existing test was updated to assert the home position.
CSI Pn ' } (DECIC) inserts Pn columns at the cursor column in every row of the active scroll region, shifting existing cells right and dropping anything pushed past the right margin. CSI Pn ' ~ (DECDC) deletes Pn columns at the cursor column in every row, shifting cells left and blanking the freed cells at the right edge with the active pen. Pn defaults to 1. The parameter is intentionally read from a single position regardless of the row — DEC's spec ties the inserted/deleted columns to the saved cursor column, not the row's own state. The patch also tightens the CSI dump emitter: private-mode prefixes (0x3c..=0x3f, i.e. < = > ?) belong before the parameter run, while ordinary intermediate bytes (0x20..=0x2f, here ') belong after it. The parser already split the two via separate CSI states; the dumper had to follow suit so DECIC / DECDC round-trip through the dump path the same way DECSET / DECRST do. Tests cover the default-param fallback to 1, multi-row shifting, and the parser round-trip for both new functions.
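The byte ordering the dumper has to respect can be sketched with a small formatter. The `csi` function and its signature are hypothetical and are not the emitter used in the patch; they only show where each class of byte lands.

```rust
/// Build a CSI sequence: ESC [ <private prefix> <params> <intermediates> <final>.
/// Hypothetical helper for illustration.
fn csi(prefix: Option<char>, params: &[u16], intermediates: &str, final_byte: char) -> String {
    let mut s = String::from("\x1b[");
    // Private-mode prefix (one of `<`, `=`, `>`, `?`) goes before the parameters.
    if let Some(p) = prefix {
        s.push(p);
    }
    let params: Vec<String> = params.iter().map(|p| p.to_string()).collect();
    s.push_str(&params.join(";"));
    // Ordinary intermediates (0x20..=0x2f) go after the parameters.
    s.push_str(intermediates);
    s.push(final_byte);
    s
}
```

With this layout, DECSET 1049 serialises as `ESC [ ? 1049 h` while DECIC with `Pn = 2` serialises as `ESC [ 2 ' }`.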
CSI Pn SP @ (SL) scrolls every row in the active scroll region left by Pn columns; freed cells at the right edge are filled with blanks using the active pen and the cursor does not move. CSI Pn SP A (SR) is the symmetric right-scroll. Pn defaults to 1. Both functions reuse the column-shift helpers used by DECIC / DECDC, anchored at column 0 rather than at the cursor.
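One way to picture the shared column-shift helpers is as two row-level functions, applied to every row of the scroll region. This is a sketch under simplifying assumptions: rows are plain `char` slices, freed cells are filled with spaces rather than the active pen, and the function names are hypothetical.

```rust
/// Insert `n` blank columns at `at`, shifting cells right and dropping
/// whatever is pushed past the right edge.
fn insert_cols(row: &mut [char], at: usize, n: usize) {
    if at >= row.len() {
        return;
    }
    let n = n.min(row.len() - at);
    row[at..].rotate_right(n);
    row[at..at + n].fill(' ');
}

/// Delete `n` columns at `at`, shifting cells left and blanking the freed
/// cells at the right edge.
fn delete_cols(row: &mut [char], at: usize, n: usize) {
    if at >= row.len() {
        return;
    }
    let n = n.min(row.len() - at);
    row[at..].rotate_left(n);
    let len = row.len();
    row[len - n..].fill(' ');
}
```

In this model DECIC and DECDC pass the cursor column as `at`, while SR and SL are the same two calls with `at = 0`.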
Previously, a wide-width glyph that would straddle the right edge fell through to the same fixup path the emulator uses for the "pending wrap" state — it backed up one column, overwrote the character that was already there, and parked the cursor in the overshoot position. The intent of that fixup is to keep printing working when the cursor is sitting at column N (one past the last column) with auto-wrap disabled; it was never meant for the case where the cursor is still inside the screen and a wide glyph simply doesn't fit. Split the "needs relocation" state into two: cursor-in-overshoot and wide-glyph-doesn't-fit. With auto-wrap off the first still goes through the line-end path, while the second silently discards the glyph and leaves the cursor where it was. With auto-wrap on both still wrap to a new line.
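The split can be written down as a small decision table. The enum and function names below are hypothetical and chosen for illustration; they are not the identifiers used in the patch.

```rust
/// Where a glyph of a given width lands relative to the right edge.
#[derive(Debug, PartialEq)]
enum Placement {
    Fits,
    Overshoot,  // cursor already sits one past the last column
    WideAtEdge, // cursor is on screen, but a wide glyph would straddle the edge
}

/// What the print path does with the glyph.
#[derive(Debug, PartialEq)]
enum Action {
    PrintInPlace,
    WrapThenPrint,
    OverwriteLastColumn, // the line-end path
    Discard,
}

fn classify(col: usize, cols: usize, width: usize) -> Placement {
    if col >= cols {
        Placement::Overshoot
    } else if col + width > cols {
        Placement::WideAtEdge
    } else {
        Placement::Fits
    }
}

fn resolve(placement: Placement, auto_wrap: bool) -> Action {
    match (placement, auto_wrap) {
        (Placement::Fits, _) => Action::PrintInPlace,
        // With auto-wrap on, both relocation cases wrap to a new line.
        (_, true) => Action::WrapThenPrint,
        (Placement::Overshoot, false) => Action::OverwriteLastColumn,
        (Placement::WideAtEdge, false) => Action::Discard,
    }
}
```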
DECSET 45 / DECRST 45 toggles XTREVWRAP. When the mode is enabled, a backspace at column 0 of a row that continues a soft-wrapped predecessor (the previous row has the wrap flag set) hops back to the last column of that predecessor instead of being clamped. Hard line breaks still block the rewind: a row that ends at column 0 without the wrap flag is treated as an explicit break. The new mode is included in the DECSET/DECRST round-trip dumps and participates in the structural-equality check used by the property tests.
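The backspace rule can be sketched as a single function. This is a simplified model: `Cursor`, `backspace`, and the `wrapped` slice (one wrap flag per row) are hypothetical, and the overshoot position is not handled.

```rust
/// Minimal cursor position (hypothetical type for illustration).
struct Cursor {
    col: usize,
    row: usize,
}

/// Move the cursor one column left. `wrapped[r]` is the wrap flag of row `r`,
/// i.e. whether row `r` soft-wraps into row `r + 1`.
fn backspace(cur: &mut Cursor, cols: usize, wrapped: &[bool], reverse_wrap: bool) {
    if cur.col > 0 {
        cur.col -= 1;
    } else if reverse_wrap && cur.row > 0 && wrapped[cur.row - 1] {
        // Soft-wrapped continuation: hop to the predecessor's last column.
        cur.row -= 1;
        cur.col = cols - 1;
    }
    // Otherwise stay clamped at column 0: either the mode is off or the
    // previous row ended with a hard line break.
}
```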
The Vt::dump() path serialises mode state so that re-feeding the dump into a fresh emulator reproduces the original. Reverse-wrap mode was missing from that snapshot, so a property test that toggled DECSET 45 / DECRST 45 failed the dump → parse → state-equality round-trip. Adding the same conditional emit used for every other mode fixes the asymmetry; no behavioural change for emulators that never enable XTREVWRAP.
Zero-width characters — Unicode combining marks, variation selectors, zero-width joiners — were previously stored as full-width cells. They overwrote the next column, shifted the cursor, and corrupted runs of subsequent characters. Real terminals (and xterm.js, which we are tracking) attach these marks to the cell to their left and leave the cursor where it was. Cell now carries an inline list of trailing combining marks. The print path detects zero-width input, resolves the anchor cell (the column to the left of the cursor, or the head of the wide glyph when the cursor sits on a wide tail), and appends. Marks emitted before the first printable on a row are dropped — there is nothing to bind to. Line iteration and text rendering interleave each cell's base character with its trailing marks. The dump path emits the marks as their own Print functions so the round-trip preserves them. Existing tests cover ASCII-only flow and continue to pass; new tests cover the standard combining acute over 'e', the drop at row start, and the wide-tail anchor case.
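A reduced sketch of the attach-to-the-left behaviour, covering narrow cells only. The types are hypothetical, the wide-tail anchor case is omitted, and `is_zero_width` is a tiny hand-picked stand-in for a real Unicode width table.

```rust
/// A cell holding a base character plus any trailing combining marks.
#[derive(Clone, Default)]
struct Cell {
    base: Option<char>,
    combining: Vec<char>,
}

struct Row {
    cells: Vec<Cell>,
    col: usize,
}

/// Stand-in zero-width check: a few well-known ranges only.
fn is_zero_width(ch: char) -> bool {
    matches!(ch, '\u{0300}'..='\u{036F}' | '\u{200D}' | '\u{FE00}'..='\u{FE0F}')
}

impl Row {
    fn new(cols: usize) -> Self {
        Row { cells: vec![Cell::default(); cols], col: 0 }
    }

    fn print(&mut self, ch: char) {
        if is_zero_width(ch) {
            // Attach to the cell on the left; drop the mark at row start.
            if self.col > 0 {
                self.cells[self.col - 1].combining.push(ch);
            }
            return; // the cursor does not move
        }
        if self.col < self.cells.len() {
            self.cells[self.col] = Cell { base: Some(ch), combining: Vec::new() };
            self.col += 1;
        }
    }

    /// Interleave each base character with its trailing marks.
    fn text(&self) -> String {
        let mut s = String::new();
        for cell in &self.cells {
            if let Some(base) = cell.base {
                s.push(base);
                s.extend(cell.combining.iter());
            }
        }
        s
    }
}
```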
`Terminal::dump()` re-prints the last column's character when the
cursor needs to be restored to the overshoot position, so the
round-trip recreates the wrap-pending state. The re-print only
emitted the base character, dropping any combining marks attached
to that trigger cell — `Line::print` calls `Cell::set()` which
clears `combining`, and there was no subsequent re-emission.
Effect: any zero-width combining char (U+0301 acute, U+1160 Hangul
filler, U+200D ZWJ, etc.) printed as the final glyph of a wrapped
row would survive the first dump but disappear after the dump
was re-fed into a fresh `Vt` — failing the `prop_dump` property
test on inputs like `[ff, ff, ⺀, ' ', ' ', ⺀, ' ', ' ', \u{1160}]`.
Fix: after re-printing the trigger cell (both the Single and the
WideTail-anchored cases), re-emit each of its combining marks as
its own `Function::Print`. They re-attach to the freshly-set cell
via the same `is_combining_mark → attach_combining` path used
during normal feeding.
Also folds in the rustfmt nits in the `try_print` match block
(single-line arms) that landed unformatted in d1204da and would
otherwise fail `cargo fmt --check` under the new CI.
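The re-emission described in the fix can be sketched as follows. The `Cell` struct and `reprint_trigger_cell` function are hypothetical, and `Function` is reduced to its `Print` variant for the purpose of the example.

```rust
/// Reduced stand-in for the dump's function type.
#[derive(Debug, PartialEq)]
enum Function {
    Print(char),
}

/// A trigger cell: base character plus attached combining marks (hypothetical).
struct Cell {
    base: char,
    combining: Vec<char>,
}

/// Emit the prints that restore the trigger cell during a dump:
/// the base character first, then each mark as its own `Print`.
fn reprint_trigger_cell(cell: &Cell) -> Vec<Function> {
    let mut seq = vec![Function::Print(cell.base)];
    seq.extend(cell.combining.iter().map(|&m| Function::Print(m)));
    seq
}
```

When such a sequence is fed back in, the marks follow the base print immediately, so they re-attach to the freshly set cell instead of being lost.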
This series brings `avt` in line with xterm.js (the renderer most browser-side terminals use today) for a corpus of 159 real-world VT sequences that previously diverged. Every commit is self-contained: code, unit tests, and property tests stay green at each step.

Features

- `feat: implement CSI 3 J (ED, erase scrollback)`: ED with parameter `3` now clears scrollback; previously only `0`/`1`/`2` were handled. xterm.js, alacritty, kitty, and most modern terminals support this.
- `feat: expose active buffer type on Vt`: adds `Vt::active_buffer_type()` returning a new public `BufferType` (`Primary`/`Alternate`). Consumers that need to know whether the alternate screen is active can stop sniffing the `dump()` output.
- `feat: implement DECIC and DECDC (insert/delete columns)`: `CSI Pn ' }` and `CSI Pn ' ~` insert/delete columns at the cursor column for every row of the scroll region. As a side improvement, the CSI dump emitter now puts private-mode prefixes (`?`, `<`, `>`, `=`) before the parameter run while keeping other intermediate bytes (`'`, `!`, etc.) after it, matching how the parser's own state machine treats them.
- `feat: implement SL and SR (scroll left / right)`: `CSI Pn SP @` and `CSI Pn SP A` shift every row of the active scroll region left/right by `Pn` columns. Both reuse the column-shift primitives added for DECIC/DECDC.
- `feat: implement reverse-wrap mode (DEC private mode 45)`: `DECSET 45` enables XTREVWRAP; a backspace at column 0 of a soft-wrapped continuation row hops back to the last column of the predecessor. Hard line breaks still block the rewind.
- `feat: render combining marks on the preceding cell`: zero-width characters (Unicode combining marks, variation selectors, ZWJ) attach to the cell on their left instead of consuming a column and shifting the cursor. `Cell` carries an inline list of trailing marks; line iteration interleaves base + marks; the dump path emits each mark as its own `Print` so round-trips preserve them.

Fixes

- `fix: park cursor in overshoot position when auto-wrap is off`: after writing into the last column with DECAWM off, the cursor now sits at column `cols` (the "pending wrap" overshoot) instead of being clamped to `cols-1`. Matches what every interactive terminal exposes.
- `fix: REP repeats the last printable, clears on any other operation`: REP now repeats the last `Print(char)` instead of re-reading the cell under the cursor. Any non-print/non-REP function clears the cached character, so a cursor move between print and REP makes REP a no-op (xterm de-facto behaviour).
- `fix: DL and SU drop lines instead of pushing them to scrollback`: DL (delete lines) and SU (scroll up) now discard removed lines rather than promoting them to scrollback. xterm.js, alacritty, kitty, and iTerm all behave this way; the previous behaviour duplicated content into scrollback that the user never saw on screen.
- `fix: DECALN homes the cursor`: DECALN (ESC # 8) now resets the cursor to the home position after filling the screen with E's. This is part of the original DEC VT100 spec.
- `fix: drop wide glyph at right edge when auto-wrap is off`: a width-2 glyph that would straddle the right edge is silently discarded when DECAWM is off, instead of overwriting the previous cell and parking the cursor in overshoot. Splits the previous "needs relocation" state into "wide-at-edge" and "overshoot" so the two cases can take different paths.
- `fix: emit reverse-wrap mode in the terminal dump`: `Vt::dump()` now serialises XTREVWRAP alongside the other modes so dump → re-feed reproduces it.

How this was verified

All commits keep `cargo test` green (125 unit + property tests after the series). On top of that, the changes are validated against an out-of-tree corpus of 175 real-world VT scenarios driven by a `@xterm/headless` oracle (a forked sub-process driven over JSON), confirming byte-for-byte agreement with the browser-side renderer.

I'm happy to break the series up further, rebase, or split commits if that helps review.