Closed
Description
Add a new INI setting SplitLines (default is 1, enabled; 0 switches to legacy Notepad2 / SCI_LINESSPLIT behaviour). If enabled, it changes the Edit > Lines > Split Lines (Ctrl+I) command as follows:
- It is enabled regardless of whether the current selection is empty.
- It does not use SCI_LINESSPLIT; instead it uses a new vim-like algorithm (but better). In vim, line wrapping is called with gql (current line) and V + ql (selection).
- When called with an empty selection, enable "open ending" mode: work on the area from the current line to EOF and stop once a single (full) line has been produced or EOF is hit. This allows incremental line-by-line wrapping, taking/placing words from/to subsequent lines as needed.
- In the end, the command puts the caret after the last symbol of the last line in the work area (just like Ctrl+J on an empty selection, see Ctrl+J Join Lines with no selection #160).
- Trailing line breaks are removed from the work area. For example, if I have selected 1 line and 2 empty lines after it, Ctrl+I works as if I had selected only 1 line. This is similar to Alt+X (Ignore newlines in Alt+X #30) and others. Doing so allows calling Ctrl+I after selecting complete line(s) using the line gutter or Ctrl+Shift+Space, without having to press Shift+Arrow Left to exclude the final \n characters from the selection.
- Trailing whitespace on each line is removed before and after the command (as Alt+W does). In other words, trailing spaces that the work area had should not change the operation, and no trailing spaces should exist in the work area after it ends. Of inner consecutive whitespace only one character is kept (like Alt+P but may preserve non-spaces).
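To make the trimming rules above easier to turn into unit tests, here is a minimal Python sketch. It is an illustration only, not the editor's code: `normalize_work_area` is a hypothetical name, lines are assumed to arrive without their EOL characters, and I assume that "only one is kept" means the first character of each inner whitespace run.

```python
import re

def normalize_work_area(lines):
    """Trim a work area given as a list of lines without EOLs (hypothetical helper)."""
    trimmed = list(lines)
    # Trailing blank lines are dropped, so "1 line + 2 empty lines" acts like 1 line.
    while trimmed and not trimmed[-1].strip():
        trimmed.pop()
    result = []
    for line in trimmed:
        line = line.rstrip()                      # trailing whitespace never matters
        body = line.lstrip()
        indent = line[: len(line) - len(body)]    # leading indentation is kept as-is
        # Assumption: keep the first character of every inner whitespace run.
        body = re.sub(r"(\s)\s+", r"\1", body)
        result.append(indent + body)
    return result
```

Whitespace-only lines in the middle of the work area come out as empty strings, which matches the later rule that such lines count as blank.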
Details of the operation:
- This command is comment-aware, i.e. it uses the same algorithm as Find/Replace - Skip comments mode #303 and Ctrl+Q (hardcoded COMMENT_INFO). Perhaps this issue and Find/Replace - Skip comments mode #303 should be implemented together.
- Start by grouping lines in the work area into paragraphs (each having a specific indentation and an optional marker), then wrap each paragraph so that its lines do not exceed the current Long Lines limit (such as 80). Wrapping respects words where possible just like Column Wrap (Ctrl+Shift+W), not breaking long words because they may be special (hashes, URLs, etc.).
- Paragraphs are separated by blank lines (which must be preserved; lines consisting of whitespace are also blank because of trimming) or implicitly (depending on their indentation and marker). If two paragraphs have no marker then they can only be separated by blank line(s), else the second paragraph is seen as sub-lines of the first.
- Wrapping works individually on each paragraph, treating all paragraph lines as one stream, i.e. one joined line, not separate lines (ignoring how lines were split in the source). As a result, words can be moved from one line to another (but not across paragraphs); see the ET word in the example below, which appeared on line 3 but was moved to line 2. Content outside of the work area is never changed.
- When doing so, preserve indentation and marker among all paragraph lines by prefixing each sub-line with a number of spaces equal to indentation + marker length. For each paragraph, adjust the Long Line limit so that a longer indentation + marker causes fewer words to fit on a line.
# original text (3 lines, 1 paragraph)
* Lorem ipsum dolor sit amet, consectetur adipiscing
elit, sed do eiusmod tempor incididunt ut labore
ET dolore magna aliqua. Ut enim ad minim veniam,
# wrapped text
* Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore ET
  dolore magna aliqua. Ut enim ad minim veniam,
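The stream-based wrapping described above could be sketched roughly as follows in Python. This is only an illustration under assumptions: `wrap_paragraph` is a hypothetical name, the caller has already split off the first line's prefix (indentation + marker + spaces after it), and a simple greedy fill is used, which the issue does not prescribe.

```python
def wrap_paragraph(lines, prefix, limit):
    """Re-wrap one paragraph; `prefix` is indentation + marker + gap of its first line."""
    # Treat the whole paragraph as one stream of words, ignoring source line breaks.
    words = " ".join(lines).split()
    hang = " " * len(prefix)        # sub-lines are indented to the prefix width
    width = limit - len(prefix)     # a longer prefix leaves room for fewer words
    out, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if current and len(candidate) > width:
            out.append(current)
            current = word          # an over-long word stays whole on its own line
        else:
            current = candidate
    if current:
        out.append(current)
    return [(prefix if i == 0 else hang) + text for i, text in enumerate(out)]
```

Because the words are re-flowed from a single stream, a word such as ET can move up to the previous line whenever it still fits there.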
- If the first line of the work area is inside a line comment (i.e. inside // but not /* */), enable special comment mode: for all non-blank lines (i.e. with something other than whitespace after //), determine where the line comment starts (this is only for the first non-blank line) and the minimal number of spaces after it, then for all lines remove everything before the comment start (such as //, but the exact comment token may be different: #, --, etc.), remove the comment start token, remove the aforedetermined number of spaces after the token and perform wrapping as if in non-comment mode (but reduce the Long Line limit by the length of the indentation before the // token, the // token and the spaces after //). After that, prefix all produced lines with the same indentation before //, the // token and the spaces after //.
# original text (C-like scheme)
    //  Lorem ipsum dolor sit amet, consectetur adipiscing
    //   elit, sed do eiusmod tempor incididunt ut labore
    // et dolore magna aliqua. Ut enim ad minim veniam,
determined:
- line indentation: 4 spaces
- comment token: //
- whitespace after token: 1 space
# cleaned text before wrapping
# paragraph 1, long line limit = 50 - 4 - 2 - 1 - 1
 Lorem ipsum dolor sit amet, consectetur adipiscing
# paragraph 2, long line limit = 50 - 4 - 2 - 1 - 2
  elit, sed do eiusmod tempor incididunt ut labore
# paragraph 3, long line limit = 50 - 4 - 2 - 1 - 0
et dolore magna aliqua. Ut enim ad minim veniam,
# wrapped text
 Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad minim veniam,
# final text (4 spaces + // + 1 space)
    //  Lorem ipsum dolor sit amet, consectetur adipiscing
    //   elit, sed do eiusmod tempor incididunt ut labore
    // et dolore magna aliqua. Ut enim ad minim veniam,
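The strip-then-restore part of comment mode might look like the Python sketch below. Both function names are hypothetical, the work area is assumed to be non-empty, every line is assumed to contain the comment token, and the token defaults to `//` purely for illustration (the real one would come from COMMENT_INFO).

```python
def split_comment_prefix(lines, token="//"):
    """Return (prefix, cleaned_lines) for a block of line comments (hypothetical helper)."""
    # Everything after the token; a body of only whitespace makes the line blank.
    bodies = [line[line.index(token) + len(token):] for line in lines]
    non_blank = [i for i, body in enumerate(bodies) if body.strip()]
    # The comment start is taken from the first non-blank line only.
    first = lines[non_blank[0] if non_blank else 0]
    indent = first[: first.index(token)]
    # Minimal number of spaces after the token among the non-blank lines.
    gap = min((len(bodies[i]) - len(bodies[i].lstrip(" ")) for i in non_blank), default=0)
    prefix = indent + token + " " * gap
    cleaned = [body[gap:].rstrip() if body.strip() else "" for body in bodies]
    return prefix, cleaned

def restore_comment_prefix(prefix, lines):
    """Put the common prefix back in front of every wrapped line."""
    return [(prefix + line).rstrip() for line in lines]
```

The cleaned lines would then go through the normal wrapping with the Long Line limit reduced by `len(prefix)`, and `restore_comment_prefix` would be applied to the result.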
- Column Wrap seems to handle this correctly but take note that it is important to recognize paragraphs and bullet points in the text (as described elsewhere) and not join them together. Examples:
# original text
Lorem
Ipsum
# wrapped text (same)
Lorem
Ipsum
# wrong wrapped text
Lorem Ipsum
# original text
* Lorem
* Ipsum
# wrapped text
* Lorem
* Ipsum
# wrong wrapped text
* Lorem * Ipsum
# original text (C-like)
// Lorem
//*Ipsum
# cleaned
Lorem
*Ipsum
# wrapped text
// Lorem
// *Ipsum
- However, Column Wrap preserves line breaks, which is incorrect: before wrapping, an algorithm like Join Paragraphs must be applied to allow rearranging words (but Join Paragraphs does not respect bullet points). You can either implement improved word wrapping for Ctrl+I or change the Column Wrap and/or Join Paragraphs algorithms if it is simpler to reuse them (so Ctrl+Shift+W and Ctrl+Shift+J also work differently from Notepad2 if SplitLines is 1).
# original text
aa
bb
* cc
dd
ee
# wrapped text
aa bb
* cc dd
ee
- "Bullet point" is one of the known "markers" placed at the line beginning, having optional whitespace after it. Examples:
* point
#point
> point
# comment mode
// * point
//#point
// > point
- Known markers: # > = ? * plus special "numeric points", which are ANSI digits + one of: ) .
123) Lorem
4. Ipsum
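For testing purposes, marker detection could be sketched in Python like this. The names `Marker`, `MARKER_RE` and `parse_marker` are hypothetical, and the regular expression is only my reading of the list above: one of `# > = ? *`, or ASCII digits followed by `)` or `.`, with optional whitespace after the marker.

```python
import re
from typing import NamedTuple, Optional

class Marker(NamedTuple):
    indent: str   # whitespace before the marker
    marker: str   # the marker itself, e.g. "*" or "12."
    gap: str      # optional whitespace after the marker
    text: str     # the rest of the line

    @property
    def hang(self) -> int:
        """Number of spaces a sub-line needs to sit under the first word."""
        return len(self.indent) + len(self.marker) + len(self.gap)

# [0-9] is used instead of \d so that only ASCII digits count as numeric points.
MARKER_RE = re.compile(r"^(\s*)([#>=?*]|[0-9]+[).])(\s*)(.*)$")

def parse_marker(line: str) -> Optional[Marker]:
    """Return the marker parts of a line, or None if it has no known marker."""
    match = MARKER_RE.match(line)
    return Marker(*match.groups()) if match else None
```

Note that this simple pattern would also treat a line starting with `4.5` as a numeric point; whether that is desired is left open here.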
- Paragraph lines should be wrapped with respect to their own sub-indentation: when producing line breaks that belong to the same paragraph (i.e. wrapping a sub-line), insert indentation before each such line so that it appears directly under the first word after the first line's marker (if any). In other words, each sub-line gains a number of U+0020 spaces equal to the length of the indentation, the marker and the spaces after the marker, i.e. everything before the first non-whitespace symbol of the text (the number of spaces after the marker is usually 1).
# original text
*  Lorem ipsum
determined:
- marker = *
- spaces after it = 2
- indentation of sub-lines = 1+2 spaces
# wrapped text (limit = 5)
*  Lorem
   ipsum
# wrapped text without this consideration
*  Lorem
ipsum
# original text
12.  Lorem ipsum
determined:
- marker = 12.
- spaces after it = 2
- indentation of sub-lines = 3+2 spaces
# wrapped text (limit = 5)
12.  Lorem
     ipsum
- When wrapping, ensure that the first word in the sub-line cannot be confused with a marker by a later Ctrl+I. Do this either by¹ pretending that such words are part of the previous word (i.e. are wrapped together, better) or by² moving them to the previous line (may become over limit).
# original text (result must be the same with limit = 5)
* *a *b
# incorrect result
* *a
  *b
# original text
* a b *c
# wrapped text¹
* a
  b *c
# alternative wrapped text²
* a b *c
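Approach ¹ could be prototyped in Python as a small pre-pass over the paragraph's words, run before the line filling. `MARKER_LIKE` and `glue_marker_like` are hypothetical names, and the pattern assumes the same marker set as listed in this issue.

```python
import re

# A word "looks like a marker" if it starts with a known marker character
# or with ASCII digits followed by ")" or ".".
MARKER_LIKE = re.compile(r"^(?:[#>=?*]|[0-9]+[).])")

def glue_marker_like(words):
    """Merge each marker-like word into the word before it (approach 1)."""
    units = []
    for word in words:
        if units and MARKER_LIKE.match(word):
            # The pair is wrapped as one unit, so the marker-like word
            # can never end up at the start of a sub-line.
            units[-1] += " " + word
        else:
            units.append(word)
    return units
```

For the paragraph `* a b *c`, the words after the marker are `a`, `b`, `*c`, which become the units `a` and `b *c`; for `* *a *b` the only unit is `*a *b`, so the line stays intact whatever the limit is.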
- Note: bullet points may be exempt from the line limit, i.e. the algorithm may consider a bullet point marker monolithic and not subject to wrapping. For example, if this simplifies the implementation then 1234567. may never be wrapped or broken and may exceed the Long Line limit.
- This command and the algorithms that it uses (Join Paragraphs, etc.) should be Unicode-safe.
- Finally, try to separate this feature (i.e. wrapping algorithms) so it is possible to write unit tests to test different cases, as presented in this issue.