
Change Ctrl+I to perform comment-aware line wrapping #320

@ProgerXP

Add a new INI setting SplitLines (default is 1, enabled; 0 switches to the legacy Notepad2 / SCI_LINESSPLIT behaviour). If enabled, it changes the Edit > Lines > Split Lines (Ctrl+I) command as follows:

  • It is enabled regardless of whether the current selection is empty.
  • It does not use SCI_LINESSPLIT; instead it uses a new vim-like (but better) algorithm. In vim, line wrapping is invoked with gql (current line) and V + gq (selection).
  • When called with an empty selection, it enables "open ending" mode: it works on the area from the current line to EOF and stops once it has produced a single (full) line or hit EOF. This allows incremental line-by-line wrapping, taking words from or placing words onto subsequent lines as needed.
  • At the end, the command puts the caret after the last symbol of the last line in the work area (just as Ctrl+J does on an empty selection, see Join Lines with no selection #160).
  • Trailing line breaks are removed from the work area. For example, if 1 line and the 2 empty lines after it are selected, Ctrl+I works as if only the 1 line were selected. This is similar to Alt+X (Ignore newlines in Alt+X #30) and others. It allows calling Ctrl+I after selecting complete line(s) with the line gutter or Ctrl+Shift+Space, without having to press Shift+Arrow Left to exclude the final \ns from the selection.
  • Trailing whitespace on each line is removed before and after the command (as Alt+W does). In other words, trailing spaces in the work area must not affect the operation, and no trailing spaces may remain in the work area after it ends. Of each run of consecutive inner whitespace only one character is kept (like Alt+P but may preserve non-spaces).
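
The whitespace rules above could be expressed roughly as follows. This is only a sketch: the function name normalize_work_area is made up for illustration, and it assumes that "may preserve non-spaces" means the first character of each inner whitespace run (e.g. a tab) is kept as-is, and that leading indentation is left untouched.

```python
import re


def normalize_work_area(lines):
    """Trim trailing whitespace, collapse inner whitespace runs and
    drop trailing blank lines (hypothetical helper, not Notepad2e code)."""
    cleaned = []
    for line in lines:
        line = line.rstrip()
        indent_len = len(line) - len(line.lstrip())
        indent, body = line[:indent_len], line[indent_len:]
        # Assumption: keep the first character of each inner whitespace run.
        body = re.sub(r"(\s)\s+", r"\1", body)
        cleaned.append(indent + body)
    # Trailing line breaks are not part of the work area.
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return cleaned


print(normalize_work_area(["  foo   bar  ", "baz\t\tqux", "", "   "]))
```

Whitespace-only lines become empty strings here, which is what lets them count as blank lines later on.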

Details of the operation:

  • This command is comment-aware, i.e. it uses the same algorithm as Find/Replace - Skip comments mode #303 and Ctrl+Q (hardcoded COMMENT_INFO). Perhaps this issue and #303 should be implemented together.
  • Start by grouping lines in the work area into paragraphs (each having a specific indentation and an optional marker), then wrap each paragraph so that its lines do not exceed the current Long Lines limit (such as 80). Wrapping respects words just like Column Wrap (Ctrl+Shift+W), not breaking long words because they may be special (hashes, URLs, etc.).
    • Paragraphs are separated by blank lines (which must be preserved; lines consisting only of whitespace also count as blank because of trimming) or implicitly (depending on their indentation and marker). If two paragraphs have no marker, they can only be separated by blank line(s); otherwise the lines of the second paragraph are seen as sub-lines of the first.
    • Wrapping works on each paragraph individually, treating all lines of the paragraph as one stream, i.e. one joined line rather than separate lines (ignoring how the lines were split in the source). As a result, words can move from one line to another (but not across paragraphs); see the ET word in the example below, which was on line 3 but moved to line 2. Content outside of the work area is never changed.
  • When doing so, preserve the indentation and marker across all lines of the paragraph by prefixing each sub-line with a number of spaces equal to the indentation + marker length. For each paragraph, adjust the Long Line limit so that a longer indentation + marker causes fewer words to fit on a line.
# original text (3 lines, 1 paragraph)
  * Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    ET dolore magna aliqua. Ut enim ad minim veniam,  

# wrapped text
  * Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore ET
    dolore magna aliqua. Ut enim ad minim veniam,
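
A minimal sketch of the per-paragraph wrapping, using a plain greedy fill. The name wrap_paragraph and its parameters are hypothetical; the input is assumed to be the paragraph's lines with indentation and marker already stripped, and a Long Lines limit of 56 is assumed because the example above does not state one.

```python
def wrap_paragraph(text_lines, first_prefix, limit):
    """Join the paragraph into one word stream and refill it greedily.

    first_prefix is the indentation + marker + spaces of the first line;
    sub-lines get the same number of plain spaces (hanging indentation).
    Words are never broken, so a single long word may exceed the limit.
    """
    words = " ".join(text_lines).split()
    hanging = " " * len(first_prefix)
    out, lead, line_words = [], first_prefix, []
    for word in words:
        candidate = lead + " ".join(line_words + [word])
        if line_words and len(candidate) > limit:
            out.append(lead + " ".join(line_words))
            lead, line_words = hanging, [word]
        else:
            line_words.append(word)
    if line_words:
        out.append(lead + " ".join(line_words))
    return out


source = [
    "Lorem ipsum dolor sit amet, consectetur adipiscing",
    "elit, sed do eiusmod tempor incididunt ut labore",
    "ET dolore magna aliqua. Ut enim ad minim veniam,",
]
for line in wrap_paragraph(source, "  * ", 56):
    print(line)
```

Because the prefix is counted as part of each candidate line, a longer indentation + marker automatically leaves less room for words.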
  • If the first line of the work area is inside a line comment (i.e. inside // but not /* */), enable special comment mode. Looking at all non-blank lines (i.e. lines with something other than whitespace after //), determine where the line comment starts (from the first non-blank line only) and the minimal number of spaces after the comment token. Then, for every line, remove everything before the comment start, remove the comment start token itself (such as //, but the exact token may be different: #, --, etc.), remove the determined number of spaces after the token, and perform wrapping as in non-comment mode, with the Long Line limit reduced by the combined length of the indentation before the token, the token and the spaces after it. Finally, prefix all produced lines with the same indentation, token and spaces.
# original text (C-like scheme)
    //  Lorem ipsum dolor sit amet, consectetur adipiscing
   //   elit, sed do eiusmod tempor incididunt ut labore
// et dolore magna aliqua. Ut enim ad minim veniam,

determined:
- line indentation: 4 spaces
- comment token: //
- whitespace after token: 1 space

# cleaned text before wrapping
# paragraph 1, long line limit = 50 - 4 - 2 - 1 - 1
Lorem ipsum dolor sit amet, consectetur adipiscing
# paragraph 2, long line limit = 50 - 4 - 2 - 1 - 2
elit, sed do eiusmod tempor incididunt ut labore
# paragraph 3, long line limit = 50 - 4 - 2 - 1 - 0
et dolore magna aliqua. Ut enim ad minim veniam,

# wrapped text
 Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad minim veniam,

# final text (4 spaces + // + 1 space)
    //  Lorem ipsum dolor sit amet, consectetur adipiscing
    //   elit, sed do eiusmod tempor incididunt ut labore
    // et dolore magna aliqua. Ut enim ad minim veniam,
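
The "determined" step of comment mode might look like this. It is a sketch under assumptions: strip_line_comments is a made-up name, the token is passed in rather than taken from COMMENT_INFO, and every line is assumed to consist of optional whitespace, the token and the comment text.

```python
def strip_line_comments(lines, token="//"):
    """Return (prefix, cleaned_lines) for a run of line comments.

    prefix is the indentation of the first non-blank comment line, the
    token and the minimal number of spaces found after the token.
    """
    non_blank = []
    for line in lines:
        indent, found, rest = line.partition(token)
        if found and rest.strip():
            non_blank.append((indent, rest))
    if not non_blank:
        return "", list(lines)
    # Indentation comes from the first non-blank line only.
    indent = non_blank[0][0]
    # Minimal number of spaces after the token over all non-blank lines.
    gap = min(len(rest) - len(rest.lstrip(" ")) for _, rest in non_blank)
    cleaned = []
    for line in lines:
        _, found, rest = line.partition(token)
        cleaned.append(rest[gap:].rstrip() if found else line)
    return indent + token + " " * gap, cleaned


prefix, cleaned = strip_line_comments([
    "    //  Lorem ipsum dolor sit amet, consectetur adipiscing",
    "   //   elit, sed do eiusmod tempor incididunt ut labore",
    "// et dolore magna aliqua. Ut enim ad minim veniam,",
])
for text in cleaned:
    print(prefix + text)
```

The cleaned lines keep whatever spaces exceed the minimum, so they can be wrapped as ordinary text with the limit reduced by len(prefix) and then re-prefixed.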
  • Column Wrap seems to handle this correctly, but note that it is important to recognize paragraphs and bullet points in the text (as described elsewhere) and not to join them together. Examples:
# original text
    Lorem
    
    Ipsum
    
# wrapped text (same)
    Lorem
    
    Ipsum

# wrong wrapped text
    Lorem Ipsum
# original text
  * Lorem
* Ipsum

# wrapped text
  * Lorem
  * Ipsum

# wrong wrapped text
  * Lorem * Ipsum
# original text (C-like)
    // Lorem
      //*Ipsum

# cleaned
Lorem
*Ipsum

# wrapped text
    // Lorem
    // *Ipsum
  • However, Column Wrap preserves line breaks, which is incorrect: before wrapping, an algorithm like Join Paragraphs must be applied to allow rearranging words (but Join Paragraphs does not respect bullet points). You can either implement improved word wrapping for Ctrl+I or change the Column Wrap and/or Join Paragraphs algorithms if it is simpler to reuse them (so that Ctrl+Shift+W and Ctrl+Shift+J also work differently from Notepad2 if SplitLines is 1).
# original text
aa
bb
* cc
dd

ee

# wrapped text
aa bb
* cc dd

ee
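
The grouping step from the example above, as a rough sketch. split_paragraphs and the MARKER pattern are illustrative names; only blank lines and markers start a new paragraph here, and separation by indentation alone is deliberately left out.

```python
import re

# Known markers at the start of a line, after optional indentation.
MARKER = re.compile(r"\s*(?:[#>=?*]|[0-9]+[:).])")


def split_paragraphs(lines):
    """Group lines into paragraphs; each blank line is kept as an empty group."""
    groups = []
    start_new = True
    for line in lines:
        if not line.strip():
            groups.append([])  # blank lines must be preserved
            start_new = True
        elif start_new or MARKER.match(line):
            groups.append([line])
            start_new = False
        else:
            groups[-1].append(line)  # sub-line of the current paragraph
    return groups


groups = split_paragraphs(["aa", "bb", "* cc", "dd", "", "ee"])
print([" ".join(group) for group in groups])
```

Joining each group before wrapping is what lets words move between lines of one paragraph without ever crossing a bullet point or a blank line.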
  • A "bullet point" is one of the known "markers" placed at the beginning of a line, with optional whitespace after it. Examples:
* point
#point
>  point

# comment mode
    // * point
    //#point
    //   >   point
  • Known markers: # > = ? * plus special "numeric points", which are ANSI digits followed by one of : ) . (colon, closing parenthesis, period):
123) Lorem
  4.   Ipsum
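
Marker recognition could be sketched with a single regular expression. ParsedLine, parse_line and hanging_width are hypothetical names, and "ANSI digits" is assumed to mean 0-9 only.

```python
import re
from typing import NamedTuple

LINE = re.compile(
    r"^(?P<indent>\s*)"
    r"(?P<marker>[#>=?*]|[0-9]+[:).])?"  # bullet or "numeric point"
    r"(?P<gap>\s*)"
    r"(?P<text>.*)$"
)


class ParsedLine(NamedTuple):
    indent: str
    marker: str
    gap: str
    text: str

    @property
    def hanging_width(self):
        """Number of spaces that sub-lines of this paragraph would need."""
        return len(self.indent) + len(self.marker) + len(self.gap)


def parse_line(line):
    """Split a single line (without line break) into its four parts."""
    m = LINE.match(line)
    return ParsedLine(m["indent"], m["marker"] or "", m["gap"], m["text"])


for sample in ["* point", "#point", ">  point", "123) Lorem", "  4.   Ipsum"]:
    print(parse_line(sample), parse_line(sample).hanging_width)
```

In comment mode the same parsing would be applied to the cleaned text, i.e. after the comment token and its spaces have been removed.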
  • Paragraph lines should be wrapped with respect to their own sub-indentation: when producing line breaks within the same paragraph (i.e. wrapping a sub-line), insert indentation before each such line so that it lines up with the text after the first line's marker (if any). In other words, sub-lines are prefixed with a number of U+0020 spaces equal to the combined length of the indentation, the marker and the spaces after the marker (usually 1), i.e. everything before the first non-whitespace symbol of the text.
# original text
*  Lorem ipsum

determined:
- marker = *
- spaces after it = 2
- indentation of sub-lines = 1+2 spaces

# wrapped text (limit = 5)
*  Lorem
   ipsum

# wrapped text without this consideration
*  Lorem
ipsum
# original text
12.  Lorem ipsum

determined:
- marker = 12.
- spaces after it = 2
- indentation of sub-lines = 3+2 spaces

# wrapped text (limit = 5)
12.  Lorem
     ipsum
  • When wrapping, ensure that the first word of a sub-line cannot be mistaken for a marker by a later Ctrl+I. Do this either by¹ treating such words as part of the previous word (i.e. wrapping them together, which is better) or by² moving them to the previous line (which may then exceed the limit).
# original text (result must be the same with limit = 5)
* *a *b

# incorrect result
* *a
  *b

# original text 
* a b *c

# wrapped text¹
* a
  b *c
  
# alternative wrapped text² 
* a b *c
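
Approach ¹ can be sketched as a pre-pass over the word stream that glues every marker-like word to the word before it, so the wrapper never sees it as a separate unit. glue_marker_like and MARKER_LIKE are made-up names; the words passed in are assumed to be the paragraph text after its own marker.

```python
import re

# A word that a later Ctrl+I could mistake for a marker at line start.
MARKER_LIKE = re.compile(r"(?:[#>=?*]|[0-9]+[:).])")


def glue_marker_like(words):
    """Merge marker-like words into the preceding word (approach 1)."""
    units = []
    for word in words:
        if units and MARKER_LIKE.match(word):
            units[-1] += " " + word  # wrapped together with the previous word
        else:
            units.append(word)
    return units


print(glue_marker_like("a b *c".split()))
print(glue_marker_like("*a *b".split()))
```

The first word of a paragraph is left alone because it always stays on the marker's own line; each resulting unit is then treated as unbreakable by the wrapper, which is why "b *c" stays together even with limit = 5.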
  • Note: bullet points may be exempt from the line limit, i.e. the algorithm may treat a bullet point marker as monolithic and not subject to wrapping. For example, if this simplifies the implementation, 1234567. may never be wrapped or broken and may exceed the Long Line limit.
  • This command and the algorithms it uses (Join Paragraphs, etc.) should be Unicode-safe.
  • Finally, try to isolate this feature (i.e. the wrapping algorithms) so that unit tests can be written for the different cases presented in this issue.
