LaTeX: optionally apply a second forceful wrapping of long code lines #8854

Merged 9 commits on Feb 9, 2021
1 change: 1 addition & 0 deletions CHANGES
@@ -131,6 +131,7 @@ Bugs fixed
* #8780: LaTeX: long words in narrow columns may not be hyphenated
* #8788: LaTeX: ``\titleformat`` last argument in sphinx.sty should be
bracketed, not braced (and is anyhow not needed)
* #8849: LaTeX: code-block printed out of margin
Member:

As you noted yourself, it would be better to mention the new configuration keys here.

Contributor Author:

Thanks for reviewing, done. I will merge after testing completes. The behaviour is "opt-in" so no change to users if not using the feature.
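
For readers who want to try the opt-in behaviour, a minimal ``conf.py`` sketch could look like the following. It only uses the keys introduced in this pull request (``verbatimforcewraps``, ``verbatimmaxoverfull``); the value ``3`` simply restates the documented default, and the rest of a real ``conf.py`` is assumed to exist elsewhere.

```python
# conf.py fragment (illustrative sketch, not taken from the pull request)
latex_elements = {
    # Boolean keys accept a bare name as shorthand for "=true".
    # verbatimmaxoverfull=3 restates the documented default.
    'sphinxsetup': "verbatimforcewraps, verbatimmaxoverfull=3",
}
```

Projects that leave ``verbatimforcewraps`` unset keep the previous wrapping behaviour.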


Testing
--------
63 changes: 58 additions & 5 deletions doc/latex.rst
@@ -585,7 +585,9 @@ The below is included at the end of the chapter::

\endgroup

LaTeX boolean keys require *lowercase* ``true`` or ``false`` values.
LaTeX syntax for boolean keys requires *lowercase* ``true`` or ``false``,
e.g. ``'sphinxsetup': "verbatimwrapslines=false"``. If setting the
boolean key to ``true``, ``=true`` is optional.
Spaces around the commas and equal signs are ignored, spaces inside LaTeX
macros may be significant.

@@ -636,14 +638,65 @@ macros may be significant.
Boolean to specify if long lines in :rst:dir:`code-block`\ 's contents are
wrapped.

If ``true``, line breaks may happen at spaces (the last space before the
line break will be rendered using a special symbol), and at ASCII
punctuation characters (i.e. not at letters or digits). Whenever a long
string has no break points, it is moved to the next line. If it is
longer than the line width, it will overflow.

Default: ``true``

``literalblockcappos``
Decides the caption position: either ``b`` ("bottom") or ``t`` ("top").
``verbatimforcewraps``
Boolean to specify if long lines in :rst:dir:`code-block`\ 's contents
should be forcefully wrapped to never overflow due to long strings.

Default: ``t``
.. note::

.. versionadded:: 1.7
It is assumed that the Pygments_ LaTeXFormatter has not been used with
its ``texcomments`` or similar options which allow additional
(arbitrary) LaTeX mark-up.

Also, if :confval:`latex_engine` is set to ``'pdflatex'``, only
the default LaTeX handling of Unicode code points, i.e. ``utf8`` and not
``utf8x``, is supported.

.. _Pygments: https://pygments.org/

Default: ``false``

.. versionadded:: 3.5.0

``verbatimmaxoverfull``
A number. If an unbreakable long string has length larger than the total
linewidth plus this number of characters, and if ``verbatimforcewraps``
mode is on, the input line will be reset using the forceful algorithm
which applies breakpoints at each character.

Default: ``3``

.. versionadded:: 3.5.0

``verbatimmaxunderfull``
A number. If ``verbatimforcewraps`` mode applies, and if after applying
the line wrapping at spaces and punctuation, the first part of the split
line is lacking at least that number of characters to fill the available
width, then the input line will be reset using the forceful algorithm.

As the default is set to a high value, the forceful algorithm is triggered
only in the overfull case, i.e. in the presence of a string longer than the
full linewidth. Set this to ``0`` to force all input lines to be hard wrapped
at the current available linewidth::

latex_elements = {
'sphinxsetup': "verbatimforcewraps, verbatimmaxunderfull=0",
}

This can be done locally for a given code-block via the use of raw latex
directives to insert suitable ``\sphinxsetup`` into the latex file.

Default: ``100``

.. versionadded:: 3.5.0

``verbatimhintsturnover``
Boolean to specify if code-blocks display "continued on next page" and
183 changes: 180 additions & 3 deletions sphinx/texinputs/sphinx.sty
@@ -334,6 +334,9 @@
% verbatim
\DeclareBoolOption[true]{verbatimwithframe}
\DeclareBoolOption[true]{verbatimwrapslines}
\DeclareBoolOption[false]{verbatimforcewraps}
\DeclareStringOption[3]{verbatimmaxoverfull}
\DeclareStringOption[100]{verbatimmaxunderfull}
\DeclareBoolOption[true]{verbatimhintsturnover}
\DeclareBoolOption[true]{inlineliteralwraps}
\DeclareStringOption[t]{literalblockcappos}
@@ -1171,13 +1174,187 @@
% no need to restore \fboxsep here, as this ends up in a \hbox from fancyvrb
}%
% \sphinxVerbatimFormatLine will be set locally to one of those two:
\newcommand\sphinxVerbatimFormatLineWrap[1]{%
\hsize\linewidth
\newcommand\sphinxVerbatimFormatLineWrap{%
\hsize\linewidth
\ifspx@opt@verbatimforcewraps
\expandafter\spx@verb@FormatLineForceWrap
\else\expandafter\spx@verb@FormatLineWrap
\fi
}%
\newcommand\sphinxVerbatimFormatLineNoWrap[1]{\hb@xt@\linewidth{\strut #1\hss}}%
\long\def\spx@verb@FormatLineWrap#1{%
\vtop{\raggedright\hyphenpenalty\z@\exhyphenpenalty\z@
\doublehyphendemerits\z@\finalhyphendemerits\z@
\strut #1\strut}%
}%
\newcommand\sphinxVerbatimFormatLineNoWrap[1]{\hb@xt@\linewidth{\strut #1\hss}}%
%
% The normal line wrapping allows breaks at spaces and ascii non
% letters, non digits. The \raggedright above means there will be
% an overfilled line only if some non-breakable "word" was
% encountered, which is longer than a line (it is moved always to
% be on its own on a new line).
%
% The "forced" line wrapping will parse the tokens to add potential
% breakpoints at each character. As some strings are highlighted,
% we have to apply the highlighting character per character, which
% requires to manipulate the output of the Pygments LaTeXFormatter.
%
% Doing this at latex level is complicated. The contents should
% be as expected: i.e. some active characters from
% \sphinxbreaksviaactive, some Pygments character escapes such as
% \PYGZdl{}, and the highlighting \PYG macro with always 2
% arguments. No other macros should be there, except perhaps
% zero-parameter macros. In particular:
% - the texcomments Pygments option must be set to False
%
% With pdflatex, Unicode input gives multi-bytes characters
% where the first byte is active. We support the "utf8" macros
% only. "utf8x" is not supported.
%
% The highlighting macro \PYG will be applied character per
% character. Highlighting via a colored background gives thus a
% chain of small colored boxes which may cause some artefact in
% some pdf viewers. Can't do anything here if we do want the line
% break to be possible.
%
% First, a measurement step determines what the standard line
% wrapping would give (i.e. line breaks only at spaces and non-letter,
% non-digit ascii characters), cf. TeX by Topic for the basic
% dissecting technique: unfortunately, when building a vertical
% box, TeX does not store in an accessible way the maximal
% line-width reached during paragraph building.
%
% If the max width exceeds the linewidth by at least 4 character
Contributor Author:

The 3 in 4=3+1 is in truth the value of verbatimmaxoverfull key to 'sphinxsetup'

% widths, then we apply the "force wrapping" with potential line
% break at each character, else we don't.
\long\def\spx@verb@FormatLineForceWrap#1{%
% \spx@image@box is a scratch box register that we can use here
\global\let\spx@verb@maxwidth\z@
\global\let\spx@verb@minwidth\linewidth
\setbox\spx@image@box
\vtop{\raggedright\hyphenpenalty\z@\exhyphenpenalty\z@
\doublehyphendemerits\z@\finalhyphendemerits\z@
\strut #1\strut\@@par
\spx@verb@getwidths}%
\ifdim\spx@verb@maxwidth>
\dimexpr\linewidth+\spx@opt@verbatimmaxoverfull\fontcharwd\font`X \relax
\spx@verb@FormatLineWrap{\spx@verb@wrapPYG #1\spx@verb@wrapPYG}%
\else
\ifdim\spx@verb@minwidth<
\dimexpr\linewidth-\spx@opt@verbatimmaxunderfull\fontcharwd\font`X \relax
\spx@verb@FormatLineWrap{\spx@verb@wrapPYG #1\spx@verb@wrapPYG}%
\else
\spx@verb@FormatLineWrap{#1}%
\fi\fi
}%
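
The decision made by ``\spx@verb@FormatLineForceWrap`` can be paraphrased in Python as a rough sketch. The function name and its parameters are hypothetical; ``line_widths`` stands for the natural widths of the lines produced by the standard wrapping (what ``\spx@verb@getwidths`` measures), and ``char_width`` for the width of ``X`` in the current font.

```python
def should_force_wrap(line_widths, linewidth, char_width,
                      max_overfull=3, max_underfull=100):
    """Return True if the line should be re-set with per-character breaks.

    Hypothetical paraphrase of the TeX logic, not the actual implementation.
    """
    # The macro starts from maxwidth = 0 and minwidth = linewidth.
    max_width = max(line_widths, default=0.0)
    min_width = min(min(line_widths, default=linewidth), linewidth)
    # Overfull case: some unbreakable string sticks out too far.
    if max_width > linewidth + max_overfull * char_width:
        return True
    # Underfull case: some wrapped line leaves too much empty space.
    if min_width < linewidth - max_underfull * char_width:
        return True
    return False
```

With the default ``max_underfull`` of 100 the second threshold is usually negative, so only the overfull test can trigger; with ``max_underfull=0`` any line shorter than the full width triggers the forceful algorithm.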
% auxiliary paragraph dissector to get max and min widths
\newbox\spx@scratchbox
\def\spx@verb@getwidths {%
\unskip\unpenalty
\setbox\spx@scratchbox\lastbox
\ifvoid\spx@scratchbox
\else
\setbox\spx@scratchbox\hbox{\unhbox\spx@scratchbox}%
\ifdim\spx@verb@maxwidth<\wd\spx@scratchbox
\xdef\spx@verb@maxwidth{\number\wd\spx@scratchbox sp}%
\fi
\ifdim\spx@verb@minwidth>\wd\spx@scratchbox
\xdef\spx@verb@minwidth{\number\wd\spx@scratchbox sp}%
\fi
\expandafter\spx@verb@getwidths
\fi
}%
% auxiliary macros to implement "cut long line even in middle of word"
\catcode`Z=3 % safe delimiter
\def\spx@verb@wrapPYG{%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@i
}%
\def\spx@verb@wrapPYG@i{%
\ifx\spx@nexttoken\spx@verb@wrapPYG\let\next=\@gobble\else
\ifx\spx@nexttoken\PYG\let\next=\spx@verb@wrapPYG@PYG@onebyone\else
\discretionary{}{\sphinxafterbreak}{}%
\let\next\spx@verb@wrapPYG@ii
\fi\fi
\next
}%
% Let's recognize active characters. We support only utf8, not utf8x.
% And here #1 should not have picked up (non empty) braced contents
\long\def\spx@verb@wrapPYG@ii#1{%
\ifcat\noexpand~\noexpand#1\relax% active character
\expandafter\spx@verb@wrapPYG@active
\else % non-active character, control sequence such as \PYGZdl, or empty
\expandafter\spx@verb@wrapPYG@one
\fi {#1}%
}%
\long\def\spx@verb@wrapPYG@active#1{%
% Let's hope expansion of active character does not really require arguments,
% as we certainly don't want to go into expanding upfront token stream anyway.
\expandafter\spx@verb@wrapPYG@iii#1{}{}{}{}{}{}{}{}{}Z#1%
}%
\long\def\spx@verb@wrapPYG@iii#1#2Z{%
\ifx\UTFviii@four@octets#1\let\next=\spx@verb@wrapPYG@four\else
\ifx\UTFviii@three@octets#1\let\next=\spx@verb@wrapPYG@three\else
\ifx\UTFviii@two@octets#1\let\next=\spx@verb@wrapPYG@two\else
\let\next=\spx@verb@wrapPYG@one
\fi\fi\fi
\next
}%
\long\def\spx@verb@wrapPYG@one #1{#1\futurelet\spx@nexttoken\spx@verb@wrapPYG@i}%
\long\def\spx@verb@wrapPYG@two #1#2{#1#2\futurelet\spx@nexttoken\spx@verb@wrapPYG@i}%
\long\def\spx@verb@wrapPYG@three #1#2#3{#1#2#3\futurelet\spx@nexttoken\spx@verb@wrapPYG@i}%
\long\def\spx@verb@wrapPYG@four #1#2#3#4{#1#2#3#4\futurelet\spx@nexttoken\spx@verb@wrapPYG@i}%
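
As an analogy for the one/two/three/four dispatch above, the following Python sketch groups a UTF-8 byte string into per-character units based on the leading byte. The function name is hypothetical, and the input is assumed to be valid UTF-8; the TeX code instead inspects the expansion of the active first byte (``\UTFviii@two@octets`` and so on).

```python
def utf8_units(data: bytes):
    """Split valid UTF-8 bytes into one chunk per encoded character."""
    units = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead >= 0xF0:      # four-octet sequence
            n = 4
        elif lead >= 0xE0:    # three-octet sequence
            n = 3
        elif lead >= 0xC0:    # two-octet sequence
            n = 2
        else:                 # plain ASCII byte
            n = 1
        units.append(data[i:i + n])
        i += n
    return units
```

Keeping the octets of one character together is what allows a breakpoint to be placed before each character without splitting a multi-byte sequence.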
% Replace \PYG by itself applied one character at a time! This way breakpoints
% can be inserted.
\def\spx@verb@wrapPYG@PYG@onebyone#1#2#3{% #1 = \PYG, #2 = highlight spec, #3 = tokens
\def\spx@verb@wrapPYG@PYG@spec{{#2}}%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@PYG@i#3Z%
}%
\def\spx@verb@wrapPYG@PYG@i{%
\ifx\spx@nexttokenZ\let\next=\spx@verb@wrapPYG@PYG@done\else
\discretionary{}{\sphinxafterbreak}{}%
\let\next\spx@verb@wrapPYG@PYG@ii
\fi
\next
}%
\def\spx@verb@wrapPYG@PYG@doneZ{\futurelet\spx@nexttoken\spx@verb@wrapPYG@i}%
\long\def\spx@verb@wrapPYG@PYG@ii#1{%
\ifcat\noexpand~\noexpand#1\relax% active character
\expandafter\spx@verb@wrapPYG@PYG@active
\else % non-active character, control sequence such as \PYGZdl, or empty
\expandafter\spx@verb@wrapPYG@PYG@one
\fi {#1}%
}%
\long\def\spx@verb@wrapPYG@PYG@active#1{%
% Let's hope expansion of active character does not really require arguments,
% as we certainly don't want to go into expanding upfront token stream anyway.
\expandafter\spx@verb@wrapPYG@PYG@iii#1{}{}{}{}{}{}{}{}{}Z#1%
}%
\long\def\spx@verb@wrapPYG@PYG@iii#1#2Z{%
\ifx\UTFviii@four@octets#1\let\next=\spx@verb@wrapPYG@PYG@four\else
\ifx\UTFviii@three@octets#1\let\next=\spx@verb@wrapPYG@PYG@three\else
\ifx\UTFviii@two@octets#1\let\next=\spx@verb@wrapPYG@PYG@two\else
\let\next=\spx@verb@wrapPYG@PYG@one
\fi\fi\fi
\next
}%
\long\def\spx@verb@wrapPYG@PYG@one#1{%
\expandafter\PYG\spx@verb@wrapPYG@PYG@spec{#1}%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@PYG@i
}%
\long\def\spx@verb@wrapPYG@PYG@two#1#2{%
\expandafter\PYG\spx@verb@wrapPYG@PYG@spec{#1#2}%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@PYG@i
}%
\long\def\spx@verb@wrapPYG@PYG@three#1#2#3{%
\expandafter\PYG\spx@verb@wrapPYG@PYG@spec{#1#2#3}%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@PYG@i
}%
\long\def\spx@verb@wrapPYG@PYG@four#1#2#3#4{%
\expandafter\PYG\spx@verb@wrapPYG@PYG@spec{#1#2#3#4}%
\futurelet\spx@nexttoken\spx@verb@wrapPYG@PYG@i
}%
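
The effect of the ``\spx@verb@wrapPYG@PYG@...`` macros on a single highlighted token can be illustrated at string level with the Python sketch below. The helper name is hypothetical, and it assumes the token text consists of plain characters only (no Pygments escapes such as ``\PYGZdl{}``, which the TeX code handles as separate units).

```python
# Break opportunity inserted before every character, as in the macros above.
BREAK = r"\discretionary{}{\sphinxafterbreak}{}"


def pyg_one_by_one(spec: str, text: str) -> str:
    # Re-emit the highlighting macro once per character so that a line
    # break becomes possible between any two characters.
    return "".join(f"{BREAK}\\PYG{{{spec}}}{{{ch}}}" for ch in text)
```

For example, ``pyg_one_by_one("k", "if")`` yields two ``\PYG{k}{...}`` calls, each preceded by a discretionary break, which is also why a colored background shows up as a chain of small boxes.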
\catcode`Z 11%
%
\g@addto@macro\FV@SetupFont{%
\sbox\sphinxcontinuationbox {\spx@opt@verbatimcontinued}%
\sbox\sphinxvisiblespacebox {\spx@opt@verbatimvisiblespace}%