Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Line breaks for inline formulas #1287

Merged
merged 25 commits into from
May 10, 2018
Merged

Conversation

edemaine
Copy link
Member

@edemaine edemaine commented May 4, 2018

Allow for line breaks in inline formulas according to TeX's rules. Display formulas are not affected. Part of #327, specifically the plan suggested in #327 (comment)

The TeXBook [p.173] says "A formula will be broken only after a relation symbol like $=$ or $<$ or $\rightarrow$, or after a binary operation symbol like $+$ or $-$ or $\times$, where the relation or binary operation is on the ``outer level'' of the formula (i.e., not enclosed in {...} and not part of an \over construction)."

In inline math, we break the outer katex-html into multiple base spans, splitting wherever a line break can be (in particular, right after top-level mop or mbin). Each base span gets its own strut, which handles vertical spacing between lines properly. This was easier than I thought it'd be!

In addition, I added support for \allowbreak and \nobreak which modify this behavior. By also marking them as (zero-width) spaces, I believe they don't affect anything else about layout.

Examples from the test server with display=0 set:

image

image

image

image

image

image

@codecov
Copy link

codecov bot commented May 4, 2018

Codecov Report

Merging #1287 into master will increase coverage by 0.06%.
The diff coverage is 94.59%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1287      +/-   ##
==========================================
+ Coverage    83.8%   83.87%   +0.06%     
==========================================
  Files          61       61              
  Lines        3990     4018      +28     
  Branches      666      671       +5     
==========================================
+ Hits         3344     3370      +26     
- Misses        547      549       +2     
  Partials       99       99
Impacted Files Coverage Δ
src/buildCommon.js 89.36% <100%> (ø) ⬆️
src/macros.js 85% <100%> (+0.18%) ⬆️
src/symbols.js 100% <100%> (ø) ⬆️
src/domTree.js 62.11% <75%> (+0.23%) ⬆️
src/buildHTML.js 96.62% <96.29%> (-0.15%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34e6458...60f8a8d. Read the comment docs.

Copy link
Contributor

@sophiebits sophiebits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super exciting.

@@ -32,13 +36,6 @@
// mode, while still allowing background/foreground setting on root .katex
* { -ms-high-contrast-adjust: none !important; }

.katex-html {
// Making .katex inline-block allows line breaks before and after,
// which is undesireable ("to $x$,"). Instead, adjust the .katex-html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you able to verify that in to $x$, it's impossible to have a break between the x and comma? In a narrow container it should break before x instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in fact this is #1233. I've left this comment as is, though it clearly is incorrect, and I can edit it. This PR has no effect on this issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. I swear this used to work.

@ronkok
Copy link
Collaborator

ronkok commented May 4, 2018

Very, very cool stuff. I do have some questions.

Will \\ be available as a mandatory line break?

In LaTeX, there are some nuances about where line breaks can occur. Some examples:

Item Line break allowed
skips yes
kerns no
\, or ~ no
Inside boxes, including \hbox, \raisebox, or \colorbox no

Do you have any plan in mind for incorporating these nuances? Is it a matter of going through the existing code and setting an allowbreak flag or nobreak flag?

@ronkok
Copy link
Collaborator

ronkok commented May 4, 2018

We'll have to develop some more sophisticated code for spaces. Directly after a line break, an \hspace should be eaten but an \hspace* should render its space. Our code will need some work in order to make that happen.

@edemaine
Copy link
Member Author

edemaine commented May 5, 2018

@ronkok Thanks for the additional rules! I'm not convinced we can get spaces to get eaten by line breaks, using just CSS... I'll have to think about that.

But the other rules you mention should be easy; we'd just need to give hskip's a CSS class of allowbreak. (Everything else is already forbidden.)

\\ should also be doable, but probably not without more work. Maybe a separate PR.

@edemaine
Copy link
Member Author

edemaine commented May 5, 2018

@ronkok Actually, I don't think those rules apply here. We're dealing with line breaks within math mode. I just did some tests, and none of \ , \quad, \hspace, \hskip enable line breaks within math mode. For example:

$this\ is\ some\ long\ text\quad
 this\ is\ some\ long\ text\quad
 this\ is\ some\ long\ text\quad
 ...
$

However, if I add + before each \quad, I do observe that that space gets eaten when there's a line break just before it.

@ronkok
Copy link
Collaborator

ronkok commented May 5, 2018

@edemaine You're right. I also did some testing and I can confirm that line breaks are not prevented in math mode by \,, \ , \space, or by kerns. That's terrific. It makes your work even better.

~ does prevent a line break in math mode. So does \nobreakspace.

Also, boxes, e.g. \raisebox prevent any line breaks. Inside a box, even \\ will fail to produce a line break. With boxes, can we do something with CSS? Something like a nobreak !important rule?

@edemaine
Copy link
Member Author

edemaine commented May 5, 2018

@ronkok I hope you mean "line breaks are forbidden at whitespace". The example I gave above does not get line breaks in LaTeX (or in this PR). Line breaks only seem to be allowed at \allowbreak and after bins and ops. In (inline) math, ~ and \ are treated identically, so far as I can see.

This PR doesn't allow line breaks in any nested objects, which includes boxes. It only looks for breakable things at the top level. Same as TeX. \\ could similarly be dealt with at the top level only, presumably via a <br> tag.

@ronkok
Copy link
Collaborator

ronkok commented May 5, 2018

That's odd. I get no line breaks when I try:

\(\aaaaaaaaaaaaaaaaaa+~aaaaaaaaaaaaaaaaaaaaaa+~aaaaaaaaaaaaaaaaaaaa+~aaaaaaaaaaaaaaaaaaaaaaa+~aaaaaaaaaaaaaa \)

It seems that the ~ directly after a bin prevents the line break that would otherwise occur.

I'm glad to hear about the boxes.

@edemaine
Copy link
Member Author

edemaine commented May 5, 2018

Oh, I see our/my confusion. I meant that spaces/glue by themselves do not permit line breaks. You're saying that a nonbreaking space immediately after a valid line break place prevents that line break. Got it! That should be easy to fix.

@ronkok
Copy link
Collaborator

ronkok commented May 5, 2018

@edemaine In your most recent commit, you ensured that glue will stay with the operator. Could you make an exception for \hspace*? It exists to provide a space at the start of a line after a line break.

@edemaine
Copy link
Member Author

edemaine commented May 5, 2018

@ronkok Good idea. I'd like to resolve #1289 first though; it's pretty confusing.

In the latest commit, I changed ~ and \nobreakspace to also have nonbreaking behavior.

@edemaine
Copy link
Member Author

edemaine commented May 8, 2018

@ronkok I read the source code for \hspace* and was able to implement it almost exactly like LaTeX (!). Our macro system has come a long way... LaTeX uses a \rule to protect the space from the line-breaking algorithm, so I did the same.

Demo:

hspace*

hspace

I also fixed a bug that allowed line breaks where I didn't want them (needed to set white-space: nowrap at some level, namely, .base).

@ronkok
Copy link
Collaborator

ronkok commented May 8, 2018

Cool! This is really coming together well.

@edemaine
Copy link
Member Author

edemaine commented May 8, 2018

I've successfully implemented \\, at least for inline math (where LaTeX supports it), on top of this. For simplicity of review, I've split it off into a separate PR, #1298. That can be reviewed after this PR.

@edemaine
Copy link
Member Author

edemaine commented May 10, 2018

I tracked down the bulk of the screenshot differences to this CSS rule:

    .vlist-s {
        // This cell solves Safari rendering problems. It has text content, so
        // its baseline is used for the table. A very small font avoids line-box
        // issues; absolute units prevent user font-size overrides from breaking
        // rendering. Safari refuses to make the box zero-width, so we give it
        // a known width and compensate with negative right margin on the
        // inline-table.
        width: 2px;

The placement of width: min-content on the .base element instead of the .katex-html element ends up removing this space for some reason (in inspect mode, the .vlist-s element has a width of 0 instead of 2). Adding a rule of min-width: 2px fixes the problem.

Now all screenshot diffs (against master) appear entirely black, meaning the differences are all subpixel, as expected, except for ArrayMode and DisplayMode which still has a global horizontal shift as above. I'll keep investigating that one.

@edemaine
Copy link
Member Author

What luck! The horizontal offset in display math actually seems to be an improvement over past behavior.

Here's ArrayMode in master, with Chrome inspecting the .base element:

image

And here's the same test on this PR:

image

The new version has the correct bounding box on the .base element, resulting in correct centering. The old centering was shifted too far to the right. This is also pretty evident if you look at the screenshot images (e.g. in diff mode): old screenshot vs. new screenshot.

So I'm happy with all the screenshots now. I'll clean up the code according to the comments next.

@edemaine
Copy link
Member Author

@kevinbarabash I implemented all your comments on the code. This should be good to go now, except for probably we should add some multiline line-breaking tests.

@kevinbarabash
Copy link
Member

@edemaine thanks. This looks great. I'm so happy you were able to sort out the horizontal spacing differences. I'll do another pass this evening.

@edemaine
Copy link
Member Author

Added a screenshot test, which should test everything, including \nobreak, \allowbreak, and \hspace vs. \hspace*. While testing, I realized that I broke \nobreak when spaces became .mspace. Should be all set now.

Copy link
Member

@kevinbarabash kevinbarabash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. So excited to have this in the project!

@@ -205,7 +205,7 @@ export const getTypeOfDomTree = function(node, side = "right") {
// 'mtight' indicates that the node is script or scriptscript style.
export const isLeftTight = function(node) {
node = getOutermostNode(node, "left");
return utils.contains(node.classes, "mtight");
return node.hasClass("mtight");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay!

* Combine an array of HTML DOM nodes (e.g., the output of `buildExpression`)
* into an unbreakable HTML node of class .base, with proper struts to
* guarantee correct vertical extent. `buildHTML` calls this repeatedly to
* make up the entire expression as a sequence of unbreakable units.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice comment.

@@ -654,10 +654,60 @@ export default function buildHTML(tree, options) {
// normal place) so we use an absolute value for vertical-align instead
bottomStrut.style.verticalAlign = -body.depth + "em";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that you could pass verticalAlign a length. It makes me wonder why the topStrut is even necessary since the bottomStrut's height is already body.height + body.depth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good question! I should experiment with removing the topStrut.

export default function buildHTML(tree, options) {
// buildExpression is destructive, so we need to make a clone
// of the incoming tree so that it isn't accidentally changed
tree = JSON.parse(JSON.stringify(tree));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize this. I wonder how we're mutating tree. 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious too! I haven't seen any, but this comment has been in the code for a long time.

i++;
parts.push(expression[i]);
if (expression[i].hasClass("nobreak")) {
nobreak = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot cleaner.

@kevinbarabash
Copy link
Member

Is anyone able able to view side-by-side images in the files tab on GitHub? All I see is Binary file not shown.. It's not just this diff either. :(

@kevinbarabash
Copy link
Member

I had a look at the screenshots. They look good.

@kevinbarabash kevinbarabash merged commit 523df29 into KaTeX:master May 10, 2018
@edemaine
Copy link
Member Author

@kevinbarabash Check out #1267 (comment) . I was playing with the various "rich diff" options and they're pretty cool! You can even overlay the two imaged in various ways (though not as good as our red/green diffs).

Thanks for the review! I excited to use this feature.

edemaine added a commit to edemaine/KaTeX that referenced this pull request May 11, 2018
As suggested in KaTeX#1287 (comment)
the two struts were redundant; the formerly "bottom" strut suffices to
implement both height and depth constraints.
edemaine added a commit that referenced this pull request May 11, 2018
* One strut instead of two

As suggested in #1287 (comment)
the two struts were redundant; the formerly "bottom" strut suffices to
implement both height and depth constraints.

* Update screenshots
@ylemkimon ylemkimon mentioned this pull request Aug 18, 2018
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants