Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible compatibility problem in 14.3.5 Bidirectional text #5594

Open
kojiishi opened this issue Jun 1, 2020 · 18 comments
Open

Possible compatibility problem in 14.3.5 Bidirectional text #5594

kojiishi opened this issue Jun 1, 2020 · 18 comments
Labels
compat Standard is not web compatible or proprietary feature needs standardizing i18n-alreq Notifies Arabic script experts of relevant issues i18n-hlreq Notifies Hebrew script experts of relevant issues i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. interop Implementations are not interoperable with each other topic: rendering

Comments

@kojiishi
Copy link

kojiishi commented Jun 1, 2020

One of Blink contributors started looking into Issue 296863: Make unicode-bidi:isolate the default for block elements instead of unicode-bidi:embed. This is about implementing 14.3.5 Bidrectional text.

Note, Issue 391260: Make unicode-bidi:isolate the default for an element with a dir attribute (instead of unicode-bidi:embed) has been implemented and is shipping for a while. This is about applying unicode-bidi: isolate to all block elements.

By reviewing the test failures from the early patch, I noticed a possible problem in fast/css/textCapitalizeEdgeCases.html:

<div style="text-transform: capitalize">a w<div style="display: inline">or</div>d</div>

With this change, <div style="display: inline"> will have unicode-bidi: isolate, and therefore it prevents kerning between "w" and "o".

Our date picker for Arabic is also broken by this change (please see fast/forms/date/date-appearance-l10n.html for example. Note, I'm not sure if this is a break or a fix, only noticed a change.) This is our UA shadow DOM, so we can fix it, but this indicates possible breaks for existing bidi contents.

I guess these side effects are rather minor, but IIUC the benefit is also not as large as applying unicode-bidi: isolate to elements with dir attributes. With these side effects in mind, can I confirm if all browsers want to apply this change? If someone filed a bug, is it correct to say <div style="display: inline"> should render differently from <span> and that authors should fix their content, even when it has nothing to do with bidi?

Sorry if these side effects were known and/or discussed before.

/cc @r12a @upsuper @jfkthame @litherum @lilles

@upsuper
Copy link
Member

upsuper commented Jun 1, 2020

I think Gecko has shipped this since 3yrs ago, and before that, we've being using -moz-isolate for another 5yrs, but that may not fully match the behavior of isolate.

I don't think I've seen any bug report related to that since then, so I assume that it is web compatible in general.

@kojiishi
Copy link
Author

kojiishi commented Jun 1, 2020

Test: https://jsbin.com/faciloy/edit?html,output

Firefox does not apply kerning at the <div style="display: inline"> boundary, but it does apply at the <span> boundary. Is this behavior desired?

@kojiishi
Copy link
Author

kojiishi commented Jun 1, 2020

I guess this goes to a question that, should div with display: inline imply a separated block context from adjacent characters, and its shaping/layout should not use adjacent characters as context?

As @upsuper, this is probably rare that I guess this is more about desired semantics than compat?

@jfkthame
Copy link

jfkthame commented Jun 1, 2020

I agree it's somewhat unexpected that <div style="display: inline"> behaves differently than <span> here, but it follows from the fact that 14.3.5 lists specific HTML tags to which isolate is applied by default, rather than doing something like making the unicode-bidi value dependent on the display property.

This means that if an author wants to make one of the HTML elements listed there behave just like an inline <span> in terms of merging into the surrounding text run, they'll need to explicitly reset its unicode-bidi property; it's not sufficient to just change display.

So I believe the Firefox behavior is the expected result of the HTML spec. From an authoring point of view, it'd arguably be better if the application of unicode-bidi: isolate were instead based on the computed value of display (rather than simply on the HTML tag names), but I don't think this edge-case issue justifies the extra spec and implementation complexity that would probably involve.

@upsuper
Copy link
Member

upsuper commented Jun 1, 2020

As far as unicode-bidi: isolate consistently breaks up text run (that <div style="display: inline"> behaves the same way as <span style="unicode-bidi: isolate">), that sounds acceptable.

Whether kerning should be applied on text across unicode-bidi: isolate boundary value, I'm not exactly sure. It seems to me that Gecko always breaks text run when the bidi embedding level changes, which I think makes sense, because text with different bidi embedding level can potentially get reordered, and indeed unicode-bidi: isolate changes the level.

Maybe ideally we should always apply kerning post bidi reorder, but that can be problematic as kerning affects layout...

@jfkthame
Copy link

jfkthame commented Jun 1, 2020

ISTM that unicode-bidi: isolate is expressing a clear boundary, such that the content within the isolate element is independent of its context. So I wouldn't expect text shaping effects (such as kerning, but also contextual forms, ligatures, etc) that apply to runs of characters to get applied across that boundary.

@domenic domenic added i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. compat Standard is not web compatible or proprietary feature needs standardizing interop Implementations are not interoperable with each other topic: rendering labels Jun 1, 2020
@noamr
Copy link
Contributor

noamr commented Jun 1, 2020

webkit bug: https://bugs.webkit.org/show_bug.cgi?id=65617

I'm currently contributing this on behalf of Wikimedia, both to webkit and to chromium, trying to achieve better alignment between the different browsers around bidi... Note that the current situation with unicode-bidi: isolate default is as such:

  • With dir attribute: supported in Firefox & Chrome, not in Safari
  • For display: block-by-default elements: supported in Firefox, not in Safari and Chrome.

It's best if the browser engines reach interoperability here, and I think how the spec defines it is a reasonable approach - divs and pres and their other block-ish siblings break bidi chains, and unicode-bidi is the way to change that rather than display.

@xfq xfq added i18n-alreq Notifies Arabic script experts of relevant issues i18n-hlreq Notifies Hebrew script experts of relevant issues labels Jun 2, 2020
@ebraminio
Copy link

ebraminio commented Jun 2, 2020

but IIUC the benefit is also not as large as applying unicode-bidi: isolate to elements with dir attributes

Yes, it is beneficial for users who deal with bidi, see end of https://commons.wikimedia.org/

image

I had to wrap content of every li element with bdi in order to prevent the bidi issue, something I liked to push Chromium for so I can drop the hack. https://chromium-review.googlesource.com/c/chromium/src/+/1081302 in Wikipedia there is some kind of list called hlist which immediately can benefit from this, yes, we can add such CSS ourselves, perhaps in every MediaWiki instance or MediaWiki itself, yet I wonder why not implement some good and beneficial standard that Mozilla has implemented for years.

@r12a
Copy link

r12a commented Jun 3, 2020

Iirc, there is a similar effect on certain aspects of cursive joining where block elements have been made inline (which cannot be modified by use of CSS). I seem to vaguely remember that preservation of the separation around a block element that has been made inline was intended as a feature, since block elements normally contain separated items. So i suspect that the case where kerning fails arises from some questionable approach to markup rather than from a fault in the way the browsers currently handle things, and i wonder whether anyone would actually want that kerning to happen outside of the test that was mentioned(?) On the other hand, @ebraminio appears to be giving us good reasons for maintaining isolation for these block items when concatenating them inline. This is a common scenario for lists that mix languages, because you don't want to introduce interference effects as text direction changes.

@kojiishi
Copy link
Author

kojiishi commented Jun 7, 2020

Thank you all for the feedback. IIUC:

  • Mozilla is supportive to make these elements to isolate font shaping.
  • Mozilla is shipping the behavior for 3 years.
  • WebKit doesn't support ligature across elements yet, so it's isolated anyway.
  • @noramr is interested in contributing to both Blink and WebKit.
  • @ebraminio and @r12a confirmed this is useful for bidi.

I don't have strong opinions either way, but I'd like us to be interoperable, and it looks like we will be.

crbug.com/296863 says:

When a block element is set to display:inline, it is desirable for it to behave bidi-wise as it would if it were display:inline-block.

but now we're leaning towards to say "not only for bidi-wise, but also for font shaping purposes, these elements are considered as isolated from the surrounding context".

Do we want to clarify this somewhere in the spec, or do we have such statements already?

@kojiishi
Copy link
Author

@domenic Could you advise if this should need a change to the spec? If we agree, no changes to bidi section, but the behavior for non-bidi was not understood well before. I'm wondering whether the effect on shaping including non-bidi ligature/kerning should be stated somewhere or not.

@litherum Are you happy with this?

@jfkthame
Copy link

FWIW, the CSS Text spec makes this explicit:

Text shaping must be broken at inline box boundaries when any of the following are true [...]

  • The boundary is a bidi isolation boundary.

@ebraminio
Copy link

ebraminio commented Jun 17, 2020

As a supporting case for shaping break on such elements, Mozilla once was doing cross element shaping for MediaWiki and the result was something like this

image

(I just made this up by adding zwj as I can't reproduce this by current browsers and am just remembering this was like this once)

And the intended was,

image

(and just to mention the tabs are inlined <li>, done for SEO or whatever reason in MediaWiki)

MediaWiki of course could insert arounding space or anything to mitigate this, like what I remember now did for MediaWiki on a very similar case for a <li> element, https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/LiquidThreads/+/91828/ however while we have this nice idea in the spec and implemented in Mozilla for years guess we can remove the need for such patches everywhere.

@noamr
Copy link
Contributor

noamr commented Jun 17, 2020

but IIUC the benefit is also not as large as applying unicode-bidi: isolate to elements with dir attributes

Yes, it is beneficial for users who deal with bidi, see end of https://commons.wikimedia.org/

image

I had to wrap content of every li element with bdi in order to prevent the bidi issue, something I liked to push Chromium for so I can drop the hack.

On a side note, trying to understand why not add the "correct" user-agent stylesheet in your reset as a workaround?

In this case something like li { unicode-bidi: isolate }

(I think that this should be in the user-agent stylesheet, but the reset-css workaround also seems simple)

@ebraminio
Copy link

ebraminio commented Jun 17, 2020

On a side note, trying to understand why not add the "correct" user-agent stylesheet in your reset as a workaround?

Sure! We can add that as a template css, or MediaWiki:Common.css or in MediaWiki itself (as the previous case shows it is beneficial for several other places in MediaWiki also), the apply of this specific workaround is for years ago maybe before knowing it was going to happen in UA stylesheet.

@noamr
Copy link
Contributor

noamr commented Jun 17, 2020

On a side note, trying to understand why not add the "correct" user-agent stylesheet in your reset as a workaround?

Sure! We can add that as a template css, or MediaWiki:Common.css or in MediaWiki itself (as the previous case shows it is beneficial for several other places in MediaWiki also), the apply of this specific workaround is for years ago maybe before knowing it was going to happen in UA stylesheet.

That stylesheet is in the spec itself! https://html.spec.whatwg.org/multipage/rendering.html#bidi-rendering

I suggest to put that entire clause or the parts that matter in one of your common stylesheets (and in the meantime we'll push for browsers to be interoperable there)

@ebraminio
Copy link

ebraminio commented Jun 17, 2020

That stylesheet is in the spec itself! https://html.spec.whatwg.org/multipage/rendering.html#bidi-rendering

That's why I am pushing for this.

I suggest to put that entire clause or the parts that matter in one of your common stylesheets (and in the meantime we'll push for browsers to be interoperable there)

I am ok with putting it in MediaWiki base CSS of course, I remember there were a time (6 7 years ago maybe) that no one specific CSS source of MediaWiki was included in every theme and that's changed now I guess. In the meanwhile using patches we've done here and there we are good now in MediaWiki most likely, yet a browser change will be nice I believe.

@domenic
Copy link
Member

domenic commented Jun 17, 2020

Could you advise if this should need a change to the spec? If we agree, no changes to bidi section, but the behavior for non-bidi was not understood well before. I'm wondering whether the

I'd like to defer this decision to you all. If it would be helpful for you as implementers and domain experts, then it should be included, either here or in CSS (maybe it is already in CSS?). And if it belongs here, then I'm happy to provide editorial review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing i18n-alreq Notifies Arabic script experts of relevant issues i18n-hlreq Notifies Hebrew script experts of relevant issues i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. interop Implementations are not interoperable with each other topic: rendering
Development

No branches or pull requests

8 participants