Incorrect glyph order with Bengali #84

Open
floppyhammer opened this issue Jan 10, 2023 · 4 comments

@floppyhammer

Test text:
ওহে বিশ্ব!

Glyph indexes (allsorts):
1497 1530 1539 3 1521 1533 1527 1543 1521 4

[image: allsorts rendering]

Glyph indexes (rustybuzz):
1497 1539 1530 3 1533 1521 1527 1543 1521 4

[image: rustybuzz rendering]

The allsorts output is slightly off: two pairs of glyphs are swapped relative to rustybuzz.

Used config:

let script = tag::BENG;
let dir = TextDirection::LeftToRight;
let lang = Some(tag::BENG);
@mikeday
Contributor

mikeday commented Jan 10, 2023

Which font?

@floppyhammer
Author

floppyhammer commented Jan 10, 2023

Arial Unicode MS Font.ttf

I have also tested with NotoSansBengali-Regular.ttf, and the result is correct. It seems the issue is font-related.

@wezm
Contributor

wezm commented Jan 11, 2023

It looks like the font doesn't have GSUB rules for Bengali. Rustybuzz must be picking, or falling back to, a different script.

allsorts layout-features Arial\ Unicode\ MS\ Font.ttf    
Table: GSUB
  Script: arab
    Language: default
      Feature: isol
        Lookups: 0
      Feature: init
        Lookups: 1
      Feature: medi
        Lookups: 2
      Feature: fina
        Lookups: 3
      Feature: liga
        Lookups: 4,5,6
    Language: FAR 
      Feature: isol
        Lookups: 0
      Feature: init
        Lookups: 1
      Feature: medi
        Lookups: 2
      Feature: fina
        Lookups: 3
      Feature: liga
        Lookups: 4,5,6
      Feature: isol
        Lookups: 7
      Feature: fina
        Lookups: 8
      Feature: locl
        Lookups: 9
    Language: URD 
      Feature: isol
        Lookups: 0
      Feature: init
        Lookups: 1
      Feature: medi
        Lookups: 2
      Feature: fina
        Lookups: 3
      Feature: liga
        Lookups: 4,5,6
      Feature: isol
        Lookups: 10
      Feature: init
        Lookups: 11
      Feature: medi
        Lookups: 12
      Feature: fina
        Lookups: 13
      Feature: locl
        Lookups: 14
  Script: deva
    Language: default
      Feature: nukt
        Lookups: 15
      Feature: akhn
        Lookups: 16
      Feature: rphf
        Lookups: 17
      Feature: blwf
        Lookups: 18
      Feature: half
        Lookups: 19
      Feature: vatu
        Lookups: 20,21
      Feature: pres
        Lookups: 22,23,24,26
      Feature: abvs
        Lookups: 29,30,31,32
      Feature: blws
        Lookups: 39,40,42
      Feature: psts
        Lookups: 44
      Feature: haln
        Lookups: 46
  Script: gujr
    Language: default
      Feature: nukt
        Lookups: 59
      Feature: akhn
        Lookups: 60
      Feature: rphf
        Lookups: 61
      Feature: blwf
        Lookups: 62
      Feature: half
        Lookups: 63
      Feature: vatu
        Lookups: 64,65
      Feature: pres
        Lookups: 66,67,68,70,72
      Feature: abvs
        Lookups: 74,75,76,81
      Feature: blws
        Lookups: 84
      Feature: psts
        Lookups: 85
      Feature: haln
        Lookups: 86
  Script: guru
    Language: default
      Feature: nukt
        Lookups: 47
      Feature: blwf
        Lookups: 48
      Feature: half
        Lookups: 49
      Feature: pstf
        Lookups: 50
      Feature: blws
        Lookups: 51,55
      Feature: abvs
        Lookups: 56,57
  Script: hani
    Language: default
      Feature: salt
        Lookups: 108
      Feature: trad
        Lookups: 109
      Feature: smpl
        Lookups: 110
      Feature: vert
        Lookups: 111
    Language: JAN 
      Feature: vert
        Lookups: 111
    Language: KOR 
      Feature: locl
        Lookups: 108
      Feature: vert
        Lookups: 111
    Language: ZHS 
      Feature: locl
        Lookups: 110
      Feature: vert
        Lookups: 111
    Language: ZHT 
      Feature: locl
        Lookups: 109
      Feature: vert
        Lookups: 111
  Script: kana
    Language: default
      Feature: vert
        Lookups: 111
    Language: JAN 
      Feature: vert
        Lookups: 111
  Script: knda
    Language: default
      Feature: akhn
        Lookups: 94
      Feature: rphf
        Lookups: 95
      Feature: blwf
        Lookups: 96
      Feature: half
        Lookups: 97
      Feature: blws
        Lookups: 98
      Feature: abvs
        Lookups: 99
      Feature: psts
        Lookups: 102,104,105,106
      Feature: haln
        Lookups: 97
  Script: taml
    Language: default
      Feature: akhn
        Lookups: 87
      Feature: half
        Lookups: 88
      Feature: abvs
        Lookups: 89,90
      Feature: psts
        Lookups: 91,92
      Feature: haln
        Lookups: 88
Table: GPOS
  Script: arab
    Language: default
      Feature: mark
        Lookups: 21,22,23,24
  Script: deva
    Language: default
      Feature: abvm
        Lookups: 0
      Feature: blwm
        Lookups: 1
      Feature: dist
        Lookups: 2
  Script: gujr
    Language: default
      Feature: abvm
        Lookups: 8
      Feature: blwm
        Lookups: 9
      Feature: dist
        Lookups: 10
  Script: guru
    Language: default
      Feature: abvm
        Lookups: 6
      Feature: blwm
        Lookups: 7
  Script: knda
    Language: default
      Feature: dist
        Lookups: 15,17,19
  Script: taml
    Language: default

@adrianwong
Member

It appears that the differences come down to the left matras:

  • ি U+09BF BENGALI VOWEL SIGN I, and
  • ে U+09C7 BENGALI VOWEL SIGN E

being ordered incorrectly in the output.
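
To illustrate the reordering in question, here is a minimal sketch that works on characters rather than glyphs. It is not Allsorts or rustybuzz code: `is_left_matra` and `reorder_left_matras` are hypothetical names, and the logic is deliberately simplified to swap a left matra with the single character before it. A real Indic shaper moves the matra in front of the whole consonant cluster of its syllable.

```rust
/// Returns true for the two left (pre-base) matras listed above.
/// Hypothetical helper, not part of any library.
fn is_left_matra(c: char) -> bool {
    matches!(c, '\u{09BF}' | '\u{09C7}')
}

/// Simplified visual reordering: each left matra is swapped with the
/// single character that precedes it. Assumption: the preceding
/// character is the base consonant, which holds for the test string
/// in this issue but not for conjuncts in general.
fn reorder_left_matras(text: &str) -> Vec<char> {
    let mut chars: Vec<char> = text.chars().collect();
    for i in 1..chars.len() {
        if is_left_matra(chars[i]) {
            chars.swap(i - 1, i);
        }
    }
    chars
}
```

Applied to the test text from this issue, the sketch moves ে in front of হ and ি in front of ব, which is the same pair of swaps that separates the rustybuzz glyph indexes from the allsorts ones.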

I've been away from shaping code for a while now, but my suspicion is that HarfBuzz and/or rustybuzz perform some of the initial reordering required for shaping Indic text, whereas Allsorts bails early if the font's GSUB table doesn't contain the expected script:

allsorts/src/scripts/indic.rs

Lines 1106 to 1112 in 1d05ffa

let (script_tag, shaping_model, script_table) = match gsub_table.find_script(indic2_tag)? {
    Some(script_table) => (indic2_tag, ShapingModel::Indic2, script_table),
    None => match gsub_table.find_script_or_default(indic1_tag)? {
        Some(script_table) => (indic1_tag, ShapingModel::Indic1, script_table),
        None => return Ok(()),
    },
};
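
The selection logic above can be modelled in isolation to see why this font takes the early-return path. The sketch below is an assumption-laden stand-in, not the Allsorts API: `select_shaping_model` is a hypothetical function, the GSUB table is reduced to a plain list of script tags, and any default-script fallback that `find_script_or_default` may perform is ignored.

```rust
#[derive(Debug, PartialEq)]
enum ShapingModel {
    Indic1,
    Indic2,
}

/// Hypothetical stand-in for the match in indic.rs: prefer the newer
/// script tag, fall back to the older one, otherwise give up (None),
/// which corresponds to `return Ok(())` in the quoted code.
fn select_shaping_model<'a>(
    gsub_scripts: &[&str],
    indic1_tag: &'a str,
    indic2_tag: &'a str,
) -> Option<(&'a str, ShapingModel)> {
    if gsub_scripts.contains(&indic2_tag) {
        Some((indic2_tag, ShapingModel::Indic2))
    } else if gsub_scripts.contains(&indic1_tag) {
        Some((indic1_tag, ShapingModel::Indic1))
    } else {
        None
    }
}

/// GSUB script tags reported by `allsorts layout-features` for
/// Arial Unicode MS earlier in this thread.
const ARIAL_UNICODE_GSUB_SCRIPTS: [&str; 8] =
    ["arab", "deva", "gujr", "guru", "hani", "kana", "knda", "taml"];
```

With the script list from the `layout-features` output, a lookup for Bengali (`bng2`, then `beng`) finds nothing, while Devanagari resolves to `deva` with the older model. That is consistent with the glyphs coming back in their original order for Bengali text in this font.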
