[browser] HybridGlobalization allows different hashes for strings that return true for Equals #96400

Closed
@ilonatommy

We do not have a native, locale-sensitive algorithm for HybridGlobalization that would replace the ICU data. Implementing it manually in JS would slow down the hashing mechanism. For this reason, we decided to use Invariant hashing and sort key functions in HybridGlobalization mode (see: #96354).
This approach has a few downsides. One of them is that two strings, e.g. s1 = "igloo" and s2 = "İGLOO", that are equal under the "tr-TR" linguistic comparer (in both ICU and HybridGlobalization modes) can still produce different hash codes. The comparison reports them as equal:

// CompareInfo.Compare returns an int; 0 means the strings are equal.
bool linguisticComparer = new CultureInfo("tr-TR").CompareInfo.Compare(s1, s2, CompareOptions.IgnoreCase) == 0; // true

The hash codes, however, are produced differently in ICU mode and in HybridGlobalization mode:

int hashCode1 = new CultureInfo("tr-TR").CompareInfo.GetHashCode(s1, CompareOptions.IgnoreCase);
int hashCode2 = new CultureInfo("tr-TR").CompareInfo.GetHashCode(s2, CompareOptions.IgnoreCase);
bool sameHash = hashCode1 == hashCode2;
// true when ICU is loaded (Hybrid mode off), false when HybridGlobalization mode is on (reduced ICU loaded)

Strings that the equality function reports as equal should also produce equal hash codes; in HybridGlobalization mode this does not hold.
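
To illustrate why this matters, here is a minimal sketch of how the mismatch could surface in a hash-based collection. The class name HashContractDemo and the helper HashContractHolds are hypothetical and exist only for this example; the printed values depend on the globalization mode the app runs in, so no output is claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class HashContractDemo
{
    // Hypothetical helper: returns true when the Equals/GetHashCode contract
    // holds for the pair, i.e. strings that compare as equal also hash equally.
    public static bool HashContractHolds(CompareInfo compareInfo, string a, string b, CompareOptions options)
    {
        bool equal = compareInfo.Compare(a, b, options) == 0;
        bool sameHash = compareInfo.GetHashCode(a, options) == compareInfo.GetHashCode(b, options);
        return !equal || sameHash;
    }

    public static void Main()
    {
        var culture = new CultureInfo("tr-TR");
        string s1 = "igloo";
        string s2 = "İGLOO";

        // Expected to differ between full ICU and HybridGlobalization, per this issue.
        Console.WriteLine(HashContractHolds(culture.CompareInfo, s1, s2, CompareOptions.IgnoreCase));

        // A HashSet looks up the bucket by hash code before calling Equals,
        // so a hash mismatch can hide an element that Equals would match.
        var comparer = StringComparer.Create(culture, ignoreCase: true);
        var set = new HashSet<string>(comparer) { s1 };
        Console.WriteLine(set.Contains(s2));
    }
}
```

If the two hash codes differ, the lookup of s2 can miss the stored s1 even though the comparer treats them as equal, which is the practical risk behind the broken contract.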

The HashCodeLocalized test cases are excluded for HybridGlobalization (HG) and marked with this issue.

Possible fixes:

  • switching HybridGlobalization off (this will increase the application size by loading additional ICU data),
  • implementing JS-based, locale-sensitive hashing (this will be much slower than Invariant or ICU hashing).
