Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transliterator is missing compiled data constructors #4890

Open
kornelski opened this issue May 13, 2024 · 4 comments
Open

Transliterator is missing compiled data constructors #4890

kornelski opened this issue May 13, 2024 · 4 comments
Labels
blocked A dependency must be resolved before this is actionable C-transliterator Component: transliterator T-bug Type: Bad behavior, security, privacy

Comments

@kornelski
Copy link

The icu_transliterate crate has constructor methods that take:

P: DataProvider<TransliteratorRulesV1Marker> + DataProvider<CanonicalDecompositionDataV1Marker> + DataProvider<CompatibilityDecompositionSupplementV1Marker> + DataProvider<CanonicalDecompositionTablesV1Marker> + DataProvider<CompatibilityDecompositionTablesV1Marker> + DataProvider<CanonicalCompositionsV1Marker> + ?Sized,

This is a large complex-looking trait bound, which does not help me figure out what the actual type is needed here. Even when I try to browse the mentioned types like TransliteratorRulesV1Marker, it leads to more levels of abstraction, which all end at some HelloWorldProvider.

As a result, this API is so opaque and overly abstract, that I can't even tell if the transliterator is working at all, or is it a defunct stub that has no real implementations.

Some methods have shockingly large trait bounds. To me this is very discouraging, because I'm afraid that if I run into even slightest problems with this crate, I'll be faced with such wall of abstractions upon abstractions, requiring me to understand an abstraction with over 60 degrees of freedom, which seems absolutely an overkill when there's probably between 0 to 2 actual implementations of it.

@sffc
Copy link
Member

sffc commented May 13, 2024

Hi @kornelski!

Most functions with the large trait bounds have a link to this page which explains what they mean:

image

These bounds are what enable ICU4X to slice the data to exactly what is needed and avoid having a big monolithic data file as is required by ICU4C that doesn't have these bounds.

In general you should be using functions like RuleCollection::as_provider instead of RuleCollection::as_provider_unstable. Once you get things working with RuleCollection::as_provider, if you want to start building your own data, you can switch over to RuleCollection::as_provider_unstable, and the argument should be a reference to your custom data provider created as explained in this tutorial.

@sffc sffc closed this as not planned Won't fix, can't repro, duplicate, stale May 13, 2024
@sffc sffc added the question Unresolved questions; type unclear label May 13, 2024
@sffc
Copy link
Member

sffc commented May 13, 2024

Actually I'll re-open this issue specifically for Transliterator because it looks like it doesn't currently have a compiled_data constructor.

@sffc sffc reopened this May 13, 2024
@sffc sffc added C-transliterator Component: transliterator and removed question Unresolved questions; type unclear labels May 13, 2024
@sffc
Copy link
Member

sffc commented May 13, 2024

Partial fix for the Transliterator constructor in #4898

In the short term, you can get the type you need, the one that implements all the trait bounds, like this:

include!("tests/transliterate/data/baked/mod.rs");

where tests is the icu_experimental crate root which is components/experimental/tests in this repository. That file exposes a type named BakedDataProvider which is what you need.

It's not ideal that this is in a test directory, which is why I put up #4898. I think this is just an oversight from when we moved the code into the new icu_experimental crate.

@sffc sffc changed the title Incredibly large trait bounds Add compiled data constructors for Transliterator May 13, 2024
@sffc sffc changed the title Add compiled data constructors for Transliterator Transliterator is missing compiled data constructors May 13, 2024
@robertbastian
Copy link
Member

tests/transliterate/data/baked contains some hand written tests, it does not contain CLDR data.

@sffc sffc added the blocked A dependency must be resolved before this is actionable label Jul 24, 2024
@sffc sffc added this to the 2.x Priority ⟨P2⟩ milestone Jul 24, 2024
@sffc sffc added the T-bug Type: Bad behavior, security, privacy label Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked A dependency must be resolved before this is actionable C-transliterator Component: transliterator T-bug Type: Bad behavior, security, privacy
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants