Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transformer for RelativeTimeFormat component #2822

Merged
merged 9 commits into from
Nov 28, 2022

Conversation

pdogr
Copy link
Contributor

@pdogr pdogr commented Nov 8, 2022

No description provided.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work so far!

Cargo.toml Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
pub display_name: Option<Cow<'data, str>>,
/// Mapping for relative time fields.
#[cfg_attr(feature = "serde", serde(borrow))]
pub relatives: ZeroMap<'data, i8, str>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: What is the full set of number offsets? In most locales it is -1, 0, 1; there are some that go to -2, -1, 0, 1, 2. This is an okay use of ZeroMap, but if there is never going to be more than -2 to 2, we may want to consider expanding that into the struct itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The offsets are from [-2, 2].

experimental/relativetime/src/provider.rs Show resolved Hide resolved
/// Relative time patterns data struct.
#[icu_provider::data_struct(
StandardRelativeTimeV1Marker = "relativetime/standard@1",
NarrowRelativeTimeV1Marker = "relativetime/narrow@1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: I'm wondering whether we want to slice this smaller. For example, I know ECMA-402 delays the unit choice to the format function, but should we be making it an option in the constructor instead? And how about the direction (past/future)? We're going to get some pretty big data keys out of this, which begs the question about whether we should be slicing more granularly.

It's OK to make constructors that load multiple data keys for convenience, of course.

One possible axis that is already in ECMA-402 is "numeric". If the user requests numeric: "always", then they shouldn't need to load the non-numeric data. So, we could maybe move that data into a separate data key.

Copy link
Contributor Author

@pdogr pdogr Nov 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data keys are pretty big currently with most of the fields being 'None'.

I'm in favour of splitting it further into at least | "unit" x "width" | data keys. This would remove the need for having an array of "RelativeTimePattern".

Then we can focus on an efficient representation of "PluralRulesCategory" => "Singular substitution pattern" mapping.
Using a "ZeroMap<PluralRulesCategory, SubstitutionPattern>" would be good as we don't store the unnecessary mappings but I haven't been able to make it work.

Copy link
Member

@sffc sffc Nov 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to do unit x width (8*3). What do you think about unit x width x numeric (8*3*2)?

Yes, let's brainstorm the best representation for PluralCategory => Pattern.

There are some advantages to expanding to separate fields instead of using ZeroMap. Note that None fields do not take up much space in Postcard.

@pdogr pdogr marked this pull request as ready for review November 10, 2022 15:14
@pdogr pdogr changed the title WIP for RelativeTimeFormat component Transformer for RelativeTimeFormat component Nov 10, 2022
@sffc sffc self-requested a review November 10, 2022 19:32
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
provider/datagen/src/transform/cldr/source.rs Outdated Show resolved Hide resolved
@Manishearth
Copy link
Member

Overall looks pretty great!

experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
experimental/relativetime/src/provider.rs Outdated Show resolved Hide resolved
- Remove "{0}" from substitution pattern string
- Clean code in cldr_serde/date_fields.rs
- Improve docs for relativetime/src/provider.rs
sffc
sffc previously approved these changes Nov 14, 2022
@pdogr
Copy link
Contributor Author

pdogr commented Nov 27, 2022

Updated the postcard data after merging.

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM

Comment on lines +991 to +993
relativetime/narrow/hour@1, en, 50B, 2a119b1d4dfa007dbc4d2e4d79cf39bd23b39167399b5878fea8cd6e7269fb5b
relativetime/narrow/hour@1, en-001, 48B, f15555b8ed2135cb792b46db2aa2b769423fc763fc8bb348ce58a7eff542ee43
relativetime/narrow/hour@1, en-ZA, 48B, f15555b8ed2135cb792b46db2aa2b769423fc763fc8bb348ce58a7eff542ee43
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: Note how for narrow hour, en differs from en-001. Since these are small keys, only the difference for narrow hour needs to be stored, and we can keep deduplicating the rest of the time scales that are identical. With a bigger key, a single change like this would "spoil" the deduplication from working most optimally.

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also still lgtm

@Manishearth Manishearth merged commit d2b9a10 into unicode-org:main Nov 28, 2022
@Manishearth
Copy link
Member

This is great, thanks!

@pdogr pdogr deleted the relativetime branch November 28, 2022 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants