Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PluralRules #119

Merged
merged 6 commits into from
Aug 12, 2020
Merged

Add PluralRules #119

merged 6 commits into from
Aug 12, 2020

Conversation

zbraniecki
Copy link
Member

@zbraniecki zbraniecki commented Jun 10, 2020

Fixes #116.

This is the initial WIP for the plural rules component.

I'm going to include the rules parser/ast/resolver for runtime resolution based on input string, but we could also compile AST into JSON if we wanted and then data provider could load an already parsed AST.
Finally, it should be possible to compose Rust functions out of the AST and store them as one of the optional data providers.

I didn't use Options constructor model because it seems like this struct will have really only two variables - locale and type.
Depending on how we'll hook in the DataProvider we may want to make the constructor fallible because it may not have locale/type for a given request.

I'm also not sure if I should keep the IntoOperands external to select or internal. In my original code I had it internal: https://github.com/zbraniecki/pluralrules/blob/master/intl_pluralrules/src/lib.rs#L140

but that means that select is fallible, which is not needed for usize etc. where the IntoOperands is not fallible.

Alternatively, we could have infallible select which accepts Into<Operands> and fallible try_select which accepts TryInto<Operands>, or even infallible select only, and then if someone wants to pass something fallible, they'll have to try_into explicitly before passing the result and handle the errors.

@sffc, @filmil - can you skim through and let me know if it looks like a good start?

@zbraniecki zbraniecki force-pushed the pluralrules branch 2 times, most recently from a30e352 to af8f0b1 Compare June 10, 2020 01:27
@coveralls
Copy link

coveralls commented Jun 10, 2020

Pull Request Test Coverage Report for Build 0fd58a5041a6d641108f81bbae60cb40c0680c8f-PR-119

  • 249 of 300 (83.0%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-1.5%) to 89.272%

Changes Missing Coverage Covered Lines Changed/Added Lines %
components/pluralrules/src/lib.rs 3 7 42.86%
components/pluralrules/src/rules/lexer.rs 79 88 89.77%
components/pluralrules/src/operands.rs 42 54 77.78%
components/pluralrules/src/rules/ast.rs 9 35 25.71%
Totals Coverage Status
Change from base Build 9a296e8accc5ca174c2cb0432beec34491c266ff: -1.5%
Covered Lines: 1398
Relevant Lines: 1566

💛 - Coveralls

@filmil filmil self-requested a review June 10, 2020 02:27
@filmil
Copy link
Contributor

filmil commented Jun 10, 2020

I'll get to this tomorrow.

@zbraniecki
Copy link
Member Author

The other thing I'm planning to do it to make Operands lazy. Many rules don't use most operand fields, so by making them lazy I think I can get better perf than I have in https://github.com/zbraniecki/pluralrules

@filmil
Copy link
Contributor

filmil commented Jun 10, 2020

Looking at this now.

filmil
filmil previously approved these changes Jun 10, 2020
Copy link
Contributor

@filmil filmil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have substantial remarks. All nits inside, or discussion items.

components/pluralrules/Cargo.toml Outdated Show resolved Hide resolved
components/pluralrules/src/data.rs Outdated Show resolved Hide resolved
components/pluralrules/src/lib.rs Show resolved Hide resolved
components/pluralrules/src/lib.rs Outdated Show resolved Hide resolved
components/pluralrules/src/operands.rs Show resolved Hide resolved
components/pluralrules/src/rules/ast.rs Outdated Show resolved Hide resolved
components/pluralrules/src/rules/resolver.rs Show resolved Hide resolved
impl TryFrom<$ty> for PluralOperands {
type Error = &'static str;
fn try_from(input: $ty) -> Result<Self, Self::Error> {
// XXXManishearth converting from i64 to isize may wrap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about keeping the more precise value, and down-converting at output time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "output" time here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is, if the params kept the value of x in the widest type that would contain any of the values, we could avoid the need to make the wrapping decision here. However, since the fields n and i are public, we have to make that deliberation here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is x?
PluralOperands are only used in plural rules resolution. I struggle to see a case where we'd benefit from lazy resolving them later, because I don't see a use case where we'd construct a PluralOperands but not want to get it's i or n.
And if we do, we'll want it for many rules as we test against them. So having them eagerly resolved seems like a cleaned API which caves to nice further optimizations.

For example, I can imagine scenarios where an instance of PluralOperands gets stored and passed around for use in PluralRules, and proc_macros will allow us to construct the operands at build time and have them validated and calculated at runtime for free.

@filmil
Copy link
Contributor

filmil commented Jun 10, 2020

@zbraniecki first review done.

@zbraniecki
Copy link
Member Author

zbraniecki commented Jun 12, 2020

I want to also point out, especially for @sffc and @filmil that I am intentionally moving away from having minFractionDigits & friends on the PluralRules constructor.

Instead, I offer them on PluralOperands constructor, which is the input for the select.

This allows for the following use:

let pr = PluralRules::new(loc!("en"), PluralCategory::Ordinal);
pr.select(5); // used `Into<PluralOperands> for usize` - infallible

let operandsOptions = PluralOperandsOptions {
  min_fraction_digits: 3,
  ..Default::default()
};
let op = PluralOperands::new_with_options(5, operandsOptions);

pr.select(op); // now I passed Operands resolved with `minFractionDigits` 3

I think it's an improvement to have digits rounding and trimming when constructing Operands from input, because we're producing those Operands either implicitly for infallible, or explicitly for fallible or for cases where we want to go for non-default options.

How does it sound?

@zbraniecki zbraniecki force-pushed the pluralrules branch 3 times, most recently from b1fa8b6 to c8a1470 Compare June 16, 2020 18:02
@zbraniecki zbraniecki force-pushed the pluralrules branch 3 times, most recently from d9af87f to d7a0e40 Compare June 18, 2020 18:14
@zbraniecki zbraniecki force-pushed the pluralrules branch 2 times, most recently from d2fb9f4 to 7eded50 Compare June 25, 2020 06:52
@zbraniecki
Copy link
Member Author

Instead, I offer them on PluralOperands constructor, which is the input for the select.

After discussion with @sffc, I'm moving this to be based on FixedDecimal introduced in #141.
This seems like an even better solution for cases where the user wants to specify integer, significant and fraction digits via options.

@zbraniecki
Copy link
Member Author

@sffc - I think this is mostly ready. It works, and it implements 4 "data sources" - json, bincode, inline source and rust functions.

I'd like to plug this into DataProvider next, at least the json/bincode pathways, and leave the inline source/rs options for later likely.

Can you onramp me on how to write the binding based on my data/io/*?

@filmil
Copy link
Contributor

filmil commented Jul 27, 2020

@zbraniecki done as requested

Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly looked at overall code architecture, not the specific lexer and selection algorithms. If you want me to look at the algorithms, I will do that.

I expect the data code to be imminently replaced, so I was not too picky about that.

components/pluralrules/benches/fixtures/mod.rs Outdated Show resolved Hide resolved
}
}

pub fn get_resource(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this is code I hope can be deleted once #182 is landed and wired up. Loading strings into a static JSON is hard to do correctly and should be done in a centralized place, not deeply embedded inside a component.

components/pluralrules/src/operands.rs Show resolved Hide resolved
components/pluralrules/src/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/src/rules/mod.rs Show resolved Hide resolved
components/pluralrules/src/lib.rs Show resolved Hide resolved
components/pluralrules/src/lib.rs Show resolved Hide resolved
components/pluralrules/tests/fixtures/operands.json Outdated Show resolved Hide resolved
components/pluralrules/benches/pluralrules.rs Outdated Show resolved Hide resolved
@zbraniecki
Copy link
Member Author

Thank you for the first round of feedback! I applied most of it, and tried to keep comments open where I wanted the reviewer to verify my fix, while in trivial cases like spellings I closed it myself.

I left one open bit - the removal of conversion from f64 to PluralOperands, which I plan to remove before I merge it, but wanted to get another round of feedback and see if there's anything else left.

I'm going to open a number of issues based on all the comments - ranging from FixedDecimal and DataProvider integration, to certain optional performance optimizations (SmallVec) and other tweaks.

I'm looking now for anything that has not yet been mentioned and may require changes before this PR is mergable, so re-requesting reviews.

components/pluralrules/src/operands.rs Outdated Show resolved Hide resolved
components/pluralrules/benches/pluralrules.rs Outdated Show resolved Hide resolved
components/pluralrules/benches/pluralrules.rs Show resolved Hide resolved
components/pluralrules/src/lib.rs Show resolved Hide resolved
components/pluralrules/src/lib.rs Outdated Show resolved Hide resolved
components/pluralrules/src/lib.rs Show resolved Hide resolved
components/pluralrules/src/rules/ast.rs Outdated Show resolved Hide resolved
components/pluralrules/src/rules/ast.rs Outdated Show resolved Hide resolved
components/pluralrules/src/rules/ast.rs Outdated Show resolved Hide resolved
components/pluralrules/src/rules/resolver.rs Outdated Show resolved Hide resolved
@zbraniecki zbraniecki requested a review from sffc July 31, 2020 20:07
@zbraniecki
Copy link
Member Author

I think this is ready for re-review!

@zbraniecki zbraniecki requested a review from sffc August 1, 2020 07:59
sffc
sffc previously approved these changes Aug 7, 2020
components/pluralrules/src/rules/resolver.rs Show resolved Hide resolved
@zbraniecki
Copy link
Member Author

@filmil @echeran @nciric - ping?

@filmil
Copy link
Contributor

filmil commented Aug 7, 2020

I approved quite a while ago but am glad to reconfirm.

@zbraniecki zbraniecki merged commit c4dc724 into unicode-org:master Aug 12, 2020
@zbraniecki zbraniecki deleted the pluralrules branch October 19, 2020 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PluralRules
6 participants