-
-
Notifications
You must be signed in to change notification settings - Fork 8.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(v2): fix bad theme pluralization rules for some labels #4304
Conversation
[V1] Deploy preview success Built with commit d313d8c |
Size Change: 0 B Total Size: 532 kB ℹ️ View Unchanged
|
Deploy preview for docusaurus-2 ready! Built with commit d313d8c |
|
||
// Not ideal but good enough! | ||
// See doc of usePluralFormSector for reason! | ||
const BlogPostPlurals = (count) => ({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slorber actually a interesting and at the same time elegant native-based solution, thanks! Is it possible to "combine" all these plurals forms (options) into one message via (for example) some separator or extending of placeholder with certain keyword? Do you think it is worth it, or will it increase code complexity and its final size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lex111
I just upgraded the code with something that you may prefer, let me know if that looks fine to you
You can provide the Russian message with syntax similar to your Vue I18n example doc, like:
"theme.blog.post.plurals": "Few posts|Many posts|One post|{count} posts",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slorber wow! This is very cool, thanks, I checked locally, and it works fine! 👍 Only I am confused, why Intl for Ru locale defined 4 plurals forms, but not three? And If so, can I skip the last part ({count} записи|{count} записей|{count} запись
instead of {count} записи|{count} записей|{count} запись|{count} записей
)?
By the way, how do we handle zero values? It looks like at this time when passing zero values, interpolation does not work, do we need to define fallback for this case?
count = 0;
return selectMessage(
count,
translate(
{
id: 'theme.blog.post.plurals',
description: 'Pluralized label for one blog post',
message: 'One post|{count} posts',
},
{count},
),
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm curious why you decided to go with a custom syntax instead of ICU? Bc ICU handles cases like 0
natively e.g {count, plural, =0 {no post} one {one post} other {# posts}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lex111
Won't be able to answer your questions regarding the Russian language plural forms but it seems there are 4 forms: one few many other.
https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html
(will probably have to set a constant order, not sure if the order of that array is specified?)
By the way, how do we handle zero values?
It looks like Russian handles 0 values as the "many" plural category, do you see a problem using the many plural for zeros? Like showing a label such as "no post" but in Russian?
(we don't have the case for blog posts as count >= 1, but we may need this for search results)
It looks like at this time when passing zero values, interpolation does not work
I will take a look 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@longlho thanks for joining this review :)
Would appreciate your opinion on the choices I made.
Just to explain better the context:
- most Docusaurus sites are not translated
- most Docusaurus sites use a language with simple plural forms like English
- most Docusaurus sites don't have very complex custom pages, and if they do, the data is quite static and does not have any dynamic plural to handle
- we only have 2 labels in the classic theme to get translated
- we want a very lightweight solution
hmm curious why you decided to go with a custom syntax instead of ICU?
Because as far as I understand it, using ICU means supporting a larger set of i18n features like genders, number formatting etc, which most Docusaurus sites don't need, and require an ICU parser, reading CLDR data by using a parser too, and the associated Intl polyfills.
The current implementation gets the job done with less than 100 lines of code, and I think it's good enough for our use-case as we basically just want to pluralize just 2 labels:
- "{n} blog posts"
- "{n} search results"
Using react-intl
would add a decent amount of JS to the hello-world Docusaurus site that wouldn't benefit from it much. I considered using intl-messageformat
but thought it was overkill for our need and a simpler solution could work.
Note: using the syntax based on |
is not something we'll expose as a public API. It really is used to just translate 2 theme labels, not n user-provided labels. We won't tell the user that this even exist.
Instead we'll ask the user to add react-intl
to Docusauurs himself if he needs complex plurals in his homepage. But Home pages are generally fully static so it's quite unlikely this will be often needed.
Also, if too many users ask for pluralization for their custom pages, we can still add this later (we can move from a subset of ICU to full ICU, using react-intl
FormattedMessage
as an implementation detail, but I doubt many users will ask for this, and prefer to see if there is any traction at all before increasing the base size of a Docusaurus site (we could also add a config so that full ICU support is opt-in).
For me, the conditions for which a user might really benefit from full ICU is:
- They have custom pages
- They display dynamic labels on those pages (ie fetching data from a backend or a CMS? not the typical Docusaurus site)
- They handle multiple languages
Let me know if that makes sense to you.
If you tell me there's a way to support full ICU in less than 1kb, I'm open to any suggestion, but I think the min gz is more around 10kb (not including the polyfills, which will become less of a problem with polyfill services and browser support increasing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok that makes sense. In which case I don't think you should concat |
but rather separate them out like gettext
style basically (so have 1 blog posts
& {n} blog posts
as 2 separate strings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@longlho this is what I initially had, but this is annoying to maintain and less convenient:
const BlogPostPlurals = (count) => ({
one: translate(
{
id: 'theme.blog.post.plurals.one',
description: 'Pluralized label for one blog post',
message: 'One post',
},
{count},
),
two: translate(
{
id: 'theme.blog.post.plurals.two',
description: 'Pluralized label for two blog posts',
message: '{{count}} posts',
},
{count},
),
few: translate(
{
id: 'theme.blog.post.plurals.few',
description: 'Pluralized label for few blog posts',
message: '{{count}} posts',
},
{count},
),
many: translate(
{
id: 'theme.blog.post.plurals.many',
description: 'Pluralized label for many blog posts',
message: '{{count}} posts',
},
{count},
),
other: translate(
{
id: 'theme.blog.post.plurals.other',
description: 'Pluralized label for other blog posts',
message: '{{count}} posts',
},
{count},
),
});
// Very simple pluralization: probably good enough for now
function useBlogPostPlural(count: number): string {
const pluralForm = usePluralForm(count);
return BlogPostPlurals(count)[pluralForm];
}
The reason is because I'd like to have all the theme translations to be statically analysable (ie no key like key.${pluralForm}
.
The goal is to see if we can later add a babel plugin that removes the translation runtime and the messages.json bundle. IE the final bundle would not contain EN in code + FR in messages.json, but just FR in code, and we won't have to put all the messages in a single chunk or figure out a way to do i18n messages code splitting. This is not a high-priority optimization but it would be nice if we have the right constraints now to see if it's worth to work on this later.
This is not ideal but for 2 or 3 labels and a private API that do not seem to me like a big deal :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I mean you'd still declare only one
& other
, translates to multiple cases like ru
and runtime knows to swap that in. Key can still be static but effectively you'd deal w/ an array of strings instead of a single string. That's what gettext
basically does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will keep current system for now and see if we need to change it later. as long as it's not public that should not be too hard to change if this becomes a problem
⚡️ Lighthouse report for the changes in this PR:
Lighthouse ran on https://deploy-preview-4304--docusaurus-2.netlify.app/classic/ |
…vide the translations
Motivation
As discussed previously in #4295
We need to handle pluralization of some labels.
Some languages like Russian have complex rules.
Most Docusaurus sites will be in English, or will use languages with simpler pluralization rules, so it wouldn't be fair to add unnecessary JS code that wouldn't be used by 95% of the Docusaurus sites
We have only 3 Docusaurus labels for which we need to handle pluralization.
We don't want to introduce too much complexity just for that, so we really are looking for a good enough solution here.
https://v2.docusaurus.io/blog/tags/recap :
https://v2.docusaurus.io/search?q=test
The rest of the theme is quite "static", and users are unlikely to have too much dynamic values either, unless they implement their own plugins or fetch remote backend data that is dynamic. In those cases the user should decide if he ants
This solution is the smallest one I could think of to solve the problem and get the theme labels translated properly by default.
The tradeoff is that older browsers without Intl.PluralRules support will fallback to English rules, so a Russian user on IE11 or older Edge will only get an approximate result. In my opinion it is not a big deal.
Test
https://deploy-preview-4304--docusaurus-2.netlify.app/classic/blog/tags/recap
https://deploy-preview-4304--docusaurus-2.netlify.app/classic/fr/blog/tags/recap
https://deploy-preview-4304--docusaurus-2.netlify.app/classic/ru/blog/tags/recap
Will have to test better some older browsers to see the potential impact