Replies: 3 comments 1 reply
-
I agree @LPX55 - we as maintainers need to be more vigilant about this. We currently have a This is as important in code reviews - we as an engineering workstream need to be judicious about allowing new keys to be added. There is a second issue we have, too: the structure of the JSON object is inconsistent - each section and feature has its own convention. E.g. sometimes modals all come under one key, with sub-keys such as We could write up a bounty to get this all fixed up - we'd then need to stay on top of ensuring it does not regress. Maybe any PR that touches a language file requires a review from the Globalization workstream?
-
All good points @0xApotheosis and @LPX55. I agree the state of the translation files is a bit of a mess. Adding an additional required approval from a workstream not under our purview seems likely to make it a struggle to get anything merged. We can certainly tag them for a glance if they're available, but I don't think we should hold up PRs. I've spoken a few times about a simple script, which could even run on CI, that treats the structure of the English translations as the source of truth and identifies missing keys/values in the other translations. We could easily have a Ideally we get someone to spike on evaluating a tool that can do this, i.e. English is our source of truth, here's the master file, we have n target languages, please update the targets as the master changes. We should source translations from anyone on the internet. We shouldn't really be limited to the Globalization workstream, bar some industry-specific jargon. @LPX55 you're a polyglot (and can code too) - given that the Globalization workstream initially proposed this stuff and provided the translations, would you be willing to do this spike for a service that we can rely on to help maintain this? It was never really on our original roadmap, but is definitely up your alley.
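For illustration, here is a rough sketch of the kind of script described above, not an existing tool in the repo. It assumes the language files are nested JSON objects whose leaves are strings; the names `TranslationTree`, `flattenTranslations` and `diffTranslations`, and the sample keys, are all made up for this example.

```typescript
// Assumed shape: nested objects whose leaves are translated strings.
type TranslationTree = { [key: string]: string | TranslationTree }

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key)

// Flatten a nested tree into dotted paths, e.g. "modals.send.title".
function flattenTranslations(
  tree: TranslationTree,
  prefix = '',
  out: Record<string, string> = {},
): Record<string, string> {
  for (const key of Object.keys(tree)) {
    const value = tree[key]
    const path = prefix ? `${prefix}.${key}` : key
    if (typeof value === 'string') out[path] = value
    else flattenTranslations(value, path, out)
  }
  return out
}

// Treat `source` (English) as the source of truth and report:
// - missing: keys absent from the target, or present with an empty value
// - stale: keys in the target that no longer exist in the source
function diffTranslations(source: TranslationTree, target: TranslationTree) {
  const src = flattenTranslations(source)
  const tgt = flattenTranslations(target)
  const missing = Object.keys(src).filter(
    key => !hasOwn(tgt, key) || tgt[key].trim() === '',
  )
  const stale = Object.keys(tgt).filter(key => !hasOwn(src, key))
  return { missing, stale }
}

// Hypothetical sample data, not taken from the real language files.
const en: TranslationTree = {
  common: { cancel: 'Cancel', confirm: 'Confirm' },
  modals: { send: { title: 'Send' } },
}
const fr: TranslationTree = {
  common: { cancel: 'Annuler', confirm: '' },
  modals: { receive: { title: 'Recevoir' } },
}

console.log(diffTranslations(en, fr))
```

On CI, something like this could fail the build (or just post a warning) whenever `missing` or `stale` is non-empty for any target language. Whether it should block merges is a separate call.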
-
Maybe this goes too far for this refactoring, but from my previous experience with CMSs, and specifically Drupal, the locale strings for the UI are handled using the full English string as the "key", and an additional "context" is provided to avoid both repetition and confusion between similar terms that have different meanings. This context is specified only if the word can be ambiguous. They explain it well in their documentation; their example with the word "May" shows why it is useful in ambiguous cases.
The idea of handling similar terms under a "common" key makes me think it would be hard to track the actual context where the string will be used, or whether there is an ambiguity. In the end, that is what matters for everyone involved (developers/translators/users): what the word or sentence is supposed to mean in a specific context, or whether it's a generic word without context.
In this system, strings that are untranslated, or that lack a translation for a specific context, are returned as is (in English), so it covers the case where a release happens before every language has been translated.
They do use a different way to handle the extraction/translation of strings (.po files, which is interesting but most likely out of scope here), but I think the concept could probably apply here too while still using JSON files. There don't seem to be many limitations on what a key in JSON can consist of, and as far as I know the same applies to object property names in JS. It also makes things much more "readable" for developers because they don't have to check what the key refers to; it's already expressed in plain English.
On a side note, this Drupal documentation page illustrates something that might come in handy for us at some point: plural forms. In the strings I have translated so far, there were a couple of instances where it could have been useful.
The more we need to display counts/amounts, and the more plurals are needed for nouns/verbs, the more near-duplicate strings we will end up with if we don't support plural forms out of the box. But maybe this should be handled in a separate discussion/issue.
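To make the two ideas above a bit more concrete, here is a minimal sketch under stated assumptions. It is not Drupal's API and not anything that exists in this codebase: the `Catalogue` shape, the `translate` and `pluralize` helpers, the context labels and the sample entries are all hypothetical. The only real API used is the standard `Intl.PluralRules`.

```typescript
// Hypothetical catalogue: English source string -> context -> translation.
// The '' context holds the translation used when no context is given.
type Catalogue = Record<string, Record<string, string>>

const frCatalogue: Catalogue = {
  May: { 'month name': 'mai' },
  Cancel: { '': 'Annuler' },
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key)

// Look up by English string plus optional context.
// Anything untranslated falls back to the English source string.
function translate(catalogue: Catalogue, source: string, context = ''): string {
  if (hasOwn(catalogue, source) && hasOwn(catalogue[source], context)) {
    return catalogue[source][context]
  }
  return source
}

// Plural categories as defined by Unicode CLDR.
type PluralCategory = 'zero' | 'one' | 'two' | 'few' | 'many' | 'other'
type PluralForms = { other: string } & Partial<Record<PluralCategory, string>>

// Pick the form for `count` using the locale's plural rules,
// then substitute the (made-up) {count} placeholder.
function pluralize(locale: string, count: number, forms: PluralForms): string {
  const category = new Intl.PluralRules(locale).select(count) as PluralCategory
  const template = forms[category] || forms.other
  return template.replace('{count}', String(count))
}

console.log(translate(frCatalogue, 'May', 'month name')) // translated
console.log(translate(frCatalogue, 'May', 'modal verb')) // falls back to English
console.log(pluralize('en', 1, { one: '{count} asset', other: '{count} assets' }))
console.log(pluralize('en', 3, { one: '{count} asset', other: '{count} assets' }))
```

English only needs `one` and `other`, but some languages use more categories (`few`, `many`, and so on), which is why the forms are keyed by category rather than by a singular/plural pair.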
-
The locale files are beginning to get unnecessarily large, and it seems that new strings are just being appended mindlessly. This causes confusion for the translators, and the ever-growing size of the file is making it harder to manage even with an i18n/l10n translation platform.
There are also keys and strings that come up repeatedly throughout the file, especially common terms used throughout the platform. The following is a small list of the many terms that we keep seeing come up again and again:
etc.
I propose a general "glossary" of commonly used terms whose meaning has little to no chance of changing with context, as well as possibly beginning to split up these locale files along logical lines. This will require some back-tracking from the developers to find each use of a locale key and update it, which sounds like a pain in the ass, but the continual unchecked growth of this single file is unsustainable and will likely cause more headaches in the future.
Open to ideas and suggestions.
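As a possible starting point for the glossary, a small script could list strings that appear under more than one key. This is only a sketch: it assumes the locale file is nested JSON with string leaves, it compares values case-insensitively, and `LocaleTree`, `glossaryCandidates` and the sample data are invented for the example.

```typescript
// Assumed shape: nested objects whose leaves are strings.
type LocaleTree = { [key: string]: string | LocaleTree }

// Walk the tree and group dotted key paths by their normalized string value.
function collectByValue(
  tree: LocaleTree,
  prefix: string,
  out: Map<string, string[]>,
): void {
  for (const key of Object.keys(tree)) {
    const value = tree[key]
    const path = prefix ? `${prefix}.${key}` : key
    if (typeof value === 'string') {
      const normalized = value.trim().toLowerCase()
      const paths = out.get(normalized) || []
      paths.push(path)
      out.set(normalized, paths)
    } else {
      collectByValue(value, path, out)
    }
  }
}

// Return [string, keyPaths] pairs for strings used at least `minUses` times,
// most frequently repeated first.
function glossaryCandidates(
  tree: LocaleTree,
  minUses = 2,
): Array<[string, string[]]> {
  const byValue = new Map<string, string[]>()
  collectByValue(tree, '', byValue)
  const result: Array<[string, string[]]> = []
  byValue.forEach((paths, value) => {
    if (paths.length >= minUses) result.push([value, paths])
  })
  return result.sort((a, b) => b[1].length - a[1].length)
}

// Hypothetical sample, not the real locale file.
const sample: LocaleTree = {
  send: { cancel: 'Cancel', confirm: 'Confirm' },
  receive: { cancel: 'Cancel' },
  trade: { cancelButton: 'cancel', confirm: 'Confirm', title: 'Trade' },
}

console.log(glossaryCandidates(sample))
```

The output is only a list of candidates. Someone still has to decide, per term, whether the meaning really is the same in every place it is used before moving it into a shared glossary.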