We should treat Simplified Chinese and Traditional Chinese as two completely separate languages. There's obviously more nuance to it than that, but echoing #6 and #80, it's super important to distinguish between the two very different overarching types of Chinese.
It's close to saying "well both English and German look the same to me so they must be interoperable" 😆
A code of zh conventionally refers to Simplified Chinese.
# Common Simplified Chinese Codes
zh
zh-Hans
zh-CN (Mainland China variant - Historically used for all Simplified Chinese)
# Common Traditional Chinese Codes
zh-Hant
zh-TW (Taiwan variant - Historically used for all Traditional Chinese)
zh-HK (Hong Kong variant)
If it helps you from a training data point of view, they have two distinct ISO 15924 script codes (Hans vs. Hant).
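To make the mapping concrete, here is a minimal sketch of how a language tag could be resolved to a script before picking training data or a model. The function name `chinese_script` and the table `_REGION_TO_SCRIPT` are hypothetical and not part of this library; the region defaults simply restate the lists above, and the fallback for a bare `zh` follows the convention mentioned earlier rather than any formal rule.

```python
from typing import Optional

# Hypothetical lookup table: region subtags and the script they
# conventionally imply, taken from the code lists in this issue.
_REGION_TO_SCRIPT = {"CN": "Hans", "TW": "Hant", "HK": "Hant"}


def chinese_script(tag: str) -> Optional[str]:
    """Return "Hans" or "Hant" for a Chinese language tag, else None."""
    parts = tag.replace("_", "-").split("-")
    if parts[0].lower() != "zh":
        return None  # not a Chinese tag at all

    # An explicit four-letter script subtag wins over any region.
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha():
            script = sub.title()
            # Any script other than Hans/Hant is out of scope here.
            return script if script in ("Hans", "Hant") else None

    # Otherwise fall back to the region's conventional script.
    for sub in parts[1:]:
        if sub.upper() in _REGION_TO_SCRIPT:
            return _REGION_TO_SCRIPT[sub.upper()]

    # Assumption: a bare "zh" is treated as Simplified, per convention.
    return "Hans"
```

With this sketch, `chinese_script("zh-TW")` and `chinese_script("zh-Hant")` both resolve to `"Hant"`, while `chinese_script("zh")` and `chinese_script("zh-CN")` resolve to `"Hans"`, so the two variants can be routed to separate models.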
Thank you for this clarification. I wasn't aware that Simplified and Traditional Chinese are as different as, for instance, English and German. I will try to find better training data for each variant and let the library handle the variants separately.