Description
(edited to reduce scope)
/x/text/language lets you query Region and Script:
func (t Tag) Region() (Region, Confidence)
func (t Tag) Script() (Script, Confidence)
There should also be a way to query the Number System that Go has matched for the locale. Currently, this can be done with the Extensions method on a language.Tag, but only if the number system has been specified explicitly in the locale string. Otherwise, there is no easy way to know which default Number System was chosen.
This is probably best implemented by golang.org/x/text/number exporting a function called "SystemFromTag". This should return a number.System, which should implement the fmt.Stringer interface.
This is already done in /x/text/internal/number (InfoFromTag), so the new function would be a trivial wrapper.
In future, a number.System could export other useful information that the internal number.InfoFromTag exposes, but this is not necessary for now.
Use case
This would let client code match the number system selected for a locale by /x/text, and if it needs any more information, look it up in the Unicode CLDR data, which keeps it in a single simple file at supplemental/numberingsystems.xml.
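As a rough illustration of that lookup, here is a minimal sketch using only the standard library. The embedded XML is a hand-abbreviated excerpt written in the shape of the CLDR numbering systems file, and the names loadSystems, numberingSystem and supplementalData are made up for this example; treat the excerpt's contents as an assumption and check them against the real file.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// excerpt is a hand-abbreviated sample in the shape of CLDR's numbering
// systems file (an assumption for this sketch); the real file lists many more.
const excerpt = `<supplementalData>
  <numberingSystems>
    <numberingSystem id="arab" type="numeric" digits="٠١٢٣٤٥٦٧٨٩"/>
    <numberingSystem id="latn" type="numeric" digits="0123456789"/>
    <numberingSystem id="taml" type="algorithmic" rules="tamil"/>
    <numberingSystem id="tamldec" type="numeric" digits="௦௧௨௩௪௫௬௭௮௯"/>
  </numberingSystems>
</supplementalData>`

// numberingSystem mirrors the attributes of one <numberingSystem> element.
type numberingSystem struct {
	ID     string `xml:"id,attr"`
	Type   string `xml:"type,attr"`
	Digits string `xml:"digits,attr"`
	Rules  string `xml:"rules,attr"`
}

// supplementalData collects every <numberingSystem> under <numberingSystems>.
type supplementalData struct {
	Systems []numberingSystem `xml:"numberingSystems>numberingSystem"`
}

// loadSystems parses the XML and indexes the entries by identifier.
func loadSystems(data string) (map[string]numberingSystem, error) {
	var doc supplementalData
	if err := xml.Unmarshal([]byte(data), &doc); err != nil {
		return nil, err
	}
	byID := make(map[string]numberingSystem, len(doc.Systems))
	for _, s := range doc.Systems {
		byID[s.ID] = s
	}
	return byID, nil
}

func main() {
	systems, err := loadSystems(excerpt)
	if err != nil {
		panic(err)
	}
	// The key is the identifier that the proposed String method would return.
	for _, id := range []string{"latn", "arab", "taml", "tamldec"} {
		s := systems[id]
		fmt.Printf("%s: type=%s digits=%q rules=%q\n", id, s.Type, s.Digits, s.Rules)
	}
}
```

The point is that once client code has the identifier string, everything else is a map lookup; the hard part is getting the identifier.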
Without this, client code has to go through the much more involved process of re-parsing the locale string and re-implementing the mapping from a locale to its default number system, including the hierarchy of parent locales.
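To give a sense of what that reimplementation involves, here is a deliberately simplified sketch using only the standard library. All function names are hypothetical, the defaults table is illustrative data taken from the expected outputs in this issue rather than from CLDR, and the parent chain is approximated by dropping trailing subtags, which ignores CLDR's explicit parent-locale exceptions.

```go
package main

import (
	"fmt"
	"strings"
)

// defaults is illustrative data only (an assumption for this sketch);
// a real table would be generated from CLDR.
var defaults = map[string]string{
	"root": "latn",
	"ar":   "arab",
}

// explicitSystem returns the value of the "nu" key in the tag's -u-
// extension, or "" if there is none. It handles only simple, well-formed tags.
func explicitSystem(tag string) string {
	subtags := strings.Split(strings.ToLower(tag), "-")
	inU := false
	for i, s := range subtags {
		switch {
		case len(s) == 1:
			// A singleton starts a new extension; only "u" is of interest.
			inU = s == "u"
		case inU && s == "nu" && i+1 < len(subtags) && len(subtags[i+1]) > 2:
			return subtags[i+1]
		}
	}
	return ""
}

// baseLocale strips any extensions (everything from the first singleton on).
func baseLocale(tag string) string {
	subtags := strings.Split(strings.ToLower(tag), "-")
	for i, s := range subtags {
		if len(s) == 1 {
			return strings.Join(subtags[:i], "-")
		}
	}
	return strings.Join(subtags, "-")
}

// defaultSystem walks from the locale towards "root" by dropping trailing
// subtags until the table has an entry.
func defaultSystem(locale string) string {
	loc := locale
	for loc != "" {
		if sys, ok := defaults[loc]; ok {
			return sys
		}
		i := strings.LastIndex(loc, "-")
		if i < 0 {
			break
		}
		loc = loc[:i]
	}
	return defaults["root"]
}

// numberSystem prefers an explicit "nu" value and falls back to the default.
func numberSystem(tag string) string {
	if sys := explicitSystem(tag); sys != "" {
		return sys
	}
	return defaultSystem(baseLocale(tag))
}

func main() {
	for _, tag := range []string{"en-GB", "ar", "ar-u-nu-latn", "ta-u-nu-tamldec"} {
		fmt.Printf("%s -> %s\n", tag, numberSystem(tag))
	}
}
```

Even this toy version duplicates logic that /x/text already has internally, and it can silently drift from the library's choice whenever the CLDR data or the matching rules change.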
Example Usage
package main
import (
	"fmt"
	"golang.org/x/text/language"
	"golang.org/x/text/message"
	"golang.org/x/text/number"
)
func main() {
	ts := []language.Tag{
		language.MustParse("en-GB"),
		language.MustParse("en-GB-u-nu-fullwide"),
		language.MustParse("ar"),
		language.MustParse("ar-u-nu-latn"),
		language.MustParse("ta"),
		language.MustParse("ta-u-nu-taml"),
		language.MustParse("ta-u-nu-tamldec"),
	}
	for _, t := range ts {
		fmt.Printf("%s\n", t.String())
		r, _ := t.Region()
		fmt.Printf("%s, %s\n", r.String(), r.ISO3())
		s, _ := t.Script()
		fmt.Printf("%s\n", s.String())
		message.NewPrinter(t).Println(number.Decimal(123456789))
		// PROPOSED:
		// n, _ := number.SystemFromTag(t)
		// fmt.Printf("%s\n", n.String())
		fmt.Println("---")
	}
	// Expected Outputs:
	// en-GB
	// GB, GBR
	// Latn
	// 123,456,789
	// latn
	// ---
	// en-GB-u-nu-fullwide
	// GB, GBR
	// Latn
	// 123,456,789
	// fullwide
	// ---
	// ar
	// EG, EGY
	// Arab
	// ١٢٣٬٤٥٦٬٧٨٩
	// arab
	// ---
	// ar-u-nu-latn
	// EG, EGY
	// Arab
	// 123,456,789
	// latn
	// ---
	// ta
	// IN, IND
	// Taml
	// 12,34,56,789
	// latn
	// ---
	// ta-u-nu-taml
	// IN, IND
	// Taml
	// 12,34,56,789 // taml is not a decimal format, so ignore this line
	// taml
	// ---
	// ta-u-nu-tamldec
	// IN, IND
	// Taml
	// ௧௨,௩௪,௫௬,௭௮௯
	// tamldec
	// ---
}
Example implementation
/x/text/number should change as follows:
// System holds information about a numbering system.
type System struct {
	info number.Info // from /x/text/internal/number
}
// SystemFromTag returns the numbering system for the given language tag. If
// the tag does not specify one explicitly (as "en-u-nu-mathbold" does), the
// most likely candidate is inferred. This is subject to change.
func SystemFromTag(t language.Tag) (System, language.Confidence) {
	var confidence language.Confidence // TODO select a Confidence
	return System{info: number.InfoFromTag(t)}, confidence
}
// String returns the BCP 47 U Extension representation for the Number System Identifier.
func (s System) String() string {
	....
}
Open questions
- The documentation for Region.String says it returns "ZZ" for an unspecified region. Script.String returns "Zzzz" for an unspecified script. Would SystemFromTag ever fail to return a numbering system? Could it in the future? If so, what should that system's string representation be? Probably just the default, with an appropriate "No" confidence value?
- /x/text/number currently doesn't support number system categories at all - e.g. "tamil-u-nu-native", "tamil-u-nu-traditio" or "zh-u-nu-finance" - only explicit matches e.g. "tamil-u-nu-tamldec". Should this be implemented first? It would probably impact the returned Confidence value. (See x/text/number: understands specific BCP-47 u-nu-extensions, but not general categories #54090)
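One possible answer to the first question, sketched with stand-in types so it runs with the standard library alone. Confidence here only mirrors the constant names of language.Confidence, System and resolve are hypothetical, and both the choice of High for an inferred system and "latn" as the last-resort default are assumptions, not a statement of what /x/text does.

```go
package main

import "fmt"

// Confidence is a local stand-in for language.Confidence.
type Confidence int

const (
	No Confidence = iota
	Low
	High
	Exact
)

// System is a hypothetical stand-in for the proposed number.System.
type System struct{ id string }

// String returns the numbering system identifier, e.g. "latn".
func (s System) String() string { return s.id }

// resolve is hypothetical: explicit is the "nu" value from the tag (may be
// empty), inferred is the locale default (may be empty if none is known).
func resolve(explicit, inferred string) (System, Confidence) {
	switch {
	case explicit != "":
		return System{explicit}, Exact
	case inferred != "":
		return System{inferred}, High // assumption: inferred defaults rate High
	default:
		// Assumption: fall back to "latn" and signal the guess with No.
		return System{"latn"}, No
	}
}

func main() {
	s, c := resolve("", "")
	fmt.Println(s, c == No)
}
```

With this shape, callers that only print the system never see an empty string, while callers that care can still detect the fallback through the Confidence value.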