Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Date]TimeFormatter constructors don't take hour12 override #3881

Open
hsivonen opened this issue Aug 17, 2023 · 15 comments
Open

[Date]TimeFormatter constructors don't take hour12 override #3881

hsivonen opened this issue Aug 17, 2023 · 15 comments
Assignees
Labels
C-datetime Component: datetime, calendars, time zones S-small Size: One afternoon (small bug fix or enhancement) T-core Type: Required functionality U-ecma402 User: ECMA-402 compatibility

Comments

@hsivonen
Copy link
Member

ECMA-402 has hour12 and hourCycle overrides for the locale's hour cycle. AFAICT, ICU4X [Date]TimeFormatter constructors don't have API surface for this override. For ECMA-402 compat, ICU4X should have API surface for this.

@hsivonen hsivonen added C-datetime Component: datetime, calendars, time zones U-ecma402 User: ECMA-402 compatibility labels Aug 17, 2023
@hsivonen
Copy link
Member Author

Maybe hourCycle is available via -u-hc-? Still leaves hour12 (which happens to be what appears in various preference UIs).

@hsivonen
Copy link
Member Author

hourCycle is in icu_datetime::options::components::Bag.

@hsivonen hsivonen changed the title [Date]TimeFormatter constructors don't take hour12 / hourCycle override [Date]TimeFormatter constructors don't take hour12 override Aug 17, 2023
@sffc
Copy link
Member

sffc commented Aug 17, 2023

-u-hc- works.

I don't think we have hour12 except that you can mostly get that via -u-hc- but you need to choose between h11/h12 or h23/h24 instead of letting the locale choose.

@hsivonen
Copy link
Member Author

Letting the locale affect hour12 expansion to hourCycle seems required for ECMA-402 compliance.

AFAICT, by the time the [Date]TimeFormatter constructor has done the provider-level resolution of the locale data, there doesn't appear to be a way for the application code to modify the instance to perform hour12 resolution in application code.

@hsivonen
Copy link
Member Author

macOS Ventura and Gnome expose a boolean system pref for this topic, so hour12 might be relevant to support to be able to honor system preferences. I don't have test code that I'd know would behave according to the system boolean pref semantics. However, from reading the ECMA-402 spec, I'm a bit surprised at how the ECMA-402 semantics are supposed to work. AFAICT, hour12=false for en-US would resolve to h24. Do users actually want that result instead of h23? (As a user of the en-US locale for untranslated strings with 24-hour clock enabled, I don't want h24. I guess I'll need to make a TODO item of observing the system clock at midnight.)

@hsivonen
Copy link
Member Author

Looking at https://github.com/unicode-org/cldr-json/blob/80a94b0f6c3a34d6e2dc0dca8639a54babc87f94/cldr-json/cldr-core/supplemental/timeData.json#L4 , I observe:

  1. The preferred cycle for each locale is either h or H, i.e. h12 or h23.
  2. k and K do not appear in allowed cycles. hb and hB do, but I don't find a spec explaining what they mean.

Given that the preferred cycle for each locale is either h12 or h23, it's unclear to me what problem h11 and h24 or hour12 expanding to h11 or h24 solve.

@anba
Copy link

anba commented Aug 18, 2023

However, from reading the ECMA-402 spec, I'm a bit surprised at how the ECMA-402 semantics are supposed to work.

The current spec is incorrect. There's a PR to fix this → tc39/ecma402#758.

@hsivonen
Copy link
Member Author

However, from reading the ECMA-402 spec, I'm a bit surprised at how the ECMA-402 semantics are supposed to work.

The current spec is incorrect. There's a PR to fix this → tc39/ecma402#758.

Thanks. Given that change and the non-existence of k-default and K-default locales, I guess one option would be to have ICU4X ECMA-402 wrapper code hard-code hour12=true to h12 and hour12=false to h23, and close this API request as WONTFIX.

@anba
Copy link

anba commented Aug 18, 2023

2. k and K do not appear in allowed cycles. hb and hB do, but I don't find a spec explaining what they mean.

Japan has Khttps://github.com/unicode-org/cldr/blob/343bde9e7e8d6cf6f2c57e257fa4f074df970311/common/supplemental/supplementalData.xml#L4888

b and B are day period markers: https://unicode.org/reports/tr35/tr35-dates.html#dfst-period

@hsivonen
Copy link
Member Author

  1. k and K do not appear in allowed cycles. hb and hB do, but I don't find a spec explaining what they mean.

Japan has Khttps://github.com/unicode-org/cldr/blob/343bde9e7e8d6cf6f2c57e257fa4f074df970311/common/supplemental/supplementalData.xml#L4888

Oops. I missed that. So: K is allowed in one locale but isn't the default anywhere and k is specced for completeness and isn't in use anywhere?

b and B are day period markers: https://unicode.org/reports/tr35/tr35-dates.html#dfst-period

Thanks.

@anba
Copy link

anba commented Aug 18, 2023

Oops. I missed that. So: K is allowed in one locale but isn't the default anywhere and k is specced for completeness and isn't in use anywhere?

Yes. K isn't the default hour-cycle for Japan per <timeData>/<hours>, but when selecting {hour: "numeric", hour12: true}, the resolved pattern will contain K, see here. That also means it's not possible to replace hour12=true with hourCycle=h12.


For example new Intl.DateTimeFormat("en", {hour:"numeric"}) can be customised as follows:

Options Skeleton Resolved Pattern Final Pattern
{hour:"numeric"} j h a h a
{hour:"numeric", hour12: true} h h a h a
{hour:"numeric", hour12: false} H HH HH
{hour:"numeric", hourCycle: "h11"} h h a K a
{hour:"numeric", hourCycle: "h12"} h h a h a
{hour:"numeric", hourCycle: "h23"} H HH HH
{hour:"numeric", hourCycle: "h24"} H HH kk

And new Intl.DateTimeFormat("ja", {hour:"numeric"}) can be customised as follows:

Options Skeleton Resolved Pattern Final Pattern
{hour:"numeric"} j H時 H時
{hour:"numeric", hour12: true} h aK時 aK時
{hour:"numeric", hour12: false} H H時 H時
{hour:"numeric", hourCycle: "h11"} h aK時 aK時
{hour:"numeric", hourCycle: "h12"} h aK時 ah時
{hour:"numeric", hourCycle: "h23"} H H時 H時
{hour:"numeric", hourCycle: "h24"} H H時 k時

In an input skeleton, h is automatically matched to either h or K in the resolved pattern. Similarly, H is matched to either H or k.

Spec:


The allowed strings in <timeData>/<hours> are mostly relevant for the C skeleton, so it's not yet relevant ECMA-402 date-time formatting. (Spec: https://unicode.org/reports/tr35/tr35-dates.html#availableFormats_appendItems)

They're possibly relevant for the stage-3 "Intl Locale Info" proposal. There's a HourCyclesOfLocale operation, which is spec'ed to return the hour-cycle formats which are in "common use for date and time formatting". So this operation could return the allowed values from <timeData>/<hours>.

ICU4C doesn't have a public API to retrieve the allowed values, though. Instead it's necessary to manually read the resource data, cf. DateTimeFormat::GetAllowedHourCycles.

@hsivonen
Copy link
Member Author

Yes. K isn't the default hour-cycle for Japan per <timeData>/<hours>, but when selecting {hour: "numeric", hour12: true}, the resolved pattern will contain K, see here. That also means it's not possible to replace hour12=true with hourCycle=h12.

Thanks. So ICU4X is currently missing a way to handle hour12 in a data-driven way.

Just so that I understand the feasibility of hard-coded special cases if this issue isn't addressed in ICU4X itself: It would be possible for ECMA-402 implementation glue code to get correct results (with the scope of what's known about what is in CLDR) by expanding the boolean hour12 and the boolean "region is JP" to hourCycle, right?

That is:

if hour12 {
  if region_of_locale_is_JP {
    h11
  } else {
    h12
  }
} else {
  h23
}

They're possibly relevant for the stage-3 "Intl Locale Info" proposal. There's a HourCyclesOfLocale operation, which is spec'ed to return the hour-cycle formats which are in "common use for date and time formatting". So this operation could return the allowed values from <timeData>/<hours>.

The rendered spec that you linked to has HourCyclesOfLocale, but the README claims "Hour Cycle DROPPED by Champion". @FrankYFTang , is the current intention to include or exclude HourCyclesOfLocale?

ICU4C doesn't have a public API to retrieve the allowed values, though. Instead it's necessary to manually read the resource data, cf. DateTimeFormat::GetAllowedHourCycles.

I don't see any non-test callers for that method. What am I missing?

@anba
Copy link

anba commented Aug 21, 2023

Just so that I understand the feasibility of hard-coded special cases if this issue isn't addressed in ICU4X itself: It would be possible for ECMA-402 implementation glue code to get correct results (with the scope of what's known about what is in CLDR) by expanding the boolean hour12 and the boolean "region is JP" to hourCycle, right?

It needs to be hard-coded on the language, not the region, because the date-time patterns are in https://github.com/unicode-org/cldr/blob/main/common/main/ja.xml.

I don't see any non-test callers for that method. What am I missing?

Only the parts relevant for the "Unified Intl API" work (bug 1686965) have been committed in bug 1693576. The rest will be put up for review when the open issues in the proposal have been resolved.

@sffc
Copy link
Member

sffc commented Aug 21, 2023

There is some interesting code to handle some of this resolution logic in components/datetime/src/pattern/hour_cycle.rs

I also observe that we already have the preferred hour cycle (h11h12 or h23h24) in ICU4X data: https://github.com/unicode-org/icu4x/blob/main/provider/datagen/tests/data/json/datetime/timelengths%401/en.json\

So I think everything is here to support hour12 if we were to add it to an options bag somewhere.

@jufemaiz
Copy link

Just to make this clearer for those playing along here. Japan is the only country that allows support for the use of K value for times.

<hours preferred="H" allowed="H K h" regions="JP"/>

Ref: https://github.com/unicode-org/cldr/blob/343bde9e7e8d6cf6f2c57e257fa4f074df970311/common/supplemental/supplementalData.xml#L4888

The options are h, H, K, k and are defined as such:

image https://unicode.org/reports/tr35/tr35-dates.html#dfst-hour

Currently ECMA spec incorrectly assumes a coupling of h-k and H-K. That is the following is baked in as implicit assumption:

  • twelve hour time presented with hours 0-11/00-11 (K) will present twenty four hour time as 00-23 (H)
  • twelve hour time presented with hours 1-12/01-12 (h) will present twenty four hour time as 01-24 (k)

The ECMA standard definitely needs to change as the current implementation is a bug. The universal (as far as I've been able to determine) rejection of k, and the only occasional adoption of K as an option renders the above assumption absolutely incorrect, and realistically should have been identified prior to publication. tc39/ecma402#758 has identified a solution that expands how 12-hour and 24-hour time is presented at a regional level. Work is ongoing to get this to a point of acceptance. This is slated for 2023-09 TC39 meeting.

@sffc sffc added the S-small Size: One afternoon (small bug fix or enhancement) label Sep 21, 2023
@sffc sffc added this to the 1.x Priority ⟨P2⟩ milestone Sep 21, 2023
@sffc sffc added the T-core Type: Required functionality label Sep 21, 2023
@sffc sffc self-assigned this Nov 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-datetime Component: datetime, calendars, time zones S-small Size: One afternoon (small bug fix or enhancement) T-core Type: Required functionality U-ecma402 User: ECMA-402 compatibility
Projects
None yet
Development

No branches or pull requests

4 participants