Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DateTimeFormatter still lacks power user APIs #3347

Open
sffc opened this issue Apr 19, 2023 · 9 comments
Open

DateTimeFormatter still lacks power user APIs #3347

sffc opened this issue Apr 19, 2023 · 9 comments
Assignees
Labels
A-design Area: Architecture or design C-datetime Component: datetime, calendars, time zones U-google User: Google 1st party

Comments

@sffc
Copy link
Member

sffc commented Apr 19, 2023

In #380 and the corresponding design doc, I suggested that DateTimeFormatter have multiple layers:

At a very basic level, I think we should split DateTimeFormat as follows:

  • DateTimeFormat: top-level, all-inclusive kitchen sink

    • Contains one or more DateTimePatternInterpolator and one or more TimeZoneFormat
    • Contains constructors for each of the types of DateTimeFormat we want to build
  • DateTimePatternInterpolator: formats a specific pattern

    • Contains a DataPayload for a single pattern being used (zero-alloc when possible)
    • Does not support time zones directly; needs a TimeZoneFormat
  • TimeZoneFormat: formats a time zone

    • (already done)

Internally, we will likely want additional classes to do things like skeleton resolution. This does not need to be public API. An argument could be made that DateTimePatternInterpolator should be internal as well.

It seems this got dropped and not resolved in that older issue.

@sffc sffc added A-design Area: Architecture or design discuss Discuss at a future ICU4X-SC meeting C-datetime Component: datetime, calendars, time zones labels Apr 19, 2023
@sffc
Copy link
Member Author

sffc commented Apr 21, 2023

Relatedly: I'm warming up to the idea that the DateTimePattern constructor should resolve the pattern parts into a Vec or SmallVec. It will make the formatting code much more straightforward, and most of the time these patterns will be fewer than ~10 PatternItems, so they can be easily stored on the stack in a SmallVec.

@sffc sffc added the U-google User: Google 1st party label Apr 22, 2023
@sffc
Copy link
Member Author

sffc commented May 11, 2023

Discuss with:

Optional:

@sffc
Copy link
Member Author

sffc commented May 13, 2023

The new API should enable the locale-specific symbols to be pre-loaded but provide a pattern at runtime. This is not dissimilar to the approach we're taking with FixedDecimalFormatter, where we push locale-invariant settings from construction time into format time.

@sffc sffc added the discuss-triaged The stakeholders for this issue have been identified and it can be discussed out-of-band label May 25, 2023
@sffc
Copy link
Member Author

sffc commented Jun 28, 2023

Tentative conclusion:

  1. DateTimePatternInterpolator does not own a pattern, but it has Option<DataPayload> fields for all the datetime symbols
  2. It has &mut self loading functions to fill in the slots
  3. It takes either a &runtime::Pattern or a impl Iterator<Item = PatternItem> in the terminal function, returning a FormattedDateTime type return value
  4. Wait until icu_pattern is landed before implementing this.

LGTM: @sffc @zbraniecki (@robertbastian) (@Manishearth)

@sffc
Copy link
Member Author

sffc commented Oct 26, 2023

Additional notes on the design:

  • NeoDateTimeFormatter owns a DateTimePatternInterpolator and 1 or more runtime::Patterns. In the format function, it selects the correct pattern for the datetime input and returns a NeoFormattedDateTime that borrows the pattern, the input, and the interpolator data.
  • Alternative naming: DateTimeSymbolBag (for the thing that owns symbols) and DateTimePatternInterpolator (for the thing that borrows symbols and a pattern)
  • We considered adding a type that owns a single pattern and an interpolator, but its use cache is niche and it can just be served by NeoDateTimeFormatter.
  • An interesting proposition for DateTimePatternInterpolator is the following API:
struct NeoDateTimeFormatter {
    interpolator: DateTimePatternInterpolator,
    patterns: ShortVec<runtime::Pattern>
}

impl NeoDateTimeFormatter {
    pub fn format<'l>(&'l self, input: impl DateTimeInput) -> NeoFormattedDateTime<'l> {
        // First: select the correct pattern. OK to special-case when there is only a single pattern.
        self.interpolator.format_with_pattern(&pattern, input)
    }

impl DateTimePatternInterpolator {
    // One of these per DataKey:
    pub fn load_month_symbols(&mut self, provider: &P, length: FieldLength) -> Result<&mut Self> {
        // ...
    }

    // also include_month_symbols for the compiled_data version

    pub fn load_for_pattern<'a>(&'a mut self, provider: &P, pattern: &'a runtime::Pattern) -> Result<DateTimePatternInterpolatorForPattern<'a>> {
        // load the data, then pack the references into the return value
    }
    
    // compiled_data: include_for_pattern
    
    pub fn with_pattern<'a>(&'a self, pattern: &'a runtime::Pattern) -> Result<DateTimePatternInterpolatorWithPattern<'a>> {
        // returns an error if the data isn't loaded
    }

    pub fn format_with_pattern<'l>(&'l self, pattern: &'l runtime::Pattern, input: impl DateTimeInput) -> NeoFormattedDateTime<'l> { ... }
}

struct DateTimePatternInterpolatorWithPattern<'a> {
    pattern: &'a runtime::Pattern,
    interpolator: DateTimePatternInterpolatorBorrowed<'a>,
}

impl<'a> DateTimePatternInterpolatorWithPattern<'a> {
    pub fn format<'l>(&'l self, input: impl DateTimeInput) -> NeoFormattedDateTime<'l> {
        // pack the additional references into NeoFormattedDateTime
    }
}

The call site would look like this:

// Option 1: Use the same interpolator for multiple patterns (always in the same locale and calendar system)
let mut interpolator = DateTimePatternInterpolator::<Gregorian>::new(&locale!("en").into());
interpolator.load_with_pattern(provider, "M/d/y".parse().unwrap())
    .unwrap()
    .format("2023-10-25".parse().unwrap())
    // this won't fail if using an ICU4X impl of DateTimeInput:
    .write_to_string();
interpolator.load_with_pattern(provider, "MMM d, y".parse().unwrap())
    .unwrap()
    .format("2023-10-25".parse().unwrap())
    .write_to_string();

// Option 2: All in a single line (less cached data)
DateTimePatternInterpolator::<Gregorian>::new(&locale!("en").into())
    .load_with_pattern(provider, "MMM d, y".parse().unwrap())
    .unwrap()
    .format("2023-10-25".parse().unwrap())
    .write_to_string();

// Option 3: Without the intermediate type:
let mut interpolator = DateTimePatternInterpolator::<Gregorian>::new(&locale!("en").into());
let pattern = "MMM d, y".parse().unwrap();
interpolator.load_with_pattern(provider, &pattern).unwrap();
// Here we don't have a place to keep the invariant that we have loaded the correct symbols, so this needs to be fallible (with a core::fmt::Error):
interpolator.format_with_pattern(&pattern, "2023-10-25".parse().unwrap());

Any additional feedback before I implement this?

@sffc sffc added this to the 1.4 Blocking ⟨P1⟩ milestone Oct 26, 2023
@sffc sffc added discuss-priority Discuss at the next ICU4X meeting and removed discuss Discuss at a future ICU4X-SC meeting discuss-triaged The stakeholders for this issue have been identified and it can be discussed out-of-band labels Oct 26, 2023
@sffc
Copy link
Member Author

sffc commented Oct 26, 2023

  • @robertbastian - new should move the locale, or else it should borrow the locale with a lifetime
  • @sffc - This is not consistent with TimeZoneFormatter but I'm fine diverging (and perhaps making a 2.0 to fix TimeZoneFormatter)
  • @robertbastian - TZF takes a reference and clones, that's the worst solution out of these 3

@sffc
Copy link
Member Author

sffc commented Oct 26, 2023

Naming bikeshed for DateTimePatternInterpolator:

  1. DateTimeSymbols
  2. DateTimeSymbol[s]Bag
  3. DateTimeData
  4. DateTimeDataBag
  5. DateTimeSymbol[s]Cache
  6. DateTimeSymbol[s]Loader
  7. DateTimeSymbol[s]Holder
  8. DateTimePatternFormatter
  9. DateTimeSymbol[s]Store
  10. DateTimeFormatInfo
  11. DateTimePatternSymbols
  • @sffc - Not a fan of DateTimeSymbols due to ICU4C baggage with that name. Not a fan of Bag because we use that word for options bags. Not a fan of Cache because we advertise caches as something that lives outside of ICU4X. Store is okay but we use that word as a trait name in other places.
  • @robertbastian - Not a fan of Data because it's a filler word
  • @Manishearth - It should be named for what it's for, not what it's used internally. People should find it if they want to format patterns directly. The key thing is patterns, not sumbols.

Will temporarily go with:

  • DateTimePatternSymbols which owns symbols
  • DateTimePatternFormatter which borrows the symbols and the pattern

LGTM: (@sffc) (@Manishearth)

@sffc
Copy link
Member Author

sffc commented Nov 21, 2023

My PR #4204 uses the name TypedDateTimePatternInterpolator as proposed in the OP. It doesn't do everything proposed yet so the name is still open.

Some other ideas:

  • DateTimeNames
  • DateTimeSymbolData
  • DateTimeFormatterData
  • BorrowedDateTimeNamesWithPattern

@sffc
Copy link
Member Author

sffc commented Dec 20, 2023

Also we can add helper functions on DateTimeNames to get specific symbols like the AM/PM symbol.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-design Area: Architecture or design C-datetime Component: datetime, calendars, time zones U-google User: Google 1st party
Projects
Status: Being worked on
Development

No branches or pull requests

1 participant