Time-travel & Incremental API to support more timestamp formats #261

@xushiyan

Description of the improvement

Currently, Table::read_incremental_records() expects the start and end timestamps to be in the Hudi timeline format, i.e., yyyyMMddHHmmssSSS or the legacy yyyyMMddHHmmss. We should support parsing more timestamp formats.

The same goes for Table::read_snapshot_as_of() and the other *as_of() APIs.

Expected behavior

  • Support parsing epoch-time strings (in seconds, milliseconds, microseconds, or nanoseconds) and converting them to the Hudi timeline format for further processing.

  • Support parsing ISO 8601 strings such as:

    • 2019-01-23T12:34:56.123456789+00:00
    • 2019-01-23T12:34:56.123456+00:00
    • 2019-01-23T12:34:56.123+00:00
    • 2019-01-23T12:34:56+00:00
    • 2019-01-23T12:34:56Z (with the same sub-second precisions as above)
    • 2019-01-23
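
As a rough illustration of the epoch case, here is a minimal std-only sketch that normalizes an epoch string to milliseconds and renders it as yyyyMMddHHmmssSSS in UTC. All function names are hypothetical and not part of the existing API. Guessing the unit from the digit count is an assumption made for this sketch, and the date arithmetic uses the well-known days-to-civil-date algorithm rather than anything the project is known to use.

```rust
/// Hypothetical helper: guess the epoch unit from the digit count (an assumed
/// heuristic) and normalize the value to milliseconds since the Unix epoch.
fn epoch_str_to_millis(s: &str) -> Option<i64> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let value: i64 = s.parse().ok()?;
    match s.len() {
        1..=11 => value.checked_mul(1_000), // treated as seconds
        12..=14 => Some(value),             // treated as milliseconds
        15..=17 => Some(value / 1_000),     // treated as microseconds
        18..=19 => Some(value / 1_000_000), // treated as nanoseconds
        _ => None,
    }
}

/// Convert days since 1970-01-01 to a (year, month, day) civil date
/// in the proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // 0..=365
    let mp = (5 * doy + 2) / 153; // month index starting from March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Render UTC epoch milliseconds as yyyyMMddHHmmssSSS.
fn millis_to_timeline_format(ms: i64) -> String {
    let days = ms.div_euclid(86_400_000);
    let ms_of_day = ms.rem_euclid(86_400_000);
    let (year, month, day) = civil_from_days(days);
    format!(
        "{:04}{:02}{:02}{:02}{:02}{:02}{:03}",
        year,
        month,
        day,
        ms_of_day / 3_600_000,
        ms_of_day / 60_000 % 60,
        ms_of_day / 1_000 % 60,
        ms_of_day % 1_000
    )
}
```

Under these assumptions, the epoch-seconds string 1548246896 (2019-01-23T12:34:56Z) would be normalized to 1548246896000 milliseconds and rendered as 20190123123456000. Precision finer than milliseconds is truncated, since the timeline format carries only three fractional digits.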

Additional context

The conversion to the Hudi timeline format should take the Hudi table's timeline timezone config into account.
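
To make the timezone concern concrete, the following std-only sketch shifts a UTC instant into the wall-clock time of the timeline's timezone before it would be formatted. The `TimelineTimezone` enum and both function names are hypothetical. Modeling the non-UTC case as a fixed UTC offset is a simplifying assumption, since a real implementation would likely need a timezone database to handle daylight saving time.

```rust
/// Hypothetical model of the table's timeline timezone setting.
/// Assumption: the non-UTC case is represented as a fixed offset east of UTC.
enum TimelineTimezone {
    Utc,
    FixedOffset { utc_offset_secs: i32 },
}

/// Parse an ISO 8601 offset suffix ("Z", "+HH:MM", or "-HH:MM")
/// into seconds east of UTC.
fn parse_utc_offset(s: &str) -> Option<i32> {
    if s == "Z" {
        return Some(0);
    }
    let sign = match s.as_bytes().first()? {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    // The first byte is ASCII here, so slicing at index 1 is safe.
    let (hh, mm) = s[1..].split_once(':')?;
    let two_digits = |p: &str| p.len() == 2 && p.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(hh) || !two_digits(mm) {
        return None;
    }
    let hours: i32 = hh.parse().ok()?;
    let minutes: i32 = mm.parse().ok()?;
    if hours > 23 || minutes > 59 {
        return None;
    }
    Some(sign * (hours * 3_600 + minutes * 60))
}

/// Shift UTC epoch milliseconds to the wall-clock milliseconds of the
/// timeline timezone, ready to be rendered as yyyyMMddHHmmssSSS.
fn to_timeline_wall_clock_millis(utc_millis: i64, tz: &TimelineTimezone) -> i64 {
    match tz {
        TimelineTimezone::Utc => utc_millis,
        TimelineTimezone::FixedOffset { utc_offset_secs } => {
            utc_millis + i64::from(*utc_offset_secs) * 1_000
        }
    }
}
```

The point of the sketch is that the same input instant maps to different timeline strings depending on the table config, so a parsed timestamp should first be normalized to UTC (using its own offset, if it has one) and only then shifted to the timeline's timezone.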
