Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flink Support for TIMESTAMP_NANOS #11348

Closed
wants to merge 9 commits into from
Closed

Conversation

rodmeneses
Copy link
Contributor

@rodmeneses rodmeneses commented Oct 17, 2024

Supports the new TIMESTAMP_NANOS (v3 format) for Flink, including watermark generation

@rodmeneses rodmeneses changed the title Flink Support for TIMESTAMP_NANOS data type Flink Support for TIMESTAMP_NANOS data type for PARQUET Oct 17, 2024
@pvary
Copy link
Contributor

pvary commented Oct 18, 2024

Do the schema conversions handle the nano ts correctly? Do we get a Timestamp(9) when we convert to Flink schema, or a nano timestamp when we convert from the Iceberg schema?

@rodmeneses rodmeneses changed the title Flink Support for TIMESTAMP_NANOS data type for PARQUET Flink Support for TIMESTAMP_NANOS Oct 21, 2024
@rodmeneses rodmeneses marked this pull request as draft October 21, 2024 21:38
@github-actions github-actions bot added the core label Oct 21, 2024
@rodmeneses rodmeneses marked this pull request as ready for review October 22, 2024 21:01
@rodmeneses
Copy link
Contributor Author

Do the schema conversions handle the nano ts correctly? Do we get a Timestamp(9) when we convert to Flink schema, or a nano timestamp when we convert from the Iceberg schema?

👍🏻

case TIMESTAMP_NANO:
Types.TimestampNanoType timestampNano = (Types.TimestampNanoType) primitive;
if (timestampNano.shouldAdjustToUTC()) {
// NANOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be consistent with current comments already present for MICROS, ie:

 if (timestamp.shouldAdjustToUTC()) {
          // MICROS
          return new LocalZonedTimestampType(6);
        } else {
          // MICROS
          return new TimestampType(6);
        }

// NANOS
return new LocalZonedTimestampType(9);
} else {
// NANOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls. remove the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see above

@Override
public TimestampData read(TimestampData ignored) {
long value = readLong();
return TimestampData.fromInstant(Instant.ofEpochSecond(0, value));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could we cross check, that the Spark write/Flink read and Flink write/Spark read is correctly working?
Are we sure that this is the expected data format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the generic java writer tests?


LocalDateTime localDateTime = (LocalDateTime) r.get(0);
minValues.merge(
"timestamp_column", localDateTime.toInstant(ZoneOffset.UTC).toEpochMilli(), Math::min);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only comparing millis - shall we do a testing with nanos?

@pvary
Copy link
Contributor

pvary commented Oct 23, 2024

There are several tests which are using DataGenerator to generate test data - do we want to add nanos to them?

Copy link

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Nov 24, 2024
Copy link

github-actions bot commented Dec 1, 2024

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Dec 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants