Member

@yuxiqian yuxiqian commented Jul 1, 2024

This is a workaround for FLINK-35615.

Currently, the MySQL source cannot obtain the exact timestamp precision of data field types, so it uses a heuristic to infer the precision from the sub-second digits of each data record, as follows:

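// DebeziumSchemaDataTypeInference.java
// The precision is guessed from the sub-second digits of a single record value.
int precision;
if (nano == 0) {
    // No fractional seconds in this value.
    precision = 0;
} else if (nano % 1000 > 0) {
    // Non-zero digits below the microsecond: nanosecond precision.
    precision = 9;
} else if (nano % 1000_000 > 0) {
    // Non-zero digits below the millisecond: microsecond precision.
    precision = 6;
} else if (nano % 1000_000_000 > 0) {
    // Any other non-zero fraction is treated as millisecond precision.
    precision = 3;
} else {
    precision = 0;
}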

Thus, the timestamp precision might be underestimated whenever a value happens to have trailing zeros in its fractional seconds. This causes a binary encoding/decoding issue: CDC stores timestamps with precision <= 3 in a compact form and allocates extra memory segments for high-precision timestamps, so a timestamp written with one precision and read with another can cause out-of-bounds (OOB) memory access.
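
To make the underestimation concrete, here is a minimal, self-contained sketch that copies the heuristic quoted above into a hypothetical `TimestampPrecisionInference` class. The class name, the method name, and the sample values are illustrative assumptions, not Flink CDC code, and the sketch assumes `nano` is the nanosecond-of-second component of the record value.

```java
import java.time.LocalDateTime;

// Hypothetical helper class; only the if/else chain is taken from the snippet above.
final class TimestampPrecisionInference {

    // Infers a precision from the nanosecond-of-second part of one value (assumed 0..999_999_999).
    static int inferPrecision(int nano) {
        int precision;
        if (nano == 0) {
            precision = 0;
        } else if (nano % 1000 > 0) {
            precision = 9;
        } else if (nano % 1000_000 > 0) {
            precision = 6;
        } else if (nano % 1000_000_000 > 0) {
            precision = 3;
        } else {
            precision = 0;
        }
        return precision;
    }

    public static void main(String[] args) {
        // Two values that could both come from a microsecond-precision column (an assumption
        // for illustration): one has trailing zeros in its fraction, the other does not.
        LocalDateTime trailingZeros = LocalDateTime.of(2024, 7, 1, 12, 0, 0, 123_000_000); // .123000
        LocalDateTime fullMicros = LocalDateTime.of(2024, 7, 1, 12, 0, 0, 123_456_000);    // .123456
        LocalDateTime wholeSecond = LocalDateTime.of(2024, 7, 1, 12, 0, 0, 0);             // .000000

        System.out.println(inferPrecision(trailingZeros.getNano())); // prints 3
        System.out.println(inferPrecision(fullMicros.getNano()));    // prints 6
        System.out.println(inferPrecision(wholeSecond.getNano()));   // prints 0
    }
}
```

In this sketch, three values from the same hypothetical column yield three different inferred precisions, which is the kind of mismatch described above.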

This PR disables the low-precision storage optimization for timestamps to ensure that binary records can always be decoded correctly. The cost is that each timestamp record now requires 12 bytes instead of 8 bytes.
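
The trade-off can be sketched with a deliberately simplified model. The code below is not Flink CDC's actual binary record format: `TimestampLayoutSketch`, its methods, and the `compactEnabled` flag are hypothetical names, and a plain `ByteBuffer` stands in for the real memory segments. It only assumes what the description states: a compact timestamp takes 8 bytes (modelled here as a `long` of epoch milliseconds), and a non-compact one takes 12 bytes (modelled as the same `long` plus an `int` of nanoseconds within the millisecond).

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Simplified, hypothetical model of compact vs. non-compact timestamp storage.
final class TimestampLayoutSketch {

    // With the optimization enabled, precision <= 3 uses the 8-byte compact layout.
    static boolean isCompact(int precision, boolean compactEnabled) {
        return compactEnabled && precision <= 3;
    }

    // Encodes a timestamp using the layout selected by the writer's view of the precision.
    static ByteBuffer write(long millis, int nanoOfMilli, int precision, boolean compactEnabled) {
        boolean compact = isCompact(precision, compactEnabled);
        ByteBuffer buf = ByteBuffer.allocate(compact ? 8 : 12);
        buf.putLong(millis);
        if (!compact) {
            buf.putInt(nanoOfMilli);
        }
        buf.flip();
        return buf;
    }

    // Decodes a timestamp using the layout selected by the reader's view of the precision.
    static long[] read(ByteBuffer buf, int precision, boolean compactEnabled) {
        long millis = buf.getLong();
        int nanoOfMilli = isCompact(precision, compactEnabled) ? 0 : buf.getInt();
        return new long[] {millis, nanoOfMilli};
    }

    public static void main(String[] args) {
        // Optimization enabled: writer infers precision 3, reader expects precision 6.
        ByteBuffer compact = write(1_719_835_200_123L, 0, 3, true);
        try {
            read(compact, 6, true);
        } catch (BufferUnderflowException e) {
            System.out.println("mismatched precision: read past the end of the record");
        }

        // Optimization disabled: both sides always use the 12-byte layout.
        ByteBuffer wide = write(1_719_835_200_123L, 0, 3, false);
        System.out.println(wide.remaining()); // prints 12
        long[] decoded = read(wide, 6, false);
        System.out.println(decoded[0] == 1_719_835_200_123L); // prints true
    }
}
```

In this model, once the compact layout is never chosen, the writer and reader agree on the record size regardless of which precision each side believes, at the price of 4 extra bytes per timestamp.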


cc @Jiabao-Sun @loserwang1024

@yuxiqian yuxiqian marked this pull request as draft July 5, 2024 09:54
Member Author

yuxiqian commented Aug 8, 2024

Addressed in #3511.

@yuxiqian yuxiqian closed this Aug 8, 2024