feat(datetime): enhance datetime parsing and validation #2129
base: master
@@ -3,8 +3,12 @@
 import com.zendesk.maxwell.producer.MaxwellOutputConfig;

 import java.sql.Timestamp;
+import java.time.LocalDateTime;
+import java.time.format.DateTimeFormatter;
+import java.time.format.DateTimeParseException;

 public class DateTimeColumnDef extends ColumnDefWithLength {
+    private static final DateTimeFormatter DATE_TIME_FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

     private final boolean isTimestamp = getType().equals("timestamp");

@@ -19,7 +23,24 @@ public static DateTimeColumnDef create(String name, String type, short pos, Long

     protected String formatValue(Object value, MaxwellOutputConfig config) throws ColumnDefCastException {
         // special case for those broken mysql dates.
-        if ( value instanceof Long ) {
+        if ( value instanceof String) {
+            String dateString = (String) value;
+
+            if ( "0000-00-00 00:00:00".equals(dateString) ) {
+                if ( config.zeroDatesAsNull )
+                    return null;
+                else
+                    return appendFractionalSeconds("0000-00-00 00:00:00", 0, getColumnLength());
+            } else {
+                if ( !DateValidator.isValidDateTime(dateString) )

Sorry for the delay. I'm coming back to this PR since I'm prepping a release of your MariaDB stuff and some other stuff too. I'm a little worried about running a regex in a very hot code path, but I guess this only runs on bootstrapping, so it might be ok? Can you confirm that?

Sorry, just making it back to this to see where it was at. From my testing, the value instanceof String branch is only triggered through bootstrapping and not during normal processing; it goes down the Long path, at least in the debugging I was doing. If it is just bootstrapping, are you good with it, or should I look for a more efficient way to verify the Date/DateTime instead of using a regex?

Hmmm, yeah, I don't know. I had a performance benchmark set up at some point; let me see if I can measure the impact of the regex on bootstrapping.

+                    return null;
+
+                value = parseDateTime(dateString);
+                if (value == null) {
+                    return null;
+                }
+            }
+        } else if ( value instanceof Long ) {
             Long v = (Long) value;
             if ( v == Long.MIN_VALUE || (v == 0L && isTimestamp) ) {
                 if ( config.zeroDatesAsNull )
@@ -37,4 +58,12 @@ protected String formatValue(Object value, MaxwellOutputConfig config) throws Co
             throw new ColumnDefCastException(this, value);
         }
     }
+
+    private Object parseDateTime(String dateString) {
+        try {
+            return LocalDateTime.parse(dateString, DATE_TIME_FORMATTER);
+        } catch (DateTimeParseException e) {
+            return null;
+        }
+    }
 }
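
A note on how the String branch behaves, offered as a sketch rather than a statement about the rest of Maxwell: the regex in `DateValidator` treats the time part as optional, while `DATE_TIME_FORMATTER` requires it, so a date-only string passes validation and then comes back as `null` from `parseDateTime`. Separately, `DateTimeFormatter.ofPattern` defaults to the SMART resolver style, which clamps an out-of-range day such as February 31 to the last day of the month instead of rejecting it. The standalone class below (`StringPathSketch` and `parseOrNull` are hypothetical names, not part of this PR) reproduces just the parsing step so both cases can be checked in isolation.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Standalone illustration; StringPathSketch is a hypothetical name, not part of the PR.
final class StringPathSketch {
    // Same pattern as DATE_TIME_FORMATTER in the diff (default SMART resolver style).
    private static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private StringPathSketch() {}

    // Mirrors parseDateTime from the diff: null signals an unparseable string.
    static LocalDateTime parseOrNull(String dateString) {
        try {
            return LocalDateTime.parse(dateString, FORMATTER);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Well-formed date and time.
        System.out.println(parseOrNull("2023-02-15 10:30:00"));
        // Date only: accepted by the regex, but the formatter needs a time part.
        System.out.println(parseOrNull("2023-02-15"));
        // Day exceeds the month length: accepted by the regex, clamped by SMART resolution.
        System.out.println(parseOrNull("2023-02-31 10:30:00"));
    }
}
```

Whether clamping or a `null` result is the desired outcome for such bootstrap rows is a design decision for the PR; the sketch only makes the current behavior visible.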
@@ -0,0 +1,11 @@
+package com.zendesk.maxwell.schema.columndef;
+
+public class DateValidator {
+    private static final String DATE_TIME_REGEX =
+        "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])" +
+        "( (0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]))?$";
+
+    public static boolean isValidDateTime(String dateString) {
+        return dateString.matches(DATE_TIME_REGEX);
+    }
+}
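
On the regex cost raised in review: `String.matches(regex)` is documented as equivalent to `Pattern.matches(regex, input)`, which compiles the expression on every call. A minimal sketch of one possible mitigation is to compile the pattern once and reuse it. The class name `PrecompiledDateValidator` is hypothetical, and the added null check is an assumption that is not present in the PR (the PR version would throw on `null`).

```java
import java.util.regex.Pattern;

// Hypothetical variant of DateValidator; the class name is illustrative only.
final class PrecompiledDateValidator {
    // Same expression as DATE_TIME_REGEX in the diff, compiled once at class-load time.
    private static final Pattern DATE_TIME_PATTERN = Pattern.compile(
        "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
        + "( (0[0-9]|1[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]))?$");

    private PrecompiledDateValidator() {}

    static boolean isValidDateTime(String dateString) {
        // Assumption: treat null as invalid instead of throwing.
        // Matcher.matches() requires the whole input to match, like String.matches.
        return dateString != null && DATE_TIME_PATTERN.matcher(dateString).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidDateTime("2024-01-15 10:30:00"));
        System.out.println(isValidDateTime("2024-13-01 00:00:00"));
    }
}
```

This keeps the accepted inputs identical to the regex in the diff; how much it helps in practice would still need to be measured against the bootstrap workload.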
mildly, same concerns here -- is this just for bootstrapping?
I am only seeing bootstrapping taking the String path. Otherwise, during normal processing it takes the Long path.
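
If a regex-free check is preferred, one option is to let `java.time` do the validation with a strict resolver. The following is a sketch under assumptions, not a drop-in replacement: `StrictDateTimeCheck` is a hypothetical name, the pattern uses `uuuu` because STRICT resolution does not accept `yyyy` without an era, and unlike the regex in the diff it rejects date-only strings and impossible dates such as February 31. Failed parses throw and catch an exception, so its cost on malformed input would need to be benchmarked as well.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical regex-free validator; not part of the PR.
final class StrictDateTimeCheck {
    // "uuuu" (proleptic year) instead of "yyyy": STRICT mode needs an era for year-of-era.
    private static final DateTimeFormatter STRICT_FORMATTER =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")
            .withResolverStyle(ResolverStyle.STRICT);

    private StrictDateTimeCheck() {}

    static boolean isValidDateTime(String dateString) {
        if (dateString == null) {
            return false; // assumption: null is treated as invalid
        }
        try {
            LocalDateTime.parse(dateString, STRICT_FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            // Covers both malformed text and field values that do not form a real date.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidDateTime("2024-02-29 23:59:59"));
        System.out.println(isValidDateTime("2023-02-31 10:30:00"));
    }
}
```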