Skip to content

Commit

Permalink
format-dates: Return fully masked dates for empty date fields
Browse files Browse the repository at this point in the history
If a date field is empty, it should be completely masked to indicate
that the date is unknown.

Resolves <#1507>
  • Loading branch information
joverlee521 committed Jul 1, 2024
1 parent ea69f00 commit 06030e8
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 8 deletions.
12 changes: 11 additions & 1 deletion augur/curate/format_dates.py
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,10 @@ def format_date(date_string, expected_formats):
>>> expected_formats = ['%Y', '%Y-%m', '%Y-%m-%d', '%Y-%m-%dT%H:%M:%SZ', '%m-%d']
>>> format_date("", expected_formats)
'XXXX-XX-XX'
>>> format_date(" ", expected_formats)
'XXXX-XX-XX'
>>> format_date("01-01", expected_formats)
'XXXX-XX-XX'
>>> format_date("2020", expected_formats)
Expand All @@ -133,6 +137,10 @@ def format_date(date_string, expected_formats):
'2020-01-15'
"""

date_string = date_string.strip()
if date_string == '':
return 'XXXX-XX-XX'

for date_format in expected_formats:
try:
parsed_date = datetime.strptime(date_string, date_format)
Expand Down Expand Up @@ -180,7 +188,9 @@ def run(args, records):
for field in args.date_fields:
date_string = record.get(field)

if not date_string:
# TODO: This should raise an error if the expected date field does
# not exist in the the record
if date_string is None:
continue

formatted_date_string = format_date(date_string, args.expected_date_formats)
Expand Down
11 changes: 4 additions & 7 deletions tests/functional/curate/cram/format-dates/empty-date-field.t
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,16 @@ Setup

$ export AUGUR="${AUGUR:-$TESTDIR/../../../../../bin/augur}"

Test empty date value.
This currently has the unexpected behavior of returning the empty string.
Test empty date value, which should be returned as a fully masked date.

$ echo '{"record": 1, "date": ""}' \
> | ${AUGUR} curate format-dates \
> --date-fields "date"
{"record": 1, "date": ""}
{"record": 1, "date": "XXXX-XX-XX"}

Test whitespace only date value.
This currently raises an error.
Test whitespace only date value, which should be returned as a fully masked date.

$ echo '{"record": 1, "date": " "}' \
> | ${AUGUR} curate format-dates \
> --date-fields "date"
ERROR: Unable to format date string ' ' in field 'date' of record 0.
[2]
{"record": 1, "date": "XXXX-XX-XX"}

0 comments on commit 06030e8

Please sign in to comment.