Description
Considering this function of DateTime
class:
intelmq/intelmq/lib/harmonization.py
Lines 421 to 436 in 122e3a3
In particular the line:
value = value.astimezone(pytz.utc)
This causes that naive time strings (without timezone information) are being considered as being in a timezone of the local computer. My computer is set to a UTC +01:00 timezone and this is the result:
DateTime.sanitize("2021-01-17 07:44:46") # this calls DateTime.__parse
'2021-01-17T06:44:46+00:00'
Because the parsed timestamps are usually produced by feeds outside of the local computer I believe it makes much more sense to assume they are produced in UTC rather than processing them based on whatever is the local timezone of the IntelMQ installation.
On a side note: There are several cases of this kind of code:
if not value.tzinfo and sys.version_info <= (3, 6):
value = pytz.utc.localize(value)
elif not value.tzinfo:
value = value.astimezone(pytz.utc)
Such expressions are not equal in what they do. The first one (pytz.utc.localize
) merely attaches a time zone object to a datetime without adjustment of date and time data. The second one (astimezone(pytz.utc)
) adjusts time and date based on local time (docs). Example (with the computer in UTC +01:00):
value = datetime.datetime.strptime("2021-01-17 07:44:46", "%Y-%m-%d %H:%M:%S")
value.isoformat()
'2021-01-17T07:44:46'
pytz.utc.localize(value).isoformat()
'2021-01-17T07:44:46+00:00'
value.astimezone(pytz.utc).isoformat()
'2021-01-17T06:44:46+00:00'
I think this is also the reason why the following test works only when the machine running the test is in UTC timezone:
intelmq/intelmq/tests/lib/test_harmonization.py
Lines 252 to 262 in 122e3a3
PROPOSED SOLUTION:
I think the better way of handling of naive time strings is to assume they are in UTC rather than local timezone of IntelMQ installation:
if not value.tzinfo:
value = value.replace(tzinfo=datetime.timezone.utc)
else:
value = value.astimezone(tz=datetime.timezone.utc)
The function value.replace(tzinfo=datetime.timezone.utc)
does effectively the same thing as pytz.utc.localize(value)
: it attaches a time zone object to a datetime without adjustment of date and time data. And without needing pytz as dependency.
Quick tests showed that my solution seems to work and passes all the tests including the one previously skipped.
Activity