Pass datetime format explicitly when parsing #34
Merged
When I get any data, I get warnings from pandas about implicit parsing of datetimes. This makes me think I've done something wrong. But after checking, it seems I haven't.
Calling `pd.to_datetime` without passing an explicit datetime format string is risky. It may work for one person while someone else has a different locale or timezone. (For example, I'm running code in France, with a French locale.) pandas may then try to parse with YYYY/DD/MM or something similarly unexpected, and the user might not notice. (I'm thinking of cases like static tables, or small files with only 1 row.)
AEMO data comes with two timestamp formats. Most are `"%Y/%m/%d %H:%M:%S"`. Some are `"%Y/%m/%d %H:%M:%S.%f"` (new 5MS bidding stuff).
I added a test for this. I confirmed that the test fails with the old code and passes with the new code. But I have not run all tests to confirm that I didn't break anything (because of #16).
I manually confirmed that the datetime columns are still datetimes.
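
For illustration only, here is a minimal sketch of parsing with an explicit format and falling back to the fractional-second variant. It is not the code in this PR: the names `FORMATS` and `parse_timestamps` are hypothetical, and it assumes a whole column shares one of the two formats.

```python
import pandas as pd

# The two timestamp layouts described above.
FORMATS = ("%Y/%m/%d %H:%M:%S", "%Y/%m/%d %H:%M:%S.%f")


def parse_timestamps(values: pd.Series) -> pd.Series:
    """Parse a column of strings, trying each known format in turn.

    Hypothetical helper: assumes every value in the column uses the same format.
    """
    for fmt in FORMATS:
        try:
            # An explicit format means pandas never has to guess day/month order.
            return pd.to_datetime(values, format=fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError("timestamps match none of the known formats")


plain = parse_timestamps(pd.Series(["2021/03/04 00:05:00"]))
fractional = parse_timestamps(pd.Series(["2021/03/04 00:05:00.250"]))
```

With the format spelled out, `2021/03/04` is always read as 4 March, even for a single-row table where inference has nothing else to go on.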
I also tidied up the exception handling. An overly broad handler is risky: a bare `except:` will catch things like a `KeyboardInterrupt`, so you try to halt your code (e.g. click the stop icon in Jupyter) but it doesn't stop. `except Exception` lets `KeyboardInterrupt` through, but it still swallows unrelated errors. Catching precisely the type of exception that will be thrown is better.
There also were some other unit tests that failed for me. I modified the assertions so the failure error is clearer. (They still fail, because of a hard-coded path in `defaults.py` that doesn't exist on my machine. Perhaps we should find a way to generalise that?)
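
As a rough illustration of the exception-handling point above (again a sketch, not the code in this PR), the snippet below catches only the `ValueError` that a failed parse raises. The helper name `parse_or_none` is hypothetical, and it uses the standard library's `datetime.strptime` to stay self-contained.

```python
from datetime import datetime
from typing import Optional


def parse_or_none(text: str, fmt: str = "%Y/%m/%d %H:%M:%S") -> Optional[datetime]:
    """Return the parsed timestamp, or None when the text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # Only the expected parsing failure is handled here. Anything else,
        # including KeyboardInterrupt, propagates to the caller.
        return None


ok = parse_or_none("2021/03/04 00:05:00")
bad = parse_or_none("not a timestamp")

# KeyboardInterrupt derives from BaseException rather than Exception, so only
# a bare `except:` or `except BaseException` would intercept it.
interrupt_is_exception = issubclass(KeyboardInterrupt, Exception)
```

Narrowing the handler this way means an unexpected error (a typo in a column name, say) surfaces immediately instead of being silently treated as a parse failure.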