@jbrockmendel jbrockmendel commented Nov 18, 2023

Perf impact is pretty negligible compared to the cost of going through dateutil:

import pandas as pd
import numpy as np

dtstr = "2016/01/02 03:04:05.001000 UTC"

%timeit pd.Timestamp(dtstr)
100 µs ± 3.05 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)  # <- PR
109 µs ± 10.1 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)  # <- main

vals = np.array([dtstr] * 10**5, dtype=object)
%timeit pd.to_datetime(vals)
8.37 ms ± 134 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- PR
9.1 ms ± 597 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- main

%timeit pd.to_datetime(vals, format="mixed")
8.29 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- PR
8.13 ms ± 63.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- main

return ret


cdef object _reso_pattern = re.compile(r"\d:\d{2}:\d{2}\.(?P<frac>\d+)")
Member

Are the first \ds always guaranteed to be separated by :?

Member Author

dateutil supports some really weird formats (@MarcoGorelli and I have discussed moving away from using it at all), so I don't know. But I think this covers the vast majority of cases we care about.
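
To make the question above concrete, here is a small sketch of what the pattern from the diff does and does not match. The regex is copied from the diff. The helpers `frac_digits` and `guess_unit` are hypothetical names introduced for illustration, and the mapping from digit count to unit is an assumption, not necessarily the logic pandas uses.

```python
import re

# Same pattern as in the diff: a digit, ":MM:SS", a dot, then fractional digits.
_reso_pattern = re.compile(r"\d:\d{2}:\d{2}\.(?P<frac>\d+)")


def frac_digits(dtstr):
    """Return the number of fractional-second digits, or None if no match."""
    m = _reso_pattern.search(dtstr)
    if m is None:
        return None
    return len(m.group("frac"))


def guess_unit(dtstr):
    """Hypothetical digit-count-to-unit mapping (an assumption for illustration)."""
    n = frac_digits(dtstr)
    if n is None:
        return "s"  # no fractional seconds found by the pattern
    if n <= 3:
        return "ms"
    if n <= 6:
        return "us"
    return "ns"


print(frac_digits("2016/01/02 03:04:05.001000 UTC"))  # 6
print(frac_digits("2016/01/02 03:04:05 UTC"))         # None
print(frac_digits("2016/01/02 030405.001"))           # None (no ":" separators)
```

The last call shows the gap the reviewer is asking about: a time written without `:` separators carries fractional digits that the pattern does not see, so a caller relying on it alone would fall back to its default.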

@mroeschke mroeschke added the Non-Nano datetime64/timedelta64 with non-nanosecond resolution label Nov 19, 2023
@mroeschke mroeschke added this to the 2.2 milestone Nov 20, 2023
@mroeschke mroeschke merged commit 92fa9ca into pandas-dev:main Nov 20, 2023
@mroeschke
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the bug-ts-unit branch November 20, 2023 17:56
phofl pushed a commit to phofl/pandas that referenced this pull request Nov 21, 2023
* BUG: nanoseconds and reso in dateutil paths

* GH ref
Development

Successfully merging this pull request may close these issues.

BUG: inferred Timestamp unit with dateutil paths
