PERF/CLN: see what we can use from offsets.pyx #11214
Comments
cc @chris-b1 |
Also looks like a nice place to put some cythonized code from |
I didn't find a whole lot useful in the existing code; it uses a slightly different definition of an offset, basically stored as a day ordinal from what I can tell. I did scratch out what the current But it'd take quite a bit of work to actually port everything over in a feature-complete way, and we would have to worry about things like pickle compat. So maybe it makes more sense to define some targeted helper cython functions and attach them to the existing classes or something?
In [1]: from pandas.tseries.offsets import DateOffset, DateOffset2
...: old_do = DateOffset(months=1)
...: new_do = DateOffset2(months=1)
In [2]: ts = pd.Timestamp('2014-1-1')
In [3]: %%timeit
...: for _ in xrange(10000):
...:     old_do.apply(ts)
10 loops, best of 3: 143 ms per loop
In [4]: %%timeit
...: for _ in xrange(10000):
...:     new_do.apply(ts)
100 loops, best of 3: 13.5 ms per loop
In [5]: dti = pd.date_range('1900-1-1', periods=10000)
In [6]: %timeit old_do.apply_index(dti)
1000 loops, best of 3: 1 ms per loop
In [7]: %timeit new_do.apply_index(dti)
1000 loops, best of 3: 1.02 ms per loop |
@chris-b1 right, your helpers could be used when advancing more than one date, rather than doing it in a Python loop. So I will repurpose this issue. If you have a chance, can you put up a list of things that should be targeted, and I will put up checkboxes. |
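For readers following along, here is a rough pure-Python sketch of the kind of small, targeted helper being discussed: shift a date by a number of months, clipping the day to the length of the target month. The names `shift_months` and `shift_months_many` are hypothetical and are not taken from pandas or from `offsets.pyx`. The code uses only the standard library and is meant to illustrate the logic that a compiled helper could take over, not the actual implementation.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift d by `months` months, clipping the day to the target month's length.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Count months from year 0 so divmod handles year rollover,
    # including negative shifts.
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # monthrange returns (weekday of the first day, number of days in the month).
    days_in_month = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, days_in_month))


def shift_months_many(dates, months):
    """Apply shift_months to each date.

    This per-element Python loop is the part a compiled helper would replace.
    """
    return [shift_months(d, months) for d in dates]
```

The arithmetic is done on plain integers (year, month, day), which is what makes this sort of routine a plausible candidate for moving into Cython and attaching to the existing classes without changing their public interface.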
Pretty sure this can be closed; the referenced file is gone. |
@jbrockmendel Can you point to the commit where it was removed/moved? |
xref #11205
https://github.com/pydata/pandas/blob/master/pandas/src/offsets.pyx
is in the repo but is not included in the build, and has not been updated in 2+ years.
Looks like there might be some cython code to speed up DateOffset apply in the non-vectorized cases.
Further, we might look to move some routines from tslib.pyx for these types of things (and obviously it would need to be included in the build).