PERF: enable caching on expensive offsets #17914

Closed
@chris-b1

xref #16463

We have caching logic for generating ranges of offsets that appears to be unused but still functional. Example:

In [63]: BM = pd.offsets.BusinessMonthEnd()

In [64]: %timeit pd.date_range('1990-01-01', '2020-01-01', freq=BM)
18.1 ms ± 366 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [65]: BM._cacheable
Out[65]: False

In [66]: BM._cacheable = True

In [67]: %timeit pd.date_range('1990-01-01', '2020-01-01', freq=BM)
686 µs ± 17.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [68]: a = pd.date_range('1990-01-01', '2020-01-01', freq='BM')

In [69]: b = pd.date_range('1990-01-01', '2020-01-01', freq=BM)

In [70]: (a == b).all()
Out[70]: True

Should we enable this by default, or expose an API for it? It would help most with the slower offsets.
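To make the idea concrete, here is a minimal standalone sketch of range caching, using only the standard library. It is not the pandas implementation and does not touch `_cacheable`; the names `business_month_end` and `business_month_end_range` are hypothetical, and holidays are ignored. It only shows the general shape: compute an expensive range once, keyed on its endpoints, and hand back the same immutable result on repeat calls.

```python
import calendar
from datetime import date, timedelta
from functools import lru_cache


def business_month_end(year: int, month: int) -> date:
    """Return the last weekday (Mon-Fri) of the month; holidays are ignored."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    while d.weekday() >= 5:
        d -= timedelta(days=1)
    return d


@lru_cache(maxsize=32)
def business_month_end_range(start: date, end: date) -> tuple:
    """Hypothetical cached range: all business month ends in [start, end]."""
    out = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        d = business_month_end(year, month)
        if start <= d <= end:
            out.append(d)
        # Advance to the next calendar month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    # A tuple is immutable, so sharing the cached object between callers is safe.
    return tuple(out)


dates = business_month_end_range(date(1990, 1, 1), date(2020, 1, 1))
again = business_month_end_range(date(1990, 1, 1), date(2020, 1, 1))  # cache hit
```

The second call returns the cached tuple without recomputing anything, which is the same kind of saving the timings above show. Any real API would also need to decide on cache size and invalidation, which this sketch leaves out.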

cc @jbrockmendel

Labels: Datetime, Frequency, Performance
