Closed
Description
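Performance of Series.xs() when getting values at a specified index on the first level of a large dataset seems to have regressed significantly after upgrading pandas from v0.25.3 to v1.0.5 (about 571 µs before versus about 683 ms after in the timings below). Lookups on the second level are unchanged. DataFrame.xs() shows the same regression.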
code:
import numpy as np
import pandas as pd
n = 1e4
data = pd.Series(np.arange(n**2), index=pd.MultiIndex.from_product([np.arange(n), -np.arange(n)]))
%timeit data.xs(100, level=0)  # first level
%timeit data.xs(-100, level=1)  # second level
outcome
## 0.25.3
%timeit data.xs(100, level=0)
571 µs ± 1.79 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit data.xs(-100, level=1)
193 ms ± 947 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
## 1.0.5
%timeit data.xs(100, level=0)
683 ms ± 3.35 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit data.xs(-100, level=1)
197 ms ± 2.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
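
For anyone who wants to compare versions outside IPython, here is a minimal sketch of the same measurement using the standard-library timeit module instead of the %timeit magic. The helper names build_series and time_xs are made up for this example and are not part of pandas. The default size is smaller than the 1e4 used above so that it runs quickly, which means the absolute numbers will not match the ones reported here.

```python
import timeit

import numpy as np
import pandas as pd


def build_series(n):
    """Build a two-level Series shaped like the one in the report (hypothetical helper)."""
    # The report uses n=1e4 (a float), so its index levels are floats; mirror that here.
    levels = np.arange(n, dtype=float)
    index = pd.MultiIndex.from_product([levels, -levels])
    return pd.Series(np.arange(n * n, dtype=float), index=index)


def time_xs(data, key, level, repeat=3, number=5):
    """Return the best per-call time in seconds for data.xs(key, level=level)."""
    timer = timeit.Timer(lambda: data.xs(key, level=level))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    data = build_series(1000)  # smaller than the report's 1e4 to keep the run short
    print("pandas", pd.__version__)
    print("level=0:", time_xs(data, 100, 0))   # first level
    print("level=1:", time_xs(data, -100, 1))  # second level
```

Running this script once under each pandas version and comparing the two level=0 lines should show whether the slowdown reproduces in a given environment.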