Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd

rs = np.random.RandomState(0)
raw = pd.DataFrame(
    rs.randint(1000, size=(10, 8)), columns=["col" + str(i + 1) for i in range(8)]
)
raw.index = pd.date_range("2020-1-1", periods=10)
raw.columns = pd.date_range("2020-3-1", periods=8)
print(raw)
"""result is
2020-03-01 2020-03-02 2020-03-03 2020-03-04 2020-03-05 2020-03-06 2020-03-07 2020-03-08
2020-01-01 684 559 629 192 835 763 707 359
2020-01-02 9 723 277 754 804 599 70 472
2020-01-03 600 396 314 705 486 551 87 174
2020-01-04 600 849 677 537 845 72 777 916
2020-01-05 115 976 755 709 847 431 448 850
2020-01-06 99 984 177 755 797 659 147 910
2020-01-07 423 288 961 265 697 639 544 543
2020-01-08 714 244 151 675 510 459 882 183
2020-01-09 28 802 128 128 932 53 901 550
2020-01-10 488 756 273 335 388 617 42 442
"""
print(raw.shift(periods=2, freq="D", axis=1))
"""result is
2020-03-01 2020-03-02 2020-03-03 2020-03-04 2020-03-05 2020-03-06 2020-03-07 2020-03-08
2020-01-01 NaN NaN 684 559 629 192 835 763
2020-01-02 NaN NaN 9 723 277 754 804 599
2020-01-03 NaN NaN 600 396 314 705 486 551
2020-01-04 NaN NaN 600 849 677 537 845 72
2020-01-05 NaN NaN 115 976 755 709 847 431
2020-01-06 NaN NaN 99 984 177 755 797 659
2020-01-07 NaN NaN 423 288 961 265 697 639
2020-01-08 NaN NaN 714 244 151 675 510 459
2020-01-09 NaN NaN 28 802 128 128 932 53
2020-01-10 NaN NaN 488 756 273 335 388 617
"""
print(raw.shift(periods=2, freq="D", axis=1, fill_value=0))
"""result is
2020-03-03 2020-03-04 2020-03-05 2020-03-06 2020-03-07 2020-03-08 2020-03-09 2020-03-10
2020-01-01 684 559 629 192 835 763 707 359
2020-01-02 9 723 277 754 804 599 70 472
2020-01-03 600 396 314 705 486 551 87 174
2020-01-04 600 849 677 537 845 72 777 916
2020-01-05 115 976 755 709 847 431 448 850
2020-01-06 99 984 177 755 797 659 147 910
2020-01-07 423 288 961 265 697 639 544 543
2020-01-08 714 244 151 675 510 459 882 183
2020-01-09 28 802 128 128 932 53 901 550
2020-01-10 488 756 273 335 388 617 42 442
"""
Issue Description
According to the documentation, when freq is specified, only the labels along the given axis are shifted, while the data stay in place. With axis=0 this works as documented. With axis=1, however, the data are shifted as if freq had not been specified, and the column labels are left unchanged. Oddly, when fill_value is also supplied, the freq argument is honored: the column labels are shifted and the data are kept. This behavior is not documented in https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.shift.html.
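For contrast, here is a minimal sketch of the axis=0 case that works as documented: with freq given, only the index labels move and the data are untouched. The variable name shifted_rows is illustrative and not part of the original report.

```python
import numpy as np
import pandas as pd

rs = np.random.RandomState(0)
raw = pd.DataFrame(
    rs.randint(1000, size=(10, 8)),
    index=pd.date_range("2020-1-1", periods=10),
    columns=pd.date_range("2020-3-1", periods=8),
)

# Default axis=0 with freq: the index labels are moved forward by two days.
shifted_rows = raw.shift(periods=2, freq="D")

# The first index label is no longer 2020-01-01.
print(shifted_rows.index[0])
# The underlying values are identical to the original frame.
print((shifted_rows.to_numpy() == raw.to_numpy()).all())
# The column labels are not affected by a row-wise shift.
print(shifted_rows.columns.equals(raw.columns))
```

The axis=1 call would be expected to mirror this, with the roles of index and columns swapped.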
Expected Behavior
shift should behave the same for both axis values, whether or not fill_value is supplied.
Because the change might break existing behavior, it would be better to make the correction starting with 1.5.0.
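The expected axis=1 result can be spelled out without calling DataFrame.shift at all, by shifting the column labels directly with DatetimeIndex.shift. This is only a sketch of the expected output and a possible workaround, not a proposed fix; the variable name expected is illustrative.

```python
import numpy as np
import pandas as pd

rs = np.random.RandomState(0)
raw = pd.DataFrame(
    rs.randint(1000, size=(10, 8)),
    index=pd.date_range("2020-1-1", periods=10),
    columns=pd.date_range("2020-3-1", periods=8),
)

# Shift only the column labels by two days; the data stay where they are.
expected = raw.copy()
expected.columns = raw.columns.shift(2, freq="D")

# Column labels now start two days later than in raw.
print(expected.columns[0], expected.columns[-1])
# The values are unchanged.
print((expected.to_numpy() == raw.to_numpy()).all())
```

This matches the third output in the reproducible example, where fill_value=0 was passed.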
Installed Versions
INSTALLED VERSIONS
commit : 4bfe3d0
python : 3.9.12.final.0
python-bits : 64
OS : Darwin
OS-release : 21.4.0
Version : Darwin Kernel Version 21.4.0: Fri Mar 18 00:45:05 PDT 2022; root:xnu-8020.101.4~15/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : zh_CN.UTF-8
pandas : 1.4.2
numpy : 1.21.2
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 61.2.0
Cython : 0.29.24
pytest : 6.2.4
hypothesis : 6.46.5
sphinx : 4.4.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.1.1
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.4
brotli :
fastparquet : None
fsspec : None
gcsfs : None
markupsafe : 2.0.1
matplotlib : 3.5.0
numba : None
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 8.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
snappy : None
sqlalchemy : 1.4.27
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None