Open
Description
Pandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
print(pd.DataFrame({'a': 'FooXXXXX,BarXXXXX,BazXXXXX,💾,🤓🤘'.split(','), 'b': 1}))
Issue Description
The example prints the following output:
a b
0 FooXXXXX 1
1 BarXXXXX 1
2 BazXXXXX 1
3 💾 1
4 🤓🤘 1
It seems that some unicode characters are shifting the position to the right.
I have tried with different ranges (number of used bytes), and I can't find where the issue comes from.
Expected Behavior
a b
0 FooXXXXX 1
1 BarXXXXX 1
2 BazXXXXX 1
3 💾 1
4 🤓🤘 1
(well, there's also an alignement problem on github, but the "1" shall be aligned)
Installed Versions
INSTALLED VERSIONS
------------------
commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.13.2
python-bits : 64
OS : Darwin
OS-release : 22.6.0
Version : Darwin Kernel Version 22.6.0: Thu Apr 24 20:25:14 PDT 2025; root:xnu-8796.141.3.712.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 2.2.6
pytz : 2025.2
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None