BUG: support decimal keyword for Float64Dtype in read_csv #52086
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import io
pd.read_csv(io.StringIO('id\n"1,5"\n'), dtype={'id':pd.Float64Dtype()}, sep=';', decimal=',')
or
import pandas as pd
pd.read_csv("./test.csv", dtype={'id':pd.Float64Dtype()}, sep=';', decimal=',')
where test.csv looks like
id
1,5
Issue Description
I have a semicolon-separated CSV file containing floats that use a comma as the decimal separator.
With that setup, read_csv fails when I specify the dtype for that column as Float64Dtype().
When I specify Python's float as the dtype instead, it works.
When the file uses . as the decimal separator and I specify decimal='.', it also works.
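Since plain float works as described above, one possible interim workaround is to parse the column as float and convert it to the nullable dtype afterwards. This is only a sketch under that assumption, not a fix for the underlying bug; the two-column sample data and the column name value are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-separated, comma as decimal separator.
csv_text = "id;value\n1;1,5\n2;2,25\n"

# Step 1: parse with the plain Python float dtype, which the report says works
# together with decimal=','.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"value": float},
    sep=";",
    decimal=",",
)

# Step 2: convert the parsed column to the nullable Float64 extension dtype.
df = df.astype({"value": "Float64"})

print(df.dtypes)
print(df)
```

The conversion happens after parsing, so the decimal handling is done entirely by the float path and Float64Dtype() never sees the comma-formatted strings.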
Expected Behavior
The listed example should parse successfully and produce the float value 1.5.
Installed Versions
/home/robbert/.local/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit : 2e218d1
python : 3.10.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.19.0-35-generic
Version : #36~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 17 15:17:25 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.3
numpy : 1.23.5
pytz : 2022.1
dateutil : 2.8.1
setuptools : 65.5.1
pip : 23.0.1
Cython : None
pytest : 7.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.5
jinja2 : 3.1.2
IPython : 8.6.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.1
pandas_gbq : None
pyarrow : 10.0.1
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : 1.4.46
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : 2022.7