Description
pandas.read_gbq
silently casts float values greater than about 10,000 to integers, causing a loss of precision.
Example of the issue
In[1]:
import pandas as pd
import numpy as np
df = pd.read_gbq('''select * from
    (select pi(), 9999.9, 10000.1, 200000.2),
    (select 2.72, 9999.9999999, 10000.1, 300000.3)''',
    project_id='CHANGE-ME')
Actual Output
In[2]: print(df)
Out[2]:
        f0_      f1_    f2_     f3_
0  3.141593   9999.9  10000  200000
1  2.720000  10000.0  10000  300000
In[3]: df.dtypes
Out[3]:
f0_    float64
f1_    float64
f2_      int64
f3_      int64
dtype: object
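This report does not identify the cause. One possible explanation, offered here only as an assumption, is that a downcast step compares each float column with its integer-truncated copy using a relative tolerance, the way numpy.allclose does by default (rtol=1e-05, atol=1e-08). That would be consistent with a threshold near 10000, because a difference of 0.1 falls within 1e-05 * 10000. The helper name passes_tolerance_check is hypothetical and is not part of pandas; the sketch only uses NumPy and does not call BigQuery.

```python
import numpy as np


def passes_tolerance_check(column):
    """Hypothetical helper: True if every value is 'close' to its truncation.

    This mimics an ASSUMED downcast rule; it is not taken from pandas itself.
    """
    values = np.asarray(column, dtype="float64")
    truncated = values.astype("int64")  # truncates toward zero
    # np.allclose tests |a - b| <= atol + rtol * |b|,
    # with defaults rtol=1e-05 and atol=1e-08.
    return bool(np.allclose(values, truncated))


# The literal values from the query in this report.
columns = {
    "f0_": [np.pi, 2.72],
    "f1_": [9999.9, 9999.9999999],
    "f2_": [10000.1, 10000.1],
    "f3_": [200000.2, 300000.3],
}

would_downcast = {
    name: passes_tolerance_check(vals) for name, vals in columns.items()
}
print(would_downcast)
```

Under this assumption only f2_ and f3_ pass the tolerance check, which matches the two columns reported as int64 above.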
Expected Output
In[2]: print(df)
Out[2]:
        f0_      f1_      f2_       f3_
0  3.141593   9999.9  10000.1  200000.2
1  2.720000  10000.0  10000.1  300000.3
In[3]: df.dtypes
Out[3]:
f0_    float64
f1_    float64
f2_    float64
f3_    float64
dtype: object
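Until the behaviour is fixed, a caller could guard against the silent cast by checking the dtypes of columns known to be floats. The following is a minimal sketch; non_float_columns is a hypothetical helper, and the frame is built locally from the "Actual Output" values rather than fetched with read_gbq.

```python
import pandas as pd
from pandas.api.types import is_float_dtype


def non_float_columns(df, expected_float):
    """Hypothetical helper: list the columns that should be float but are not."""
    return [col for col in expected_float if not is_float_dtype(df[col])]


# Local stand-in for the frame shown under "Actual Output".
actual = pd.DataFrame({
    "f0_": [3.141593, 2.72],
    "f1_": [9999.9, 10000.0],
    "f2_": [10000, 10000],
    "f3_": [200000, 300000],
})

lost = non_float_columns(actual, ["f0_", "f1_", "f2_", "f3_"])
print(lost)
```

Note that such a check only detects the problem. The fractional part has already been discarded by the time the frame is returned, so casting the column back to float64 would not recover it.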
Output of pd.show_versions()
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.0rc1+33.g7dedbed
nose: 1.3.7
pip: 8.1.2
setuptools: 25.1.6
Cython: 0.24.1
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.999999999
httplib2: 0.9.2
apiclient: 1.5.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None