pd.testing.assert_frame_equal doesn't do precision according to the doc #25068

Closed
@stas00

Code Sample, a copy-pastable example if possible

import pandas as pd
import pandas.testing
df1 = pd.DataFrame([0.00016,                -0.154526,            -0.20580199999999998])
df2 = pd.DataFrame([0.00015981824253685772, -0.15452557802200317, -0.20580188930034637])
pd.testing.assert_frame_equal(df1, df2, check_exact=False, check_less_precise=3)

Problem description

This raises an AssertionError, even though every pair of values is identical in the first 3 digits after the decimal point:

AssertionError: DataFrame.iloc[:, 0] are different

DataFrame.iloc[:, 0] values are different (33.33333 %)
[left]:  [0.00016, -0.154526, -0.20580199999999998]
[right]: [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]

The assertion passes if check_less_precise=2 is used instead, so something is not right here. Is there some kind of rounding issue?
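One reading that would be consistent with both observations is that the comparison is relative rather than absolute, i.e. the ratio of the two values is checked against 1 instead of the raw difference being checked against a fixed number of decimal places. The sketch below is only an illustration of that hypothesis in plain Python. The helpers `absolute_close` and `relative_close` are hypothetical names made up for this example and are not pandas functions, and the tolerance `0.5 * 10**-decimal` is an assumption.

```python
# Values from the report above.
left = [0.00016, -0.154526, -0.20580199999999998]
right = [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]


def absolute_close(a, b, decimal):
    """Hypothetical check: the difference is below half a unit in digit `decimal`."""
    return abs(a - b) < 0.5 * 10 ** -decimal


def relative_close(a, b, decimal):
    """Hypothetical check: the ratio b / a is within the same tolerance of 1."""
    return abs(1 - b / a) < 0.5 * 10 ** -decimal


for decimal in (2, 3, 4):
    abs_ok = [absolute_close(a, b, decimal) for a, b in zip(left, right)]
    rel_ok = [relative_close(a, b, decimal) for a, b in zip(left, right)]
    print(decimal, "absolute:", abs_ok, "relative:", rel_ok)
```

Under the absolute reading all three pairs pass for 2, 3 and 4 digits. Under the relative reading only the first pair fails at 3 digits (the ratio is off from 1 by about 0.0011, above the 0.0005 tolerance), which would match the reported "33.33333 %", and everything passes at 2 digits.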

Doc:

check_less_precise : bool or int, default False

Specify comparison precision. Only used when check_exact is False.
5 digits (False) or 3 digits (True) after decimal points are compared.
If int, then specify the digits to compare

I understand the doc says check_less_precise defines how many digits after the decimal point are compared.

Unrelated: the doc should probably say "decimal point" (singular), since there is only one. Also, "specify the digits to compare" is vague; perhaps "If int, then specify how many digits after the decimal point to compare"?

Here is a proposed updated doc entry:

Specify comparison precision. Only used when check_exact is False. int: How many digits after the decimal point to compare, False: 5 digits, True: 3 digits.

Expected Output

No assertion error for check_less_precise up to 4 in this example, since the numbers only start to diverge at digit 5.

It is also still unclear whether rounding is performed or not.
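To make the rounding question concrete, here is a small sketch that finds the first digit count at which the two columns stop matching, once with truncation and once with rounding. It only illustrates the two interpretations of "digits after the decimal point" and makes no claim about what pandas does internally. `to_digits` and `first_mismatch` are hypothetical helpers written for this example.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

# Values from the report above.
left = [0.00016, -0.154526, -0.20580199999999998]
right = [0.00015981824253685772, -0.15452557802200317, -0.20580188930034637]


def to_digits(value, digits, rounding):
    """Hypothetical helper: keep `digits` digits after the decimal point."""
    step = Decimal(1).scaleb(-digits)  # digits=3 gives Decimal('0.001')
    return Decimal(repr(value)).quantize(step, rounding=rounding)


def first_mismatch(rounding, max_digits=10):
    """Smallest digit count at which any pair of values stops matching."""
    for digits in range(1, max_digits + 1):
        for a, b in zip(left, right):
            if to_digits(a, digits, rounding) != to_digits(b, digits, rounding):
                return digits
    return None


print("truncation:", first_mismatch(ROUND_DOWN))
print("rounding:  ", first_mismatch(ROUND_HALF_EVEN))
```

With truncation the first mismatch is at 5 digits (0.00016 vs 0.00015), which is the "diverge at digit 5" above. With rounding the values keep matching through 6 digits and first differ at 7, so the two interpretations give different answers for the same data.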

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-43-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_CA.UTF-8
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.24.0
pytest: 4.0.2
pip: 19.0.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: 4.7.1
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels: Closing Candidate (may be closeable, needs more eyeballs); Testing (pandas testing functions or related to the test suite)
