ENH: Split assert_frame_equal's check_exact flag for floats and integers #54861

Open
@kcwerther

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Consider the following three DataFrames:
reference_df = pd.DataFrame([{"id": 8000000, "value": 0.000000123}])
correct_df = pd.DataFrame([{"id": 8000000, "value": 0.000000124}])
incorrect_df = pd.DataFrame([{"id": 8000001, "value": 0.000000123}])

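I would like to be able to use assert_frame_equal with the following results:
assert_frame_equal(reference_df, correct_df) passes
assert_frame_equal(reference_df, incorrect_df) fails

A single check_exact flag cannot produce both results. With check_exact=False both assertions pass, because the id difference of 1 is within the relative tolerance for values near 8000000; with check_exact=True both fail, because the float values are not identical.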
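
As a rough check of how the existing single check_exact flag treats the three frames, the sketch below runs both comparisons with the flag set explicitly to False and then to True. The passes helper is only an illustration and is not part of pandas; rtol and atol are left at their defaults.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

reference_df = pd.DataFrame([{"id": 8000000, "value": 0.000000123}])
correct_df = pd.DataFrame([{"id": 8000000, "value": 0.000000124}])
incorrect_df = pd.DataFrame([{"id": 8000001, "value": 0.000000123}])


def passes(left, right, **kwargs):
    """Illustrative helper: True if assert_frame_equal accepts the pair."""
    try:
        assert_frame_equal(left, right, **kwargs)
    except AssertionError:
        return False
    return True


# For each explicit flag value: (reference vs correct, reference vs incorrect)
results = {
    flag: (
        passes(reference_df, correct_df, check_exact=flag),
        passes(reference_df, incorrect_df, check_exact=flag),
    )
    for flag in (False, True)
}
print(results)
```

Neither setting gives the desired (True, False) pair: the tolerant comparison accepts the wrong id, and the exact comparison rejects the slightly different float.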

Feature Description

I would like to split the check_exact flag so that exactness can be set independently for floats and integers, with separate tolerance values for each. The new parameters and their proposed defaults would be:
check_exact_float=False
rtol_float=1e-5
atol_float=1e-5
check_exact_int=False
rtol_int=1e-5
atol_int=1

With these new flags, the above problem would be solved with the following:
assert_frame_equal(reference_df, correct_df, check_exact_int=True) passes
assert_frame_equal(reference_df, incorrect_df, check_exact_int=True) fails
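
To make the requested semantics concrete, here is a minimal sketch of a wrapper that applies the proposed per-dtype flags column by column. The function name assert_frame_equal_split is hypothetical and not part of pandas, and the keyword names simply mirror the proposal above. It assumes unique column names and compares the row index exactly up front, which is a simplification of my own rather than part of the proposal.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype
from pandas.testing import assert_index_equal, assert_series_equal


def assert_frame_equal_split(
    left, right, *,
    check_exact_float=False, rtol_float=1e-5, atol_float=1e-5,
    check_exact_int=False, rtol_int=1e-5, atol_int=1,
):
    """Hypothetical helper (not pandas API): per-dtype exactness for columns."""
    # Assumption: column labels and the row index must match exactly.
    assert_index_equal(left.columns, right.columns)
    assert_index_equal(left.index, right.index)
    for col in left.columns:
        dtype = left[col].dtype
        if is_integer_dtype(dtype):
            exact, rtol, atol = check_exact_int, rtol_int, atol_int
        elif is_float_dtype(dtype):
            exact, rtol, atol = check_exact_float, rtol_float, atol_float
        else:
            # Non-numeric columns fall back to the default comparison.
            assert_series_equal(left[col], right[col], check_index=False)
            continue
        if exact:
            assert_series_equal(
                left[col], right[col], check_exact=True, check_index=False
            )
        else:
            assert_series_equal(
                left[col], right[col], check_exact=False,
                rtol=rtol, atol=atol, check_index=False,
            )


reference_df = pd.DataFrame([{"id": 8000000, "value": 0.000000123}])
correct_df = pd.DataFrame([{"id": 8000000, "value": 0.000000124}])
incorrect_df = pd.DataFrame([{"id": 8000001, "value": 0.000000123}])

# Exact integers, tolerant floats: this comparison is accepted.
assert_frame_equal_split(reference_df, correct_df, check_exact_int=True)
```

Calling the same helper with reference_df and incorrect_df and check_exact_int=True raises an AssertionError for the id column, which matches the behavior requested above.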

Alternative Solutions

I'm not aware of any existing option that does this. As a workaround, integer columns (and an integer index) can be cast to strings so that they are compared exactly while floats are still compared within tolerance, as shown below.

import pandas as pd

def _ints_to_strings(df):
    # Cast integer columns to strings so that they must match exactly.
    out = df.astype({col: "string" for col in df.select_dtypes("int").columns})
    # Do the same for an integer index.
    if pd.api.types.is_integer_dtype(out.index.dtype):
        out.index = out.index.astype("string")
    return out

def compare_exact_ints_non_exact_floats(df1, df2):
    # Float columns keep their dtype, so the default tolerances still apply to them.
    pd.testing.assert_frame_equal(_ints_to_strings(df1), _ints_to_strings(df2))

Additional Context

No response

Labels

  • Enhancement

  • Needs Discussion: Requires discussion from core team before further action

  • Testing: pandas testing functions or related to the test suite
