BUG: MultiIndex.unique incorrect when NA value is present #41823

Closed
@jbrockmendel

Description

drop_duplicates returns the correct result, but unique (and _get_unique_index) keeps a duplicate entry containing an NA value. The example below is based on test_intersection_with_missing_values_on_both_sides. The same problem occurs for every entry in nulls_fixture.

import numpy as np
import pandas as pd
nulls_fixture = np.nan
mi1 = pd.MultiIndex.from_arrays([[3, nulls_fixture, 4, nulls_fixture], [1, 2, 4, 2]])

>>> mi1.unique()
MultiIndex([(3.0, 1),
            (nan, 2),
            (4.0, 4),
            (nan, 2)],
           )

>>> mi1.drop_duplicates()
MultiIndex([(3.0, 1),
            (nan, 2),
            (4.0, 4)],
           )
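
The comparison above can be turned into a small check, sketched here under some assumptions. The helper name `unique_matches_drop_duplicates` is hypothetical and not part of pandas, and the list of null values is a hand-picked subset rather than the full contents of `nulls_fixture`. No output is shown because the result depends on the installed pandas version.

```python
import numpy as np
import pandas as pd


def unique_matches_drop_duplicates(mi: pd.MultiIndex) -> bool:
    """Hypothetical helper: True if unique() and drop_duplicates()
    return the same number of entries for the given MultiIndex."""
    return len(mi.unique()) == len(mi.drop_duplicates())


if __name__ == "__main__":
    # Assumption: a hand-picked subset of null-like values, not the
    # complete set covered by the nulls_fixture test fixture.
    for null in [np.nan, None, pd.NaT]:
        mi = pd.MultiIndex.from_arrays([[3, null, 4, null], [1, 2, 4, 2]])
        # On an affected version this is expected to print False,
        # since unique() keeps the repeated NA-containing entry.
        print(repr(null), unique_matches_drop_duplicates(mi))
```

Comparing lengths rather than values sidesteps the question of how NA entries compare to each other, which is the behavior under suspicion here.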
