BUG: regression, is_unique is incorrect since pandas 2.1.0 #57911

Closed
@morotti

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
print("pandas version", pd.__version__)

values = [1, 1, 2, 3, 4]
index = pd.Index(values)

print("===========")
print(index)
print("is_unique=", index.is_unique)

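# Case 1: slice an index on which is_unique has already been accessed.
filtered_index = index[2:].copy()

print("===========")
print(filtered_index)
print("is_unique=", filtered_index.is_unique)


# Case 2: slice a fresh index on which is_unique has not been accessed yet.
index = pd.Index(values)
filtered_index = index[2:].copy()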

print("===========")
print(filtered_index)
print("is_unique=", filtered_index.is_unique)

Issue Description

Hello,

We found a regression: index.is_unique returns an incorrect result since pandas 2.1.0.

I looked for open issues but did not find an existing fix or discussion.
Looking at the changelog, there were a lot of changes in 2.1.0 to introduce copy-on-write optimizations on the index.
I think the issue could be related to that. My best guess is that index[2:] reuses something cached from the original index that is no longer correct for the slice.

Attaching a simple repro; it's very easy to reproduce. :)
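
One way to probe that guess is to compare is_unique against a check computed directly from the raw values. This is a minimal sketch; it assumes that a check based on the underlying array does not touch any state cached on the Index object. The name is_unique_uncached is a hypothetical helper, not a pandas API.

```python
import pandas as pd


def is_unique_uncached(index: pd.Index) -> bool:
    """Hypothetical helper: recompute uniqueness from the raw values.

    Assumption: working on the underlying array bypasses anything
    the Index object may have cached.
    """
    values = index.to_numpy()
    # pd.unique returns the distinct values in order of appearance.
    return len(pd.unique(values)) == len(values)


index = pd.Index([1, 1, 2, 3, 4])
_ = index.is_unique  # access the property before slicing, as in the repro
filtered_index = index[2:].copy()

print("is_unique (property)  =", filtered_index.is_unique)
print("is_unique (recomputed)=", is_unique_uncached(filtered_index))
```

If the two printed values disagree on a given pandas version, that would be consistent with stale state being carried over from the original index to the slice.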

Thank you.

Expected Behavior

pandas version 1.5.3
===========
Int64Index([1, 1, 2, 3, 4], dtype='int64')
is_unique= False
===========
Int64Index([2, 3, 4], dtype='int64')
is_unique= True
===========
Int64Index([2, 3, 4], dtype='int64')
is_unique= True

Actual output on pandas 2.2.1:

pandas version 2.2.1
===========
Index([1, 1, 2, 3, 4], dtype='int64')
is_unique= False
===========
Index([2, 3, 4], dtype='int64')
is_unique= False    # <---------------- INCORRECT
===========
Index([2, 3, 4], dtype='int64')
is_unique= True

Installed Versions

Tested on:

  • pandas 1.5.3: PASS
  • pandas 2.0.0: PASS
  • pandas 2.0.3: PASS
  • pandas 2.1.0: INCORRECT
  • pandas 2.1.4: INCORRECT
  • pandas 2.2.1 (latest): INCORRECT

Labels

  • Bug
  • Index: Related to the Index class or subclasses
  • Regression: Functionality that used to work in a prior pandas version
