SparseDataFrame loc raises InvalidIndexError when repeated keys are passed #24270

Closed
@vindex10

Code Sample, a copy-pastable example if possible

import pandas as pd
tmp = pd.SparseDataFrame([[0]*330]*447)
tmp.loc[[0]*61]
Traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vindex10/downloads/pand/pandas/pandas/core/base.py", line 77, in __repr__
    return str(self)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/base.py", line 56, in __str__
    return self.__unicode__()
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 628, in __unicode__
    line_width=width, show_dimensions=show_dimensions)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 708, in to_string
    line_width=line_width)
  File "/home/vindex10/downloads/pand/pandas/pandas/io/formats/format.py", line 428, in __init__
    self._chk_truncate()
  File "/home/vindex10/downloads/pand/pandas/pandas/io/formats/format.py", line 497, in _chk_truncate
    frame = concat((frame.iloc[:row_num, :],
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 1490, in __getitem__
    return self._getitem_tuple(key)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2155, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=axis)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2206, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 2176, in _get_slice_axis
    return self._slice(slice_obj, axis=axis, kind='iloc')
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexing.py", line 151, in _slice
    return self.obj._slice(obj, axis=axis, kind=kind)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/sparse/frame.py", line 528, in _slice
    return self.reindex(index=new_index, columns=new_columns)
  File "/home/vindex10/downloads/pand/pandas/pandas/util/_decorators.py", line 186, in wrapper
    return func(*args, **kwargs)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 3641, in reindex
    return super(DataFrame, self).reindex(**kwargs)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/generic.py", line 4331, in reindex
    fill_value, copy).__finalize__(self)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/frame.py", line 3573, in _reindex_axes
    fill_value, limit, tolerance)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/sparse/frame.py", line 679, in _reindex_index
    indexer = self.index.get_indexer(index, method, limit=limit)
  File "/home/vindex10/downloads/pand/pandas/pandas/core/indexes/base.py", line 2680, in get_indexer
    raise InvalidIndexError('Reindexing only valid with uniquely'
pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
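
Reading the traceback, the exception appears to be raised while the result is being displayed (`__repr__`, then `_chk_truncate`, then an `iloc` slice that `SparseDataFrame._slice` implements via `reindex`), not inside `.loc` itself. The sketch below is not the pandas implementation. It only reproduces the last step in isolation, assuming the selected frame carries 61 duplicate index labels and that `display.max_rows` is at its default. The helper name `reindex_error_name` is hypothetical.

```python
import pandas as pd

# Row limit above which the repr truncates the frame (60 by default).
max_rows = pd.get_option("display.max_rows")

# Selecting label 0 sixty-one times yields an index of 61 identical labels.
dup_index = pd.Index([0] * 61)


def reindex_error_name(index, target):
    """Return the name of the exception get_indexer raises, or None."""
    try:
        index.get_indexer(target)
    except Exception as exc:  # class location differs across pandas versions
        return type(exc).__name__
    return None


# get_indexer is the call at the bottom of the traceback; it rejects
# a non-unique index regardless of the target labels.
error_name = reindex_error_name(dup_index, dup_index[:30])
print(max_rows, len(dup_index), error_name)
```

If this reading is right, 60 rows fit within the display limit and are never sliced, while 61 rows trigger truncation and therefore the failing `reindex`.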

Problem description

When asking for 60 rows it works:

tmp.loc[[0]*60]

It also works with 61 rows on a regular DataFrame, so the behavior is unexpected.
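
For comparison, a minimal sketch of the same selection on a regular `DataFrame`, using the shapes from the report. The variable names `dense` and `selected` are illustrative only.

```python
import pandas as pd

# Same shape as the sparse frame in the report: 447 rows, 330 columns of zeros.
dense = pd.DataFrame([[0] * 330] * 447)

# Select row label 0 sixty-one times; the result has a duplicated index.
selected = dense.loc[[0] * 61]

# Building the repr truncates the frame for display and does not raise.
text = repr(selected)
print(selected.shape)  # (61, 330)
```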

Expected Output

A SparseDataFrame with 61 rows.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: c037128
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.83-gentoo
machine: x86_64
processor: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz
byteorder: little
LC_ALL: en_US.UTF8
LANG: en_US.utf8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+1274.gc0371288f
pytest: None
pip: 18.1
setuptools: 39.0.1
Cython: 0.29.1
numpy: 1.15.4
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

Labels

Bug
Indexing: Related to indexing on series/frames, not to indexes themselves
Sparse: Sparse Data Type