BUG: reset_index() Fails on a Series/DataFrame Object that has "False" (bool type) as the Index Name #38147

Closed
2 tasks done
Tracked by #7
kmcentush opened this issue Nov 29, 2020 · 2 comments · Fixed by #52741
Labels
Index Related to the Index class or subclasses Needs Tests Unit test(s) needed to prevent regressions

Comments

@kmcentush

kmcentush commented Nov 29, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • [] (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd

# Works: Series with a string index name
df1 = pd.Series(data=range(5, 10), index=range(0, 5))
df1.index.name = 'False'
df1.reset_index()

# Fails: Series with a bool index name
df2 = pd.Series(data=range(5, 10), index=range(0, 5))
df2.index.name = False
df2.reset_index()

# Works: DataFrame with a string index name
df1 = pd.DataFrame(data=range(5, 10), index=range(0, 5))
df1.index.name = 'False'
df1.reset_index()

# Fails: DataFrame with a bool index name
df2 = pd.DataFrame(data=range(5, 10), index=range(0, 5))
df2.index.name = False
df2.reset_index()

Stacktrace:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
9 df2 = pd.Series(data=range(5, 10), index=range(0, 5))
10 df2.index.name = False
---> 11 df2.reset_index()

~\Miniconda3\envs\vectorbt\lib\site-packages\pandas\core\series.py in reset_index(self, level, drop, name, inplace)
1340 else:
1341 df = self.to_frame(name)
-> 1342 return df.reset_index(level=level, drop=drop)
1343
1344 # ----------------------------------------------------------------------

~\Miniconda3\envs\vectorbt\lib\site-packages\pandas\core\frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
4602 # to ndarray and maybe infer different dtype
4603 level_values = _maybe_casted_values(lev, lab)
-> 4604 new_obj.insert(0, name, level_values)
4605
4606 new_obj.index = new_index

~\Miniconda3\envs\vectorbt\lib\site-packages\pandas\core\frame.py in insert(self, loc, column, value, allow_duplicates)
3494 self._ensure_valid_index(value)
3495 value = self._sanitize_column(column, value, broadcast=False)
-> 3496 self._data.insert(loc, column, value, allow_duplicates=allow_duplicates)
3497
3498 def assign(self, **kwargs) -> "DataFrame":

~\Miniconda3\envs\vectorbt\lib\site-packages\pandas\core\internals\managers.py in insert(self, loc, item, value, allow_duplicates)
1171 if not allow_duplicates and item in self.items:
1172 # Should this be a different kind of error??
-> 1173 raise ValueError(f"cannot insert {item}, already exists")
1174
1175 if not isinstance(loc, int):

ValueError: cannot insert False, already exists

Problem description

Pandas index names support multiple object types as opposed to just str. I presume that the intention was that an index name is never anything other than a string, but I can only speculate. Interestingly enough, I did not encounter this issue on a DataFrame. This leads me to believe that the issue has to do with

I originally encountered this bug when loading data from a csv that had a header of False. Perhaps a clean enough work-around is to convert headers to strings when loading from any file type?
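A minimal sketch of that workaround, applied to the index name directly rather than at load time (no file loading is shown here, and whether pandas should do this automatically is a separate question). It casts the name to a string before calling reset_index(), which reduces the failing case to the working one from the code sample above:

```python
import pandas as pd

s = pd.Series(data=range(5, 10), index=range(0, 5))
s.index.name = False  # the problematic bool index name from the report

# Cast the index name to str first, so the inserted column label is 'False'
# (a string) and no longer compares equal to the existing column label 0.
safe = s.rename_axis(str(s.index.name))
out = safe.reset_index()

print(list(out.columns))
```

This changes the label from the bool False to the string 'False', so downstream code that looks the column up by name has to use the string form.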

Expected Output

The index reset without the ValueError. It should match df1's output.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.8.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 1.0.5
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 49.2.0.post20200712
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.3.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : 0.50.1

@kmcentush kmcentush added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 29, 2020
@phofl
Member

phofl commented Nov 29, 2020

Hi,

thanks for your report. The problem lies in insert. Your DataFrame has a column 0. Unfortunately False == 0 evaluates to True. I do not know if this is trivial to fix.
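The collision can be seen in plain Python, without pandas. The sketch below uses a list as a stand-in for the frame's column labels (an assumption made for illustration; the real check is the `item in self.items` line in the traceback), and `strict_contains` is a hypothetical helper, not a pandas function:

```python
# Stand-in for the existing column labels of the frame; pandas is not used here.
existing_labels = [0]

# bool is a subclass of int, so False compares equal to 0 and hashes the same.
equal = (False == 0)
same_hash = (hash(False) == hash(0))

# An equality-based membership test therefore reports False as already present,
# while the string 'False' is treated as a new label.
naive_hit = False in existing_labels
string_hit = "False" in existing_labels


def strict_contains(labels, label):
    """Hypothetical helper: membership that also requires matching types."""
    return any(type(x) is type(label) and x == label for x in labels)


strict_hit = strict_contains(existing_labels, False)

print(equal, same_hash, naive_hit, string_hit, strict_hit)
# prints: True True True False False
```

A type-aware comparison like the hypothetical helper is one way such a check could tell False and 0 apart; this is not a claim about how pandas resolved it.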

@phofl phofl added Index Related to the Index class or subclasses and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 29, 2020
@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Dec 14, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@phofl
Member

phofl commented Apr 17, 2023

This works now, may need tests

@phofl phofl added Needs Tests Unit test(s) needed to prevent regressions and removed Bug labels Apr 17, 2023
vagechirkov added a commit to vagechirkov/pandas that referenced this issue Apr 18, 2023
vagechirkov added a commit to vagechirkov/pandas that referenced this issue Apr 18, 2023
mroeschke pushed a commit that referenced this issue Apr 20, 2023
* add regression test to cover an issue #38147

* add assert to the tests

* replace Index with RangeIndex
4 participants