Error with CategoricalIndex When Updating Pandas #203

@MicahGale

After a recent pandas update, tableone's error handling around a CategoricalIndex no longer works: the exception it expects is no longer the one that gets raised.

MWE:

import numpy as np
import pandas as pd
import tableone as tb

data = { 
    'Age': [35, 42, 30, 29, 51, 38],  
    'Gender': [1, 0, 1, 0, 0, 1],  
    'Income': [45000, np.nan, 65000, 48000, 70000, np.nan],  
    'Education': ['2', '1', '1', '', '1', '2'],  
    'Satisfaction': [4.5, 3.2, np.nan, 4.8, 3.9, 4.1]  
}
data = pd.DataFrame(data)

cat_col = ['Gender', 'Education']
for col in cat_col:
    data[col] = data[col].astype("category")
group = 'Gender'

data_table1 = tb.TableOne(data, categorical = cat_col, continuous=["Satisfaction"], 
    groupby = group, pval= True, htest_name=True,  decimals=3)

print(data_table1)

This leads to:

/home/mgale/dev/tableone/tableone/preprocessors.py:87: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  groupbylvls = sorted(data.groupby(groupby).groups.keys())  # type: ignore
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
/home/mgale/dev/tableone/tableone/tables.py:399: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior
  df_cont = pd.pivot_table(cont_data, columns=[groupby], aggfunc=aggfuncs)
Traceback (most recent call last):
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    return self._engine.get_loc(casted_key)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 175, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index_class_helper.pxi", line 70, in pandas._libs.index.Int64Engine._check_type
KeyError: slice(None, None, None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mgale/dev/tableone/test.py", line 19, in <module>
    data_table1 = tb.TableOne(data, categorical = cat_col, continuous=["Satisfaction"],
        groupby = group, pval= True, htest_name=True,  decimals=3)
  File "/home/mgale/dev/tableone/tableone/tableone.py", line 277, in __init__
    self.create_intermediate_tables(data)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/home/mgale/dev/tableone/tableone/tableone.py", line 457, in create_intermediate_tables
    self.cont_table = self.tables.create_cont_table(data,
                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
                                                    self._overall,
                                                    ^^^^^^^^^^^^^^
    ...<7 lines>...
                                                    self.smd_table,
                                                    ^^^^^^^^^^^^^^^
                                                    self._groupby)
                                                    ^^^^^^^^^^^^^^
  File "/home/mgale/dev/tableone/tableone/tables.py", line 443, in create_cont_table
    table = table.join(nulltable)
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/frame.py", line 10784, in join
    return merge(
        self,
    ...<7 lines>...
        validate=validate,
    )
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/reshape/merge.py", line 184, in merge
    return op.get_result(copy=copy)
           ~~~~~~~~~~~~~^^^^^^^^^^^
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/reshape/merge.py", line 888, in get_result
    result = self._reindex_and_concat(
        join_index, left_indexer, right_indexer, copy=copy
    )
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/reshape/merge.py", line 837, in _reindex_and_concat
    left = self.left[:]
           ~~~~~~~~~^^^
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/frame.py", line 4086, in __getitem__
    and key in self.columns
        ^^^^^^^^^^^^^^^^^^^
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/indexes/category.py", line 368, in __contains__
    return contains(self, key, container=self._engine)
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/arrays/categorical.py", line 230, in contains
    loc = cat.categories.get_loc(key)
  File "/home/mgale/miniforge3/envs/data/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 3818, in get_loc
    raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: slice(None, None, None)
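
The FutureWarning lines at the top of the output are separate from the exception. As the warning text itself says, they go away when `observed` is passed explicitly. A minimal sketch of that, using a small stand-in frame rather than the actual tableone internals (the variable names here are illustrative only):

```python
import pandas as pd

# Stand-in for the data reaching tableone: a categorical groupby column
# plus one continuous column.
df = pd.DataFrame({
    "Gender": pd.Categorical([1, 0, 1, 0, 0, 1]),
    "Satisfaction": [4.5, 3.2, 4.0, 4.8, 3.9, 4.1],
})

# Passing observed explicitly keeps the current behaviour and silences the warning.
levels = sorted(df.groupby("Gender", observed=False).groups.keys())

# Same idea for pivot_table, which also accepts observed.
pivot = pd.pivot_table(df, columns=["Gender"], aggfunc="mean", observed=False)

print(levels)
print(type(pivot.columns).__name__)
```

Whether `observed=False` or `observed=True` is the right choice for tableone depends on whether empty categories should appear in the table; the sketch only shows where the argument goes.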

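It seems that the TypeError expected around line 443 of tables.py (the `table.join(nulltable)` call) has changed to a different exception type, `pandas.errors.InvalidIndexError`.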
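
One possible way to handle this, sketched under assumptions: catch `InvalidIndexError` alongside `TypeError`, and fall back to a plain (non-categorical) column index before joining. The helper names `with_plain_columns` and `safe_join` are hypothetical and not part of tableone, and the two small frames only imitate the shape of `table` and `nulltable` from the traceback.

```python
import pandas as pd
from pandas.errors import InvalidIndexError


def with_plain_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column labels are a plain Index, not a CategoricalIndex."""
    out = df.copy()
    if isinstance(out.columns, pd.CategoricalIndex):
        out.columns = pd.Index(out.columns.tolist(), name=out.columns.name)
    return out


def safe_join(table: pd.DataFrame, nulltable: pd.DataFrame) -> pd.DataFrame:
    """Join two frames, retrying with plain column labels if the join fails."""
    try:
        return table.join(nulltable)
    except (TypeError, InvalidIndexError):
        # Older setups raised TypeError here; the traceback above shows
        # InvalidIndexError instead.
        return with_plain_columns(table).join(nulltable)


# Illustrative frames: columns come from a categorical groupby variable.
table = pd.DataFrame(
    [[4.3, 3.97]],
    index=["Satisfaction"],
    columns=pd.CategoricalIndex([0, 1], name="Gender"),
)
nulltable = pd.DataFrame({"Missing": [1]}, index=["Satisfaction"])

joined = safe_join(table, nulltable)
print(joined)
```

Converting the columns up front with `with_plain_columns` avoids relying on which exception a given pandas or Python version happens to raise.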

This is with versions:

  • python: 3.13.11
  • pandas: 2.3.3
  • tableone: 0.9.5
