Description
Code Sample, a copy-pastable example if possible
This is the relevant code extract from pandas source:
    if isinstance(axis, (tuple, list)):
        result = self
        for ax in axis:
            result = result.dropna(how=how, thresh=thresh, subset=subset, axis=ax)
This is the output from the function call:
In [7]: df
Out[7]:
     A    B   C  D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  NaN  NaN NaN  5
In [8]: df.dropna(axis=[0, 1], subset=['A', 'C'])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-8-4c4a476386f3> in <module>()
----> 1 df.dropna(axis=[0, 1], subset=['A', 'C'])
~/dev/pandas/pandas/core/frame.py in dropna(self, axis, how, thresh, subset, inplace)
   4265             for ax in axis:
   4266                 result = result.dropna(how=how, thresh=thresh, subset=subset,
-> 4267                                        axis=ax)
   4268         else:
   4269             axis = self._get_axis_number(axis)

~/dev/pandas/pandas/core/frame.py in dropna(self, axis, how, thresh, subset, inplace)
   4276                 check = indices == -1
   4277                 if check.any():
-> 4278                     raise KeyError(list(np.compress(check, subset)))
   4279                 agg_obj = self.take(indices, axis=agg_axis)
   4280
KeyError: ['A', 'C']
Problem description
subset selects labels from the "other" axis: column labels when rows are dropped (axis=0), and row labels when columns are dropped (axis=1). Passing the same subset of labels to both per-axis calls therefore does not really make sense. Unless the subset labels are present on both axes, the function will always raise a KeyError.
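To make the axis dependence concrete, here is a minimal sketch that calls dropna once per axis with the same subset, using a frame rebuilt from the output above. The variable names rows_dropped and raised are only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the frame shown in the output above.
df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan],
    "B": [2.0, 4.0, np.nan],
    "C": [np.nan, np.nan, np.nan],
    "D": [0, 1, 5],
})

# axis=0 drops rows, so subset is looked up among the column labels:
# 'A' and 'C' both exist and the call succeeds.
rows_dropped = df.dropna(axis=0, subset=["A", "C"])

# axis=1 drops columns, so subset is looked up among the row labels
# (0, 1, 2). 'A' and 'C' are not row labels, hence the KeyError.
try:
    df.dropna(axis=1, subset=["A", "C"])
    raised = False
except KeyError:
    raised = True
```

The second call is the one that fails inside the loop over axis in the traceback above.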
I would expect subset to accept a list of subsets, one per axis, when axis is list-like. If this is agreeable, I'm happy to submit a PR with this change.
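As a rough sketch of the proposed behaviour (not an existing pandas API), the per-axis pairing could look like the following. The helper name dropna_per_axis and its axes/subsets parameters are hypothetical; the helper calls dropna once per axis instead of passing a list to axis.

```python
import numpy as np
import pandas as pd


def dropna_per_axis(frame, axes, subsets, how="any"):
    """Hypothetical helper: apply dropna once per axis, each with its own subset."""
    if len(axes) != len(subsets):
        raise ValueError("axes and subsets must have the same length")
    result = frame
    for ax, sub in zip(axes, subsets):
        # Each subset is resolved against the axis opposite to `ax`.
        result = result.dropna(axis=ax, how=how, subset=sub)
    return result


df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan],
    "B": [2.0, 4.0, np.nan],
    "C": [np.nan, np.nan, np.nan],
    "D": [0, 1, 5],
})

# First drop rows with NaN in columns A or B (only row 1 survives),
# then drop columns that have NaN in row 1 (column C).
out = dropna_per_axis(df, axes=[0, 1], subsets=[["A", "B"], [1]])
```

Note that the calls are applied in order, so the second subset has to name labels that survive the first drop.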
Output of pd.show_versions()
INSTALLED VERSIONS
commit: bd4332f
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.0rc2+13.gbd4332f4b
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.2
lxml: 4.1.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None