Description
This may not be a real bug, but it can be annoying.
xref #12614. Independent of issue #12642.
Code Sample
import numpy as np
import pandas as pd

a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one', 'two', 'one', 'two', 'two', 'two'], dtype=object)
c = np.array(['dull', 'dull', 'dull', 'dull', 'dull', 'shiny', 'shiny'], dtype=object)
res_dropnaF = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'], dropna=False)
res_dropnaT = pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'], dropna=True)
# print(x) with a single argument behaves the same on Python 2 and 3
print(res_dropnaF)
print(res_dropnaT)
Results: res_dropnaF shows no columns.names (because they are removed), while res_dropnaT does.
res_dropnaF (dropna=False):

     one        two
    dull shiny dull shiny
a
bar    1     0    1     0
foo    2     0    1     2

res_dropnaT (dropna=True):

b    one  two
c   dull dull shiny
a
bar    1    1     0
foo    2    1     2
Expected Output
Either an extra option, or consistent column names: the result should keep columns.names whether dropna is True or False.
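Until the names are consistent, one possible workaround is to put them back by hand after the call. The sketch below assumes a small wrapper, crosstab_keep_names, which is a hypothetical helper and not part of pandas; it only uses pd.crosstab and the names attribute of the resulting axes.

```python
import numpy as np
import pandas as pd


def crosstab_keep_names(index, columns, rownames, colnames, **kwargs):
    """Hypothetical helper: call pd.crosstab, then restore the axis names."""
    table = pd.crosstab(index, columns, rownames=rownames,
                        colnames=colnames, **kwargs)
    # Reassign the names explicitly; this changes nothing on pandas
    # versions that already keep them.
    table.index.names = rownames
    table.columns.names = colnames
    return table


a = np.array(['foo', 'foo', 'foo', 'bar', 'bar', 'foo', 'foo'], dtype=object)
b = np.array(['one', 'one', 'two', 'one', 'two', 'two', 'two'], dtype=object)
c = np.array(['dull', 'dull', 'dull', 'dull', 'dull', 'shiny', 'shiny'], dtype=object)

res = crosstab_keep_names(a, [b, c], rownames=['a'], colnames=['b', 'c'],
                          dropna=False)
print(res.columns.names)
```

With this, the dropna=False result carries the same column names as the dropna=True one, at the cost of passing through a wrapper.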
This should be fixable in pivot_table(), in the 'if not dropna' branch.
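A guess at the mechanism, not confirmed against the pandas source in this report: if that branch rebuilds the columns from the Cartesian product of the level values without passing the names along, the names are lost. The sketch below only illustrates that effect with the public MultiIndex.from_product API; the variable names are made up for the example.

```python
import pandas as pd

# Columns as they look in the dropna=True result, names included.
cols = pd.MultiIndex.from_tuples(
    [('one', 'dull'), ('two', 'dull'), ('two', 'shiny')],
    names=['b', 'c'])

# Plain lists of level values, so no names travel with them.
level_values = [list(level) for level in cols.levels]

# Rebuilding the full product without names drops them ...
unnamed = pd.MultiIndex.from_product(level_values)
# ... while passing names explicitly keeps them.
named = pd.MultiIndex.from_product(level_values, names=cols.names)

print(list(unnamed.names))
print(list(named.names))
```

If the assumption holds, forwarding the existing names when the expanded index is built would give the consistency asked for above.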
Output of pd.show_versions()
INSTALLED VERSION:
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.17.1
nose: 1.3.3
pip: 8.1.0
setuptools: 18.3.1
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: None
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.3.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None