Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
.
Issue Description
When constructing this MultiIndex directly, the sort works:
>>> midx = MultiIndex.from_tuples([("usd", "usd"), ("eur", "eur")], names=["a", "b"])
>>> midx.sortlevel()
(MultiIndex([('eur', 'eur'),
             ('usd', 'usd')],
            names=['a', 'b']),
 array([1, 0]))
When the index is built indirectly and dynamically (for example, column by column in a loop), sortlevel leaves the order unchanged:
>>> df = DataFrame(index=[0,1], columns=MultiIndex.from_tuples([], names=["a", "b"]))
>>> df.loc[:, ("usd", "usd")] = [10, 20]
>>> df.loc[:, ("eur", "eur")] = [15, 30]
>>> df.columns.sortlevel()
(MultiIndex([('usd', 'usd'),
             ('eur', 'eur')],
            names=['a', 'b']),
 array([0, 1]))
Expected Behavior
I have exactly the same concerns as the OP in issue #49016, which shows the same problem after set_levels.
This also occurs on pandas 2.0.0.
Incidentally, df.columns.sort_values() does sort the above in all cases, but this is also confusing: sort_values is not documented specifically as a MultiIndex method, but rather as an Index method, whereas sortlevel is a MultiIndex method with confusing wording and no See Also pointer to sort_values.
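For anyone hitting the same problem, here is a minimal sketch of the sort_values workaround mentioned above. It assumes (not confirmed in this report) that building the columns incrementally leaves the MultiIndex levels in insertion order, so the index is constructed directly from unsorted levels and codes to imitate that state. The helper name sort_columns_by_value is hypothetical, not a pandas API.

```python
import pandas as pd

# Assumption: incremental assignment leaves the levels in insertion order
# ("usd" before "eur"). Building the index from levels/codes imitates that.
midx = pd.MultiIndex(
    levels=[["usd", "eur"], ["usd", "eur"]],
    codes=[[0, 1], [0, 1]],
    names=["a", "b"],
)
df = pd.DataFrame([[10, 15], [20, 30]], index=[0, 1], columns=midx)


def sort_columns_by_value(frame):
    """Hypothetical helper: reorder columns by their tuple values."""
    # sort_values compares the actual labels; return_indexer=True also
    # gives the positions needed to reorder the data columns.
    _, indexer = frame.columns.sort_values(return_indexer=True)
    return frame.iloc[:, indexer]


sorted_df = sort_columns_by_value(df)
print(list(sorted_df.columns))  # [('eur', 'eur'), ('usd', 'usd')]
```

This only works around the symptom; it does not explain why sortlevel behaves differently for the two construction paths.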
Installed Versions
1.5.3