Currently, from_product always sorts levels in the resulting MultiIndex. This means that the result does not necessarily have lexsorted labels/codes.
PR #14062 adds an option to not sort levels when calling from_product. Compare:
In [4]: pd.MultiIndex.from_product([['a', 'b'], [2, 1, 0]], sort_levels=False)
Out[4]:
MultiIndex(levels=[['a', 'b'], [2, 1, 0]],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
           sortorder=0)

In [5]: pd.MultiIndex.from_product([['a', 'b'], [2, 1, 0]], sort_levels=True)
Out[5]:
MultiIndex(levels=[['a', 'b'], [0, 1, 2]],
           labels=[[0, 0, 0, 1, 1, 1], [2, 1, 0, 2, 1, 0]])
Passing sort_levels=False yields a few benefits:
- It's simpler -- resulting levels on the MultiIndex are exactly those you passed in.
- It's marginally faster -- you don't need to sort the levels.
- The resulting MultiIndex is always lex-sorted, because the labels simply count through the product in order. This is handy if you want to be able to index it efficiently.
The downside is that the result can be a little less intuitive, because levels and labels do not have the same sort order (#14015).
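To make the trade-off concrete, here is a rough pure-Python sketch of how the labels come out under each setting. It is not the pandas implementation: product_codes and is_lexsorted are hypothetical helpers written only for illustration, and they assume the values within each input are hashable and mutually comparable.

```python
from itertools import product


def product_codes(iterables, sort_levels=True):
    """Return (levels, codes) for the Cartesian product of `iterables`.

    Hypothetical helper, not pandas code: it only mimics the two
    behaviours discussed in this issue using plain lists.
    """
    levels = []
    for values in iterables:
        unique = list(dict.fromkeys(values))  # unique values, first-seen order
        levels.append(sorted(unique) if sort_levels else unique)
    # Position of each input value within its level.
    per_level = [[level.index(v) for v in values]
                 for level, values in zip(levels, iterables)]
    # Row-wise product of positions, then transpose to one code list per level.
    rows = list(product(*per_level))
    codes = [list(col) for col in zip(*rows)]
    return levels, codes


def is_lexsorted(codes):
    """True if the rows formed by `codes` are in non-decreasing order."""
    rows = list(zip(*codes))
    return all(a <= b for a, b in zip(rows, rows[1:]))


data = [['a', 'b'], [2, 1, 0]]
unsorted_levels, unsorted_codes = product_codes(data, sort_levels=False)
sorted_levels, sorted_codes = product_codes(data, sort_levels=True)

print(unsorted_levels, unsorted_codes, is_lexsorted(unsorted_codes))
print(sorted_levels, sorted_codes, is_lexsorted(sorted_codes))
```

With the input from the comparison above, the unsorted variant keeps the levels as passed and produces labels that are already in lexicographic order, while the sorted variant reorders the second level and yields labels that are not.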
I'm suggesting this option because it was useful for xarray (to fix pydata/xarray#980) and might also be relevant for other advanced users.