Closed
Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- Modin version (modin.__version__): 0.7.3+200.g8e7a5682
- Python version: 3.7.5
- Code we can use to reproduce:
if __name__ == "__main__":
    import modin.pandas as pd
    import pandas
    import numpy as np

    data = {
        "col1": [0, 1, 2, 3],
        "col2": [4, 5, np.nan, 7],
        "col3": [np.nan, np.nan, 12, 10],
        "col4": [17, 13, 16, 15],
        "col5": [-4, -5, -6, -7],
    }
    # Build the same frame in Modin and in pandas.
    md_df, pd_df = pd.DataFrame(data), pandas.DataFrame(data)

    groupby_kwargs = {"by": ["col5", "col4", "col1"], "as_index": False}
    md_result, pd_result = (
        md_df.groupby(**groupby_kwargs).any(),
        pd_df.groupby(**groupby_kwargs).any(),
    )

    print("pd_result:\n", pd_result, sep="")
    print("\nmd_result:\n", md_result, sep="")
    # The printed frames match, but the column indexes differ.
    print("\npd_columns:", pd_result.columns)
    print("md_columns:", md_result.columns)

Output:
pd_result:
   col5  col4  col1   col2   col3
0    -7    15     3   True   True
1    -6    16     2  False   True
2    -5    13     1   True  False
3    -4    17     0   True  False

md_result:
   col5  col4  col1   col2   col3
0    -7    15     3   True   True
1    -6    16     2  False   True
2    -5    13     1   True  False
3    -4    17     0   True  False

pd_columns: Index(['col5', 'col4', 'col1', 'col2', 'col3'], dtype='object')
md_columns: Index(['col1', 'col2', 'col3', 'col4', 'col5'], dtype='object')
Describe the problem
The column labels inside the partitions seem to be correct, but the columns of the DataFrame itself are not. That is also why the tests don't fail on this case: to_pandas, which is used in df_equals, only considers information from the partitions.
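One possible way to catch this kind of mismatch in a test is to compare the column labels directly, in addition to comparing the materialized data. The sketch below is only an illustration: `assert_columns_equal` is a hypothetical helper, not part of Modin or pandas, and the two `SimpleNamespace` objects are stand-ins that carry the column labels printed in the output above, so the snippet runs without Modin installed.

```python
from types import SimpleNamespace


def assert_columns_equal(actual, expected):
    """Hypothetical helper: compare column labels, including their order."""
    actual_cols = list(actual.columns)
    expected_cols = list(expected.columns)
    if actual_cols != expected_cols:
        raise AssertionError(
            f"column mismatch: got {actual_cols}, expected {expected_cols}"
        )


# Stand-ins (assumption) holding the column labels from the report above.
pd_like = SimpleNamespace(columns=["col5", "col4", "col1", "col2", "col3"])
md_like = SimpleNamespace(columns=["col1", "col2", "col3", "col4", "col5"])

try:
    assert_columns_equal(md_like, pd_like)
except AssertionError as err:
    # The order differs, so the check fails even though the data matches.
    print(err)
```

Because the helper only reads the `.columns` attribute, it could presumably be called on `md_result` and `pd_result` from the reproducer before the data comparison.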
Labels
bug 🦗 Something isn't working