Move groupby.agg logic into query compiler #1879

@ienkovich

Description

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 19.04
  • Modin installed from (source or binary): source
  • Modin version: master
  • Python version: 3.8.3
  • Exact command to reproduce: python test.py

Test case:

import os

os.environ["MODIN_ENGINE"] = "ray"

import modin.pandas as pd

data = {
    "a": [1, 1, 2, 2],
    "b": [11, 21, 12, 11],
}

df = pd.DataFrame(data)
# Dict-style aggregation on a single grouping column; this call emits the warning shown below.
ref = df.groupby("a").agg({"b": "mean"})
print(ref)

In the execution log I see:

UserWarning: `DataFrame.groupby_on_multiple_columns` defaulting to pandas implementation.
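A regression like this could be caught automatically by checking for the fallback warning. The following is only a minimal sketch using the standard `warnings` module; `defaulted_to_pandas` and `fake_groupby_agg` are hypothetical names, and the stand-in function merely imitates the warning text from the log above instead of calling Modin.

```python
import warnings


def defaulted_to_pandas(func, *args, **kwargs):
    """Run func and report whether it emitted a 'defaulting to pandas' UserWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that would normally be shown only once.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    hit = any(
        issubclass(w.category, UserWarning)
        and "defaulting to pandas" in str(w.message)
        for w in caught
    )
    return result, hit


def fake_groupby_agg():
    """Hypothetical stand-in for df.groupby("a").agg({"b": "mean"})."""
    warnings.warn(
        "`DataFrame.groupby_on_multiple_columns` defaulting to pandas implementation.",
        UserWarning,
    )
    # Group means of "b" for the data in the test case: a=1 and a=2.
    return {1: 16.0, 2: 11.5}


result, fell_back = defaulted_to_pandas(fake_groupby_agg)
print(fell_back)
```

In an actual test, the stand-in would be replaced by the real groupby call, and the test would fail whenever `fell_back` is true.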

Previously, such aggregations could be processed in the OmniSci backend; now they default to pandas in the front end. This effectively breaks the OmniSci backend, because the processing no longer happens in OmniSci. I don't know when the regression was introduced. Probably some code was moved from the query compiler to the front end.
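To illustrate what the title asks for, here is a rough sketch of a front end that only forwards `agg` to the query compiler, so that a backend can override it. All class and function names are hypothetical and do not reflect Modin's actual classes; rows are plain dicts, and only `"mean"` is supported to keep the sketch short.

```python
import warnings
from collections import defaultdict


def _group_mean(rows, by, column):
    """Plain-Python group mean, shared by both compilers in this sketch."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[column])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def _check_spec(agg):
    # This sketch only understands {"column": "mean"} specifications.
    for func in agg.values():
        if func != "mean":
            raise NotImplementedError(func)


class FallbackQueryCompiler:
    """Hypothetical compiler without native groupby: warns, then uses the generic path."""

    def __init__(self, rows):
        self.rows = rows

    def groupby_agg(self, by, agg):
        _check_spec(agg)
        warnings.warn("groupby_agg defaulting to generic implementation.", UserWarning)
        return {col: _group_mean(self.rows, by, col) for col in agg}


class NativeQueryCompiler(FallbackQueryCompiler):
    """Hypothetical backend compiler that handles the aggregation itself."""

    def groupby_agg(self, by, agg):
        _check_spec(agg)
        # A real backend would build and run its own query here; no warning is raised.
        return {col: _group_mean(self.rows, by, col) for col in agg}


class GroupBy:
    """Front-end object: holds no aggregation logic and only forwards the call."""

    def __init__(self, query_compiler, by):
        self._qc = query_compiler
        self._by = by

    def agg(self, spec):
        return self._qc.groupby_agg(self._by, spec)


rows = [{"a": 1, "b": 11}, {"a": 1, "b": 21}, {"a": 2, "b": 12}, {"a": 2, "b": 11}]
result = GroupBy(NativeQueryCompiler(rows), "a").agg({"b": "mean"})
print(result)  # {'b': {1: 16.0, 2: 11.5}}
```

With this split, whether the aggregation falls back is decided by the compiler in use and not by the front end.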

Labels: Code Quality (improvements or issues to improve quality of codebase)
