Update evaluation logic for dashboard support #62

Open · wants to merge 1 commit into master
11 changes: 9 additions & 2 deletions src/autogluon/bench/eval/evaluation/benchmark_evaluator.py
@@ -157,8 +157,15 @@ def _load_results(
         return results_raw

     def load_results_raw(self, paths: list) -> pd.DataFrame:
-        paths = [path if is_s3_url(path) else self.results_dir_input + path for path in paths]
-        return pd.concat([pd.read_csv(path) for path in paths], ignore_index=True, sort=True)
+        dataframes = []
+        for path in paths:
+            path = path if is_s3_url(path) else os.path.join(self.results_dir_input, path)
+            dataframe = pd.read_csv(path)
+            dataframes.append(dataframe)
+        # Discarding extra folds
+        min_num_rows = min(len(df) for df in dataframes)
Collaborator commented:

What if there are multiple datasets in the results file? min() will not do what it's intended to do, right?

+        trimmed_dataframes = [df[:min_num_rows] for df in dataframes]
+        return pd.concat(trimmed_dataframes, ignore_index=True, sort=True)
Comment on lines +160 to +168

Contributor @Innixma commented on Oct 23, 2023:

This will not discard extra folds properly. Please add a unit test, and separate out the filtering logic so it is not hard-coded into the load_results_raw method.

  1. Not all DataFrames loaded will have the same number of methods or datasets, so trimming by the number of rows will not work.
  2. We don't want to always filter extra folds. This should be an optional post-load operation.
  3. You are assuming the input file is sorted by fold. This is not a valid assumption.


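To make these review points concrete, here is a minimal sketch of what a separated, optional fold filter could look like: it filters on the fold values themselves rather than trimming by row count, so it does not assume the files are sorted by fold or that they contain the same datasets and methods. The function name filter_extra_folds and the column names "dataset", "framework", and "fold" are assumptions for illustration, not the repository's actual constants.

```python
from typing import Optional

import pandas as pd


def filter_extra_folds(results_raw: pd.DataFrame, max_folds: Optional[int] = None) -> pd.DataFrame:
    """Keep only the folds shared by every (dataset, framework) pair.

    Hypothetical helper for illustration; column names are assumed.
    """
    if max_folds is not None:
        # Drop folds beyond the requested count, regardless of row order.
        results_raw = results_raw[results_raw["fold"] < max_folds]

    # Folds present for each (dataset, framework) pair.
    folds_per_group = results_raw.groupby(["dataset", "framework"])["fold"].apply(set)
    if len(folds_per_group) == 0:
        return results_raw

    # Keep only folds that every pair has, so comparisons stay aligned.
    common_folds = set.intersection(*folds_per_group)
    return results_raw[results_raw["fold"].isin(common_folds)]
```

With something like this factored out, load_results_raw would only join paths and concatenate the raw frames, and callers that need aligned folds could opt in to the filtering as a post-load step.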
     def _check_results_valid(self, results_raw: pd.DataFrame):
         if results_raw[METRIC_ERROR].min() < 0:
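Since a unit test was requested, here is a sketch of one for the hypothetical filter_extra_folds helper above; the column names and values are illustrative fixtures, not data from the repository.

```python
import pandas as pd


def test_filter_extra_folds_keeps_only_common_folds():
    # One dataset, two frameworks; framework "b" has an extra fold (2),
    # and the rows are deliberately not sorted by fold.
    results_raw = pd.DataFrame(
        {
            "dataset": ["adult"] * 5,
            "framework": ["a", "a", "b", "b", "b"],
            "fold": [1, 0, 2, 0, 1],
            "metric_error": [0.10, 0.20, 0.30, 0.40, 0.50],
        }
    )

    filtered = filter_extra_folds(results_raw)

    # Only folds 0 and 1 are shared by both frameworks; fold 2 is dropped.
    assert set(filtered["fold"]) == {0, 1}
    assert len(filtered) == 4
```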