
Ensemble builder #4


Open · wants to merge 5 commits into main

Conversation

franchuterivera (Contributor)

This is a proof-of-concept PR. It is fully functional, yet it assumes we cannot merge the metrics from Auto-Sklearn and Auto-Pytorch.

Today they differ slightly, and for that reason I have proposed this approach, or alternatively to take in a loss function when doing ensemble selection.

It also removes many legacy lines.
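To illustrate the idea of "taking in a loss function when doing ensemble selection", here is a minimal sketch of greedy ensemble selection (in the style of Caruana et al.) with a pluggable loss. All names (`ensemble_selection`, `loss_fn`, `mse`) are hypothetical and not taken from this PR's code:

```python
import numpy as np

def ensemble_selection(predictions, y_true, loss_fn, ensemble_size=5):
    """Greedy ensemble selection with a pluggable loss.

    predictions: list of per-model prediction arrays, each shape (n_samples,).
    loss_fn: callable(y_true, y_pred) -> float, lower is better.
    Returns the list of chosen model indices (with repetition).
    """
    indices = []
    running_sum = np.zeros_like(predictions[0], dtype=float)
    for _ in range(ensemble_size):
        losses = []
        for pred in predictions:
            # Average of the current ensemble plus this candidate model
            candidate = (running_sum + pred) / (len(indices) + 1)
            losses.append(loss_fn(y_true, candidate))
        best = int(np.argmin(losses))
        indices.append(best)
        running_sum += predictions[best]
    return indices

# Any metric can be injected; here, mean squared error:
mse = lambda y, p: float(np.mean((y - p) ** 2))
```

Because the loss is a parameter, the same builder can serve both Auto-Sklearn and Auto-Pytorch metrics without merging their metric implementations.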

) -> "AbstractEnsemble":
return self

def get_identifiers_from_run_history(self) -> List[Tuple[int, int, float]]:
Collaborator

We can write the code more concisely as follows:

min_idx = np.array([run_value.cost for run_value in self.run_history.data.values()]).argmin()
min_key = list(self.run_history.data.keys())[min_idx]
min_run_value = self.run_history.data[min_key]

if min_run_value.cost < best_model_loss:
    ...
else:  # SMAC did not work properly
    raise ValueError
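The suggested pattern can be checked in isolation with a mock run history (the real SMAC `RunHistory.data` maps run keys to values carrying a `cost` attribute; the keys, costs, and the `best_model_loss` threshold below are made up for illustration):

```python
import numpy as np
from collections import OrderedDict
from types import SimpleNamespace

# Mock of SMAC-style run history data: key -> run value with a .cost attribute
run_history_data = OrderedDict([
    ("run_a", SimpleNamespace(cost=0.42)),
    ("run_b", SimpleNamespace(cost=0.17)),
    ("run_c", SimpleNamespace(cost=0.33)),
])

# Index of the lowest-cost run, then recover its key and value
min_idx = np.array([rv.cost for rv in run_history_data.values()]).argmin()
min_key = list(run_history_data.keys())[min_idx]
min_run_value = run_history_data[min_key]

best_model_loss = 0.5  # hypothetical threshold
if min_run_value.cost < best_model_loss:
    best = min_key
else:  # SMAC did not work properly
    raise ValueError("no run beat the best model loss")
```

Note the comparison must use `min_run_value.cost`, since the run value itself is an object, not a number.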

Contributor Author

But there are plenty of things to check, like whether the path exists (it might have been deleted by the ensemble builder), among others. This suggestion does not account for that. Yes, one could do a list comprehension with filtering, but then it is not much different from the actual code...

Collaborator

I will check here later (my schedule is temporarily tight... I think I can look at this over the weekend).

self.backend = backend

# Add some default values -- at least 1 model in ensemble is assumed
self.indices_ = [0]
Collaborator

Why indices_ and not _indices?
foo_ is for avoiding clashes with reserved words such as class or list.
_foo is for private variables.
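The two underscore conventions being contrasted can be shown in a short sketch (the names `register` and `EnsembleBuilder` here are illustrative, not from this PR):

```python
# `name_` (trailing underscore): used when the natural name would clash
# with a Python keyword or builtin, e.g. `class` or `list`.
def register(class_):
    # `class` is a keyword, so the parameter is spelled `class_`
    return class_.__name__

# `_name` (leading underscore): signals a private attribute by convention.
class EnsembleBuilder:
    def __init__(self):
        self._indices = [0]  # internal state, not part of the public API
```

(For context, scikit-learn also uses a trailing underscore for attributes set during `fit`, which may be where `indices_` came from; that is a separate convention from the keyword-clash one described above.)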

Contributor Author

But this is a private variable?

Collaborator

I don't know the usage of indices_, so I'm partially asking.
If you use indices_, you should document the intention.

)
s = len(ensemble)
if s > 0:
np.add(
Collaborator

This part consumes the same amount of computation as the following:

weighted_ensemble_prediction += ensemble[-1]

It is shorter and understandable.
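The equivalence the reviewer points out can be verified directly: `np.add` with `out=` writes in place, exactly like the augmented assignment. A minimal check, with made-up prediction values:

```python
import numpy as np

weighted_ensemble_prediction = np.zeros(3)
ensemble = [np.array([0.2, 0.5, 0.3]), np.array([0.1, 0.6, 0.3])]

# In-place addition via np.add with an explicit output buffer ...
np.add(weighted_ensemble_prediction, ensemble[-1],
       out=weighted_ensemble_prediction)

# ... produces the same result as the shorter augmented assignment:
expected = np.zeros(3)
expected += ensemble[-1]
```

Both forms avoid allocating a new array, so the choice is purely about readability.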
