Improve evaluation metrics and pipeline #109

afspies · 2023-03-23T12:48:53Z

This PR will contain:

Consolidation of metrics for evaluating models
Helpers for executing and logging the full suite of evaluations
Restructuring of plotting scripts
Changes to plotting function outputs (figures, axes, etc.) to give a unified interface with wandb logging
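
As a rough illustration of what a helper for executing and logging the full suite might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the names `run_eval_suite`, `exact_match`, `node_overlap`, and `METRICS` are hypothetical and are not taken from this PR or from `maze_transformer`, and the two metrics are placeholders rather than the consolidated metrics themselves. The logger is passed in as a plain callable so that something like `wandb.log`, which accepts a dict of values, could be swapped in for `print`.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

Coord = Tuple[int, int]
Path = Sequence[Coord]
Metric = Callable[[Path, Path], float]


def exact_match(predicted: Path, target: Path) -> float:
    """Placeholder metric: 1.0 if the two paths are identical, else 0.0."""
    return float(list(predicted) == list(target))


def node_overlap(predicted: Path, target: Path) -> float:
    """Placeholder metric: fraction of distinct target nodes visited by the prediction."""
    target_nodes = set(target)
    if not target_nodes:
        return 0.0
    return len(set(predicted) & target_nodes) / len(target_nodes)


# Hypothetical registry: one place to list every metric in the suite.
METRICS: Dict[str, Metric] = {
    "exact_match": exact_match,
    "node_overlap": node_overlap,
}


def run_eval_suite(
    predictions: Sequence[Path],
    targets: Sequence[Path],
    metrics: Dict[str, Metric] = METRICS,
    log_fn: Callable[[Dict[str, float]], None] = print,
) -> Dict[str, float]:
    """Average every registered metric over the dataset, log once, return the results."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("cannot evaluate an empty dataset")
    results = {
        f"eval/{name}": mean(fn(p, t) for p, t in zip(predictions, targets))
        for name, fn in metrics.items()
    }
    log_fn(results)
    return results
```

Keeping the metrics in a single registry means a new metric only has to be added in one place, and injecting `log_fn` keeps the helper usable in tests without a logging backend.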

This PR will NOT contain (required for proper integration):

Integration of the evaluation pipeline into the current training script (awaiting further pushes from @rusheb)
Proper command-line overriding of config parameters, which is required for sweeping and only partially accounted for in the evaluation pipeline (I will do this manually in my own training sweep script for the time being)
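
For context on the second item, command-line overriding of config parameters could be sketched roughly as below. This is only an illustration under stated assumptions: the `key=value` syntax with dotted keys, the functions `parse_override` and `apply_overrides`, and the example config fields are all hypothetical and do not describe the project's actual config classes or sweep script.

```python
import argparse
import ast
from typing import Any, Dict, List, Optional, Tuple


def parse_override(text: str) -> Tuple[str, Any]:
    """Parse one 'dotted.key=value' argument into a (key, value) pair."""
    key, sep, raw = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected key=value, got {text!r}")
    try:
        # Interpret numbers, booleans, lists, etc. as Python literals.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Anything that is not a literal is kept as a plain string.
        value = raw
    return key, value


def apply_overrides(
    config: Dict[str, Any], overrides: List[Tuple[str, Any]]
) -> Dict[str, Any]:
    """Apply overrides in place to a nested dict, rejecting unknown keys."""
    for dotted_key, value in overrides:
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node[part]  # raises KeyError for an unknown section
        if not isinstance(node, dict) or leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        node[leaf] = value
    return config


def main(argv: Optional[List[str]] = None) -> Dict[str, Any]:
    parser = argparse.ArgumentParser(description="Hypothetical sweep entry point")
    parser.add_argument("overrides", nargs="*", type=parse_override)
    args = parser.parse_args(argv)
    # Hypothetical default config, for illustration only.
    config = {"train": {"lr": 1e-3, "batch_size": 32}, "name": "baseline"}
    return apply_overrides(config, args.overrides)
```

A sweep could then invoke the script with arguments such as `train.lr=0.01 name=sweep_a`. Rejecting unknown keys is a deliberate choice in this sketch so that a typo in a sweep definition fails loudly instead of silently running with the default value.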

All contents of maze_transformer/evaluation will be affected. Only minor changes to imports and to the usage of plot.show are expected in maze_transformer/evaluation/eval_model.py and notebooks/eval_model.ipynb.

mivanit · 2024-07-26T17:28:04Z

stale, some evals rework will happen in #207

init branch

dd9f227

mivanit closed this Jul 26, 2024
