-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify Configs #10
Comments
working on this issue in |
I'd like to merge the dev branch as soon as possible to unblock other issues. I'll use this comment to keep track of tasks which we should do, but which can be deferred until after merging. We should review and create an issue for each of these upon closing this ticket.
|
I have updated the branch to enable train_model to run with the updated configs design and HookedTransformer. notes:
|
Addressed by #20 |
tl;dr: model/training/dataset configuration is a mess and needs to be reworked
Currently, we have 3 different configurations:
the problem we've run into so far is that its easy to forget to adjust the vocab size in the model config, which should be inherited somehow from the dataset config. Moving forward, we also want to consider that we want to account for:
my naive instinct is that the structure of configs should be something like the following:
things like vocab size could be properties made to inherit from the
DatasetConfig
, and thus raise exceptions if the required config is not provided, rather than silently give the wrong valuesThe text was updated successfully, but these errors were encountered: