-
Notifications
You must be signed in to change notification settings - Fork 505
Open
Labels
documentationImprovements or additions to documentationImprovements or additions to documentationgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is needed
Description
Make this table better and cover key info for model architecture - whether it uses parallel attn & MLPs, and what positional embedding it is.
Add text at the bottom documenting the models more qualitatively, can basically copy this glossary: https://docs.google.com/document/d/1WONBzNqfKIxERejrrPlQMyKqg7jSFW92x5UMXNrMdPo/edit#heading=h.chq47zvs9cii
I'd want to add a separate table with training info: include training dataset, number of tokens, whether they were trained with dropout, whether they have checkpoints, whether trained with weight decay.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
documentationImprovements or additions to documentationImprovements or additions to documentationgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is needed