Here are some issues with gpt-tfjs I noted while implementing tokenization:
- There is a memory leak in the training loop. The memory footprint doesn't grow much (~0.01 MB per iteration), but the number of tracked tensors keeps growing (+14 new tensors allocated per dataset batch)
- A trained model (e.g. on wikitext with >1000 iterations and validation perplexity < 4) almost always repeats the last token of the prompt
- Create a test case for the wikitext task