gpt-tfjs only repeats the last prompt token

Here are some issues with gpt-tfjs I noted while implementing tokenization:
- [x] There is a memory leak in the training loop. Memory usage grows only slightly (~0.01 MB per iteration), but the number of tensors keeps growing (+14 new tensors allocated per dataset batch).
- [x] A trained model (e.g. on wikitext with more than 1000 iterations and validation perplexity < 4) almost always repeats the last prompt token.
- [x] Create a test case for the wikitext task
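
For the test case, here is a rough sketch of two checks that could catch both problems above. Everything here is hypothetical and not part of gpt-tfjs: `repeatsLastPromptToken` and `tensorGrowthPerStep` are names made up for illustration, token IDs are assumed to be plain numbers, and the tensor counter is passed in as a callback so the snippet runs without TensorFlow.js (with TensorFlow.js, `() => tf.memory().numTensors` could probably be passed as the counter).

```typescript
// Hypothetical helpers for a regression test; not part of gpt-tfjs.

// Returns true when every generated token equals the last prompt token,
// i.e. the degenerate behaviour described in this issue.
function repeatsLastPromptToken(prompt: number[], generated: number[]): boolean {
  if (prompt.length === 0 || generated.length === 0) return false;
  const last = prompt[prompt.length - 1];
  return generated.every((token) => token === last);
}

// Runs `step` `iterations` times and returns the average number of
// tensors that stay allocated per step, as reported by `countTensors`.
// A leak-free training step should give 0.
function tensorGrowthPerStep(
  countTensors: () => number,
  step: () => void,
  iterations: number,
): number {
  const before = countTensors();
  for (let i = 0; i < iterations; i++) {
    step();
  }
  return (countTensors() - before) / iterations;
}

// Usage with made-up data: a fake counter that leaks 14 tensors per step.
let liveTensors = 0;
const leakyStep = (): void => {
  liveTensors += 14;
};

const degenerate = repeatsLastPromptToken([5, 9, 2], [2, 2, 2]); // true
const healthy = repeatsLastPromptToken([5, 9, 2], [7, 2, 4]); // false
const growth = tensorGrowthPerStep(() => liveTensors, leakyStep, 10); // 14
```

A test on the wikitext task could then assert that `repeatsLastPromptToken` is false for a few prompts and that the tensor growth per batch is 0.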

gpt-tfjs only repeats the last prompt token #656
