gpt-tfjs only repeats the last prompt token #656

@JulienVig

Here are some issues with gpt-tfjs I noted while implementing tokenization:

  • There is a memory leak in the training loop. Memory usage doesn't grow much (~0.01 MB per iteration), but the number of tensors keeps growing (+14 new tensors allocated per dataset batch)
  • A trained model (e.g. trained on wikitext for more than 1000 iterations, with validation perplexity below 4) almost always repeats the last prompt token
  • Create a test case for the wikitext task
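
A test case for the points above might start from two small checks. The sketch below is only an illustration: `tensorGrowth` and `onlyRepeatsLastPromptToken` are hypothetical helper names, not part of gpt-tfjs or Disco.js, and the tensor counts are assumed to be sampled by the caller (in TensorFlow.js, `tf.memory().numTensors` reports the number of live tensors).

```typescript
// Hypothetical helpers for illustration; neither exists in gpt-tfjs or Disco.js.

// Given live-tensor counts sampled after each batch (for example from
// tf.memory().numTensors), return how much the count grew between
// consecutive samples. A loop that releases its tensors should yield zeros.
function tensorGrowth(counts: number[]): number[] {
  return counts.slice(1).map((count, i) => count - counts[i]);
}

// Return true when every generated token equals the last prompt token,
// i.e. the degenerate output described in this issue.
function onlyRepeatsLastPromptToken(
  prompt: number[],
  generated: number[],
): boolean {
  if (prompt.length === 0 || generated.length === 0) return false;
  const last = prompt[prompt.length - 1];
  return generated.every((token) => token === last);
}
```

With these, a test could assert that every entry of `tensorGrowth(samples)` is 0 after a warm-up batch, and that `onlyRepeatsLastPromptToken(promptIds, generatedIds)` is false for a trained model. The token ids are assumed to be plain numbers produced by the tokenizer.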

Labels

bug (Something isn't working), discojs (Related to Disco.js)
