log-y/llm-demo

llm-demo

Clone the repo, install the dependencies listed in requirements.txt, then run 'python manage.py runserver'.

A brief summary:

  • 100 million weights/biases
  • Trained on 35 GB of text data (pulled from Common Crawl)
  • Cost $15-20 to train (using cloud resources)
  • Trained for 1 epoch (unfortunately I ran out of the money I was willing to spend)
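To give a feel for where a figure like 100 million weights/biases comes from, here is a rough sketch that counts the parameters of a GPT-style decoder. The layer count, embedding width, vocabulary size, and context length below are hypothetical values chosen only to land near 100 million. They are not this model's actual configuration, and the layer layout (learned position embeddings, tied output layer, two LayerNorms per block) is likewise an assumption.

```python
def count_transformer_params(n_layers, d_model, vocab_size, context_len, d_ff=None):
    """Approximate weight + bias count for a GPT-style decoder.

    Assumes learned position embeddings and an output layer that
    shares (ties) its weights with the token embedding table.
    """
    d_ff = d_ff or 4 * d_model  # common convention: MLP is 4x wider than the embedding

    # Token embedding table plus learned position embeddings
    embeddings = vocab_size * d_model + context_len * d_model

    # Attention: Q, K, V and output projections, each d_model x d_model with a bias
    attention = 4 * (d_model * d_model + d_model)

    # MLP: expand to d_ff, then project back to d_model (both with biases)
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two LayerNorms per block, each with a scale and a shift vector
    norms = 2 * 2 * d_model

    per_layer = attention + mlp + norms
    final_norm = 2 * d_model
    return embeddings + n_layers * per_layer + final_norm


# Hypothetical configuration, NOT taken from this repository
total = count_transformer_params(n_layers=12, d_model=768, vocab_size=16_000, context_len=512)
print(f"{total / 1e6:.1f}M parameters")
```

Most of the count sits in the per-layer attention and MLP matrices, so changing the vocabulary size moves the total far less than changing the width or depth.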

Things I learned:

  • I learned the basics of the transformer architecture: how self-attention scores the other tokens for relevance to the current one, how the MLP draws connections between vector embeddings, etc.
  • I learned how to train models on cloud machines, connecting to them with SSH keys
  • I also gained a lot of experience coding in Python and using libraries that compile to C/C++ for speed
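As an illustration of the self-attention idea mentioned above, here is a minimal single-head sketch in plain Python. It is not the code used in this repo (which would normally rely on an optimized tensor library). The function names are made up for the example, and the causal mask (each token only looks at itself and earlier tokens) is assumed because that is the usual setup for next-token prediction.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    queries, keys, values: one vector (list of floats) per token.
    Returns (outputs, weights), where weights[i] holds the relevance
    token i assigns to tokens 0..i.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Score token i against itself and every earlier token only
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the relevance-weighted average of the value vectors
        out = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: two tokens with 2-dimensional vectors
x = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = causal_self_attention(x, x, x)
print(weights)
```

In a real model the queries, keys, and values come from three separate learned projections of the token embeddings; the toy example reuses the same vectors for all three to keep the sketch short.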

Here are some pictures of what it does:

  • Enter a question: [image: l1]
  • Get the response: [image: l2]
  • Look at the predicted next tokens for every step: [image: l3]
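The "predicted next tokens for every step" view can be approximated as follows: at each step the model produces one score (logit) per vocabulary entry, and the highest-probability entries are shown. This is a sketch under that assumption, not the demo's actual code. The function name, the tiny vocabulary, and the logit values are all invented for illustration.

```python
import math


def top_k_next_tokens(logits, vocab, k=3):
    """Return the k most likely next tokens with their probabilities.

    logits: one raw score per vocabulary entry for the current step.
    vocab:  the token strings, in the same order as logits.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort (token, probability) pairs from most to least likely
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up vocabulary and scores for a single generation step
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 0.5, 1.0, -1.0]
for token, prob in top_k_next_tokens(logits, vocab, k=2):
    print(f"{token!r}: {prob:.2f}")
```

Repeating this for every generated position gives a per-step table of candidate tokens.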

Note: I uploaded the entire weights/parameters file using Git LFS, which only allows about two clones of the file per month before the 1 GB limit is exceeded. If you have trouble cloning and running the repository, it may be because too many other people have already downloaded it that month. This is also why I am having trouble deploying to Vercel (Vercel needs to clone the repository to host it).
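The two-clones figure is consistent with some back-of-the-envelope arithmetic. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each) and that the limit is 1 GB counted as 10^9 bytes. Both are assumptions; the actual file size and storage format are not stated here.

```python
def clones_within_limit(n_params, bytes_per_param, limit_bytes):
    """How many full downloads of the weights file fit inside the limit."""
    file_bytes = n_params * bytes_per_param
    return limit_bytes // file_bytes, file_bytes


# Assumed values: 100 million parameters, float32 storage, 1 GB = 10**9 bytes
clones, file_bytes = clones_within_limit(
    n_params=100_000_000,
    bytes_per_param=4,
    limit_bytes=10**9,
)
print(f"weights file: about {file_bytes / 1e6:.0f} MB, full clones within the limit: {clones}")
```

Under these assumptions a third clone would push the total past the limit, which matches the behavior described in the note.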
