This repository contains datasets and resources for training and fine-tuning AI models for llama-related text generation. We use TextGenWebUI, Hugging Face's Transformers library, and the LoRA (Low-Rank Adaptation) fine-tuning technique.
- This dataset contains a large collection of llama-related text data. It's designed to be used for training large AI models, enabling them to generate llama-themed content with a broad vocabulary.
- This dataset is a smaller subset of `reviews_large.json`. It's suitable for quick experimentation and testing of models. Use this dataset if you want to iterate rapidly during development.
- This dataset is an extension of `reviews_large.json` and includes additional data related to alpacas, which can be useful for creating AI models that generate content about both llamas and alpacas.
- Similar to `reviews_small.json`, this dataset is a compact version of `reviews_large_alpaca.json`. It's ideal for quick prototyping and experimenting with models that incorporate both llama and alpaca content.
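For quick inspection of any of the dataset files above, a minimal sketch using only the Python standard library might look like the following. The record schema is not documented here, so the code assumes only that each file holds valid JSON whose top-level value is an array of records; the helper names `load_records` and `sample_records` are hypothetical and not part of this repository.

```python
import json
import random
from pathlib import Path


def load_records(path):
    """Load a JSON file and return its contents as a list of records."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top-level value is a JSON array.
    # A single top-level object is wrapped in a list so callers always get a list.
    return data if isinstance(data, list) else [data]


def sample_records(records, k, seed=0):
    """Return up to k records, chosen reproducibly, for fast experiments."""
    rng = random.Random(seed)  # fixed seed so repeated runs pick the same records
    return rng.sample(records, min(k, len(records)))
```

For example, `sample_records(load_records("reviews_small.json"), 5)` would return up to five records to eyeball before starting a fine-tuning run, assuming the file is in the current working directory.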
- For more information on TextGenWebUI, please visit their official documentation.
- To explore Hugging Face's Transformers library and discover pre-trained models, check out their GitHub repository.