This project provides a complete pipeline for fine-tuning language models and converting them to GGUF format for efficient inference.
The pipeline includes the following steps:
- Environment setup with Python virtual environment
- Dependency installation
- Dataset generation
- Model fine-tuning with LoRA adapters
- Merging LoRA adapters with the base model
- Converting the merged model to GGUF format
- Analyzing training arguments
To use this project, you will need:

- Python 3.8 or higher
- pip (Python package installer)
- Git
Clone the repository:

```shell
git clone <repository-url>
cd hyperlane-ai-training
```

Execute the main script to run the complete pipeline:

```shell
./run.sh
```

This will:
- Create and activate a Python virtual environment
- Install all required dependencies
- Generate the dataset
- Fine-tune the model
- Merge the LoRA adapter with the base model
- Convert the merged model to GGUF format
- Analyze training arguments
For faster iteration during development, you can run the pipeline in development mode, which limits the number of training steps:

```shell
./run.sh dev
```

The project can be configured using a `.env` file in the root directory. The following environment variables are available:

- `MERGED_MODEL_DIR`: Directory for the merged model (default: `merged_model`)
- `OUTPUT_DIR`: Directory for the output files (default: `output`)
Example `.env` file:

```shell
MERGED_MODEL_DIR=my_merged_model
OUTPUT_DIR=my_output
```
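The way the scripts consume these variables can be sketched with the standard library alone; `get_config` below is a hypothetical helper (the actual scripts load the `.env` file via the `dotenv` package before reading the environment):

```python
import os

# Hypothetical helper illustrating the configuration lookup:
# environment variables win, otherwise the documented defaults apply.
def get_config() -> dict:
    return {
        "merged_model_dir": os.environ.get("MERGED_MODEL_DIR", "merged_model"),
        "output_dir": os.environ.get("OUTPUT_DIR", "output"),
    }

os.environ["MERGED_MODEL_DIR"] = "my_merged_model"
print(get_config()["merged_model_dir"])  # my_merged_model
```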
The project consists of the following files and directories:

- `run.sh`: Main execution script
- `generate_markdown.py`: Script to generate the training dataset
- `finetune.py`: Model fine-tuning script
- `merge_model.py`: Script to merge LoRA adapters with the base model
- `convert_hf_to_gguf.py`: Script to convert models to GGUF format
- `analyze_training_args.py`: Script to analyze and log training arguments
- `dataset/`: Directory containing the training dataset
The project requires the following Python packages:
- torch (>=2.3.0)
- transformers
- datasets
- trl
- peft
- accelerate
- hf_xet
- gguf
- mistral_common
- dotenv
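The list above maps onto a `requirements.txt` roughly as follows (only the torch lower bound is stated above; leaving the remaining entries unpinned is an assumption):

```
torch>=2.3.0
transformers
datasets
trl
peft
accelerate
hf_xet
gguf
mistral_common
dotenv
```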
After successful execution, the final GGUF model will be located at `$OUTPUT_DIR/$OUTPUT_DIR.gguf`.
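The naming convention (the GGUF file is named after the output directory itself) can be sketched with the standard library; `final_gguf_path` is a hypothetical helper, not part of the project's scripts:

```python
from pathlib import Path

# Hypothetical reconstruction of the final artifact path:
# $OUTPUT_DIR/$OUTPUT_DIR.gguf, i.e. the file is named after its directory.
def final_gguf_path(output_dir: str) -> Path:
    return Path(output_dir) / f"{output_dir}.gguf"

print(final_gguf_path("output"))     # output/output.gguf on POSIX
print(final_gguf_path("my_output"))  # my_output/my_output.gguf on POSIX
```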
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please open an issue or submit a pull request.
For any inquiries, please reach out to the author at root@ltpp.vip.