Applying Parameter-Efficient Fine-Tuning (PEFT) to a Large Language Model (LLM)

This example project shows how to fine-tune a Large Language Model using the PEFT library from HuggingFace.

The HuggingFace transformers library, in combination with peft, makes it straightforward to fine-tune LLMs for specific tasks. This small project uses both libraries end-to-end to perform a text classification task.

Specifically:

We use the ag_news dataset, which consists of 120k news texts, each labeled with one of four topics: 'World', 'Sports', 'Business', 'Sci/Tech'.
The DistilBERT model is fine-tuned for the news classification task. Low-Rank Adaptation (LoRA), provided by the peft library, is used to speed up the fine-tuning by training only a small set of additional parameters.
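
To give a feel for why LoRA reduces the training cost, here is a minimal back-of-the-envelope sketch. It assumes a single 768 x 768 weight matrix (768 is DistilBERT's hidden size) and a LoRA rank of 8; the rank is an illustrative assumption, not a value taken from the notebook. Instead of updating the full matrix W, LoRA trains two small matrices A (r x d_in) and B (d_out x r) whose product is added to the frozen W.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, r: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in          # parameters trained when W is updated directly
    lora = r * d_in + d_out * r  # parameters in A (r x d_in) plus B (d_out x r)
    return full, lora


# Hidden size 768 matches DistilBERT; rank 8 is an assumed, illustrative value.
full, lora = lora_param_counts(d_out=768, d_in=768, r=8)
print(full, lora, f"{lora / full:.2%}")  # 589824 12288 2.08%
```

Under these assumptions, each adapted matrix needs roughly 2% of the trainable parameters that full fine-tuning would require.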

The underlying LLM is abstracted and easily handled thanks to the transformers library; the user only needs to understand basic concepts such as:

Tokenization of text sequences
Embedding vectors of tokens and associated dimensions
The motivation and usage of the encoder and decoder modules in LLMs
Task-specific heads, such as classification
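
The concepts above can be sketched with a toy pipeline in plain Python. This is only an illustration under heavy simplifications: the whitespace tokenizer, the tiny vocabulary, the random embedding table and the mean pooling are all hypothetical stand-ins. DistilBERT itself uses a WordPiece subword tokenizer, runs the embeddings through several transformer encoder layers, and its classification head in transformers reads the hidden state of the first token.

```python
import math
import random

LABELS = ["World", "Sports", "Business", "Sci/Tech"]  # ag_news label order


def tokenize(text, vocab):
    """Toy whitespace tokenizer: map each lowercased word to an id (0 = unknown)."""
    return [vocab.get(word, 0) for word in text.lower().split()]


def embed(token_ids, table):
    """Look up one embedding vector per token id."""
    return [table[i] for i in token_ids]


def mean_pool(vectors):
    """Average the token vectors into a single sequence vector."""
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]


def classify(features, weights, bias):
    """Linear classification head followed by a numerically stable softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
vocab = {"[UNK]": 0, "stocks": 1, "rally": 2, "team": 3, "wins": 4}  # hypothetical
dim = 8  # toy embedding dimension (DistilBERT uses 768)
table = [[random.uniform(-1, 1) for _ in range(dim)] for _ in vocab]
weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in LABELS]
bias = [0.0] * len(LABELS)

ids = tokenize("Stocks rally today", vocab)  # "today" is not in the vocabulary
probs = classify(mean_pool(embed(ids, table)), weights, bias)
print(ids, dict(zip(LABELS, (round(p, 3) for p in probs))))
```

Because the weights are random, the predicted probabilities are meaningless; fine-tuning is what adjusts the head (and, with LoRA, the adapter matrices) so that the probabilities match the labels.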

For a primer on those topics, see the Interesting Links section.

Setup

A recipe to set up a conda environment with the required dependencies:

# Create the necessary Python environment
conda env create -f conda.yaml
conda activate peft

# Compile and install all dependencies
pip-compile requirements.in
pip-sync requirements.txt

# If we need a new dependency,
# add it to requirements.in
# and then:
pip-compile requirements.in
pip-sync requirements.txt

Notebook

The notebook llm_peft.ipynb contains all the code and explanations necessary to perform the aforementioned fine-tuning.
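
As a rough orientation, the core of such a setup might look like the sketch below. It is not the notebook's code: the checkpoint name distilbert-base-uncased, the LoRA hyperparameters and the helper name build_lora_classifier are assumptions chosen for illustration. The q_lin and v_lin entries refer to the query and value projections in DistilBERT's attention layers.

```python
# ag_news label ids and names.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

# Illustrative hyperparameters (assumptions, not taken from the notebook).
LORA_SETTINGS = {
    "r": 8,                                # rank of the low-rank update
    "lora_alpha": 16,                      # scaling factor for the update
    "lora_dropout": 0.1,                   # dropout applied on the LoRA path
    "target_modules": ["q_lin", "v_lin"],  # DistilBERT query/value projections
}


def build_lora_classifier(checkpoint: str = "distilbert-base-uncased"):
    """Hypothetical helper: load DistilBERT with a 4-class head and wrap it with LoRA."""
    # Imported lazily so the constants above can be inspected without
    # having transformers and peft installed.
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForSequenceClassification

    base = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    config = LoraConfig(task_type=TaskType.SEQ_CLS, **LORA_SETTINGS)
    return get_peft_model(base, config)


# Usage (downloads the checkpoint, so it is left commented out here):
# model = build_lora_classifier()
# model.print_trainable_parameters()
```

The returned model can then be passed to a regular transformers training loop; only the LoRA matrices and the classification head are updated, while the base weights stay frozen.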

Interesting Links

My personal notes on the O'Reilly book Generative Deep Learning, 2nd Edition, by David Foster
My personal notes on the O'Reilly book Natural Language Processing with Transformers, by Lewis Tunstall, Leandro von Werra and Thomas Wolf
My personal notes and guide for the Generative AI Nanodegree from Udacity
HuggingFace Guide: mxagar/tool_guides/hugging_face
LangChain Guide: mxagar/tool_guides/langchain
LLM Tools: mxagar/tool_guides/llms
NLP Guide: mxagar/nlp_guide
Deep Learning Methods for CV and NLP: mxagar/computer_vision_udacity/CVND_Advanced_CV_and_DL.md
Deep Learning Methods for NLP: mxagar/deep_learning_udacity/DLND_RNNs.md
