This repository contains the implementation for the paper [Vocabulary-level Memory Efficiency for Language Model Fine-tuning](https://arxiv.org/abs/2309.08708).
Partial Embedding Matrix Adaptation is a simple technique that can reduce the memory footprint of language model fine-tuning without impacting performance.
The package can be installed directly from GitHub:

```bash
pip install git+https://github.com/mlsw/partial-embedding-matrix-adaptation.git
```
There is a high-level API for Hugging Face Transformers PyTorch models via the `HFEmbeddingPruner` class:
```python
from partial_embedding_matrix_adaptation import HFEmbeddingPruner

embedding_pruner = HFEmbeddingPruner(model)
dataset, _ = embedding_pruner.prepare_model(tokenizer, dataset)
```

Please see `examples/distilbert_sst2.py` for a complete example. Additionally, the scripts in the `utils` directory show how to use this API with the Hugging Face Transformers `Trainer`.
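As a rough end-to-end sketch, the snippet below shows how this might slot into a typical Hugging Face fine-tuning setup. The checkpoint name, dataset, and surrounding setup are illustrative assumptions (loosely mirroring `examples/distilbert_sst2.py`); only the `HFEmbeddingPruner` calls are taken from the API above.

```python
# Rough sketch only: the checkpoint, dataset, and setup below are illustrative
# assumptions; see examples/distilbert_sst2.py for the actual workflow.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from partial_embedding_matrix_adaptation import HFEmbeddingPruner

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
dataset = load_dataset("glue", "sst2")

# Prune the embedding matrix down to the vocabulary used by the dataset,
# then fine-tune with the pruned model and returned dataset as usual.
embedding_pruner = HFEmbeddingPruner(model)
dataset, _ = embedding_pruner.prepare_model(tokenizer, dataset)
```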
Alternatively, the `EmbeddingPruner` class can be used directly for PyTorch models. Please see `HFEmbeddingPruner` for an example of how to use this.
The following scripts can be used to reproduce the results from the paper. They are adapted from the Hugging Face Transformers PyTorch examples, with support added for Partial Embedding Matrix Adaptation.
| Task | Script | Documentation |
|------|--------|---------------|
| GLUE | `run_glue_pema.py` | Here |
| XNLI | `run_xnli_pema.py` | Here |
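For reference, an invocation might look like the sketch below, assuming `run_glue_pema.py` retains the command-line interface of the upstream `run_glue.py` example; any options specific to Partial Embedding Matrix Adaptation are not shown here and are documented in the script itself.

```bash
# Sketch: arguments mirror the upstream run_glue.py example and are assumptions
# for this repository; PEMA-specific options are documented in the script.
python run_glue_pema.py \
  --model_name_or_path distilbert-base-uncased \
  --task_name sst2 \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir ./output/sst2
```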
This project is licensed under the terms of the MIT license. Please see LICENSE for more details.
If you found this work useful, please consider citing our paper:
```bibtex
@misc{williams-aletras-2025-vocabulary,
  title={Vocabulary-level Memory Efficiency for Language Model Fine-tuning},
  author={Miles Williams and Nikolaos Aletras},
  year={2025},
  eprint={2309.08708},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2309.08708}
}
```