
FLoRIST: Singular Value Thresholding for Efficient and Accurate Federated Fine-Tuning of Large Language Models

This repository contains the official implementation of FLoRIST, a framework for communication-efficient and performance-preserving federated fine-tuning of large language models (LLMs) using low-rank adapters and singular value thresholding. FLoRIST has been submitted to NeurIPS 2025.

In federated learning settings, where training data remains decentralized across clients (e.g., institutions or devices), fine-tuning large models becomes challenging due to communication and computational constraints. Parameter-efficient fine-tuning (PEFT) methods such as LoRA allow clients to train compact low-rank adapters locally, but aggregating these adapters efficiently and effectively remains an open problem—especially under heterogeneous client configurations.

FLoRIST addresses this by:

  • Aggregating directly in the low-rank latent space, avoiding the construction of full dense update matrices.
  • Applying singular value thresholding to retain only the most informative components, enabling compact and performant global adapters (a short sketch of this aggregation step follows this list).
  • Supporting heterogeneous local ranks across clients without requiring full-rank communication or complex coordination.
  • Introducing two variants: FLoRIST-O for optimal performance, and FLoRIST-E for maximum communication efficiency.
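The aggregation described above can be outlined in a few lines. The sketch below is illustrative only and is not the implementation in main.py: it assumes each client k uploads LoRA factors B_k (d_out × r_k) and A_k (r_k × d_in) together with an aggregation weight, and it interprets the threshold τ as the fraction of cumulative singular-value energy to retain; the function name aggregate_with_svt is invented for this example.

import numpy as np

def aggregate_with_svt(B_list, A_list, weights, tau=0.9):
    """Illustrative latent-space aggregation with singular value thresholding.

    B_list[k]: client k's LoRA factor, shape (d_out, r_k)
    A_list[k]: client k's LoRA factor, shape (r_k, d_in)
    weights[k]: aggregation weight for client k (e.g., proportional to data size)
    tau: cumulative-energy threshold in (0, 1]
    """
    # Stack the client factors; B_stack @ A_stack equals the weighted sum
    # sum_k w_k * B_k @ A_k, but the dense d_out x d_in matrix is never formed.
    B_stack = np.hstack([w * B for w, B in zip(weights, B_list)])  # (d_out, R)
    A_stack = np.vstack(A_list)                                    # (R, d_in)

    # SVDs of the two thin stacks (cheap: R = sum of client ranks is small).
    U_b, S_b, Vt_b = np.linalg.svd(B_stack, full_matrices=False)
    U_a, S_a, Vt_a = np.linalg.svd(A_stack, full_matrices=False)

    # Small core matrix whose singular values are those of the full update.
    core = (np.diag(S_b) @ Vt_b) @ (U_a @ np.diag(S_a))
    U_c, S_c, Vt_c = np.linalg.svd(core, full_matrices=False)

    # Smallest rank whose singular values capture a tau fraction of the energy.
    energy = np.cumsum(S_c ** 2) / np.sum(S_c ** 2)
    r_new = int(np.searchsorted(energy, tau)) + 1

    # Truncated global factors, redistributed to clients at the reduced rank.
    B_global = U_b @ U_c[:, :r_new] @ np.diag(S_c[:r_new])  # (d_out, r_new)
    A_global = Vt_c[:r_new, :] @ Vt_a                        # (r_new, d_in)
    return B_global, A_global

Because the SVDs act only on the thin stacked factors and on a small core matrix whose side equals the sum of client ranks, the server never materializes the dense weight update, and the rank of the returned global adapter adapts to the chosen threshold.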

FLoRIST outperforms state-of-the-art baselines such as FedIT, FLoRA, FlexLoRA, and FFA-LoRA across multiple datasets (Dolly, Alpaca, WizardLM) and model scales (TinyLlama, Llama-3.2-1B, Llama-7B), achieving lower communication cost while matching or exceeding their accuracy.

Figure: FLoRIST workflow overview.

Requirements

To install the necessary dependencies, run:

pip install -r requirements.txt

Datasets

We follow the same data format and directory structure as used in the original FLoRA repository. All datasets are stored in JSON format and should be placed in the appropriate folders.

Available Datasets

The experiments use the Dolly, Alpaca, and WizardLM instruction-tuning datasets; point --data_path at the corresponding folder (e.g., ./data or ./data_wiz in the commands below).

If you want to use a custom dataset, make sure it follows the same instruction-response JSON format as these folders. Each sample should contain fields such as "instruction", "input" (optional), and "output".
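For a concrete illustration of this format, the snippet below writes a tiny two-sample dataset to ./data; the filename custom_dataset.json and the sample records are placeholders for this example and are not shipped with the repository.

import json
import os

# Hypothetical two-sample dataset in the instruction-response format
# described above ("instruction", optional "input", "output").
samples = [
    {
        "instruction": "Summarize the following paragraph.",
        "input": "Federated learning keeps training data on client devices ...",
        "output": "Federated learning trains models without centralizing data.",
    },
    {
        "instruction": "Name a parameter-efficient fine-tuning method for LLMs.",
        "input": "",
        "output": "LoRA trains low-rank adapter matrices instead of full weights.",
    },
]

os.makedirs("./data", exist_ok=True)
with open("./data/custom_dataset.json", "w") as f:
    json.dump(samples, f, indent=2)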

Training

To train a model in a homogeneous federated setting:

FLoRIST

python3 main.py --global_model 'tinyllama' \
  --data_path "./data" \
  --output_dir './tinyllama-dolly-homo-1-3-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --florist True \
  --num_clients 8 \
  --threshold 0.9

FLoRA

python3 main.py --global_model 'tinyllama' \
  --data_path "./data" \
  --output_dir './tinyllama-dolly-homo-1-3-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --stacking True \
  --num_clients 8

FedIT

python3 main.py --global_model 'tinyllama' \
  --data_path "./data" \
  --output_dir './tinyllama-dolly-homo-1-3-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --num_clients 8

FlexLoRA

python3 main.py --global_model 'tinyllama' \
  --data_path "./data" \
  --output_dir './tinyllama-dolly-homo-1-3-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --flex True \
  --num_clients 8

FFA-LoRA

python3 main.py --global_model 'tinyllama' \
  --data_path "./data" \
  --output_dir './tinyllama-dolly-homo-1-3-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --ffa True \
  --num_clients 8

To train in a heterogeneous client-rank setup, add --heter True. For methods that do not natively support heterogeneous ranks (e.g., FedIT and FFA-LoRA), also add --zero_padding True so that client adapters are zero-padded to a common rank before aggregation.

Example:

python3 main.py --global_model 'huggyllama/llama-7b' \
  --data_path "./data_wiz" \
  --output_dir './llama7b-wiz-heter-1-1-8/' \
  --num_communication_rounds 1 \
  --local_num_epochs 3 \
  --florist True \
  --num_clients 8 \
  --threshold 0.80 \
  --heter True

Evaluation

All training runs automatically evaluate the final global model on the MMLU benchmark and report accuracy at the end.
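For reference, accuracy on MMLU-style multiple-choice questions is usually computed by scoring each answer choice with the model and selecting the highest-scoring one. The helper below only illustrates that final bookkeeping step; it is not the evaluation code invoked by main.py, and the score dictionaries are hypothetical model log-likelihoods.

from typing import Dict, List

def multiple_choice_accuracy(choice_scores: List[Dict[str, float]],
                             gold_answers: List[str]) -> float:
    """choice_scores[i] maps each choice label ('A'-'D') to the model's score
    for that choice on question i; gold_answers[i] is the correct label."""
    correct = 0
    for scores, gold in zip(choice_scores, gold_answers):
        predicted = max(scores, key=scores.get)  # highest-scoring choice
        correct += int(predicted == gold)
    return correct / len(gold_answers)

# Hypothetical usage with two questions:
scores = [{"A": -4.2, "B": -1.3, "C": -5.0, "D": -3.8},
          {"A": -2.1, "B": -2.5, "C": -0.9, "D": -4.4}]
print(multiple_choice_accuracy(scores, ["B", "A"]))  # prints 0.5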

Results

FLoRIST achieves state-of-the-art trade-offs between accuracy and communication efficiency across the evaluated model scales, datasets, and client settings (homogeneous and heterogeneous).

Results on Homogeneous Setting (MMLU Benchmark)

| Model | Method | Dolly Acc (%) | Dolly Eff. (×10⁻⁴) | Alpaca Acc (%) | Alpaca Eff. (×10⁻⁴) | Wizard Acc (%) | Wizard Eff. (×10⁻⁴) |
|---|---|---|---|---|---|---|---|
| TinyLlama | FedIT | 28.88 | 14.20 | 31.99 | 14.20 | 41.42 | 14.20 |
| TinyLlama | FLoRA | 27.48 | 1.78 | 29.09 | 1.78 | 41.99 | 1.78 |
| TinyLlama | FlexLoRA | 28.03 | 14.20 | 29.00 | 14.20 | 42.53 | 14.20 |
| TinyLlama | FFA-LoRA | 24.74 | 28.40 | 25.57 | 28.40 | 26.31 | 28.40 |
| TinyLlama | FLoRIST-O | 30.42 (τ=0.87) | 45.40 | 29.81 (τ=0.93) | 34.36 | 43.63 (τ=0.99) | 16.92 |
| TinyLlama | FLoRIST-E | 29.25 (τ=0.80) | 76.30 | 29.43 (τ=0.84) | 63.30 | 42.39 (τ=0.82) | 73.50 |
| Llama-7b | FedIT | 34.75 | 9.77 | 27.38 | 9.77 | 28.50 | 9.77 |
| Llama-7b | FLoRA | 34.38 | 1.22 | 26.34 | 1.22 | 28.50 | 1.22 |
| Llama-7b | FlexLoRA | 33.88 | 9.77 | 26.27 | 9.77 | 28.69 | 9.77 |
| Llama-7b | FFA-LoRA | 31.52 | 19.50 | 22.69 | 19.50 | 28.34 | 19.50 |
| Llama-7b | FLoRIST-O | 35.58 (τ=0.95) | 21.40 | 29.05 (τ=0.85) | 57.47 | 29.25 (τ=0.95) | 29.41 |
| Llama-7b | FLoRIST-E | 34.45 (τ=0.85) | 51.02 | 28.30 (τ=0.80) | 70.90 | 29.14 (τ=0.87) | 52.90 |
| Llama-3.2-1B | FedIT | 19.07 | 19.50 | 25.99 | 19.50 | 27.27 | 19.50 |
| Llama-3.2-1B | FLoRA | 18.97 | 2.44 | 30.34 | 2.44 | 27.48 | 2.44 |
| Llama-3.2-1B | FlexLoRA | 19.45 | 19.50 | 30.16 | 19.50 | 27.01 | 19.50 |
| Llama-3.2-1B | FFA-LoRA | 19.59 | 39.06 | 18.68 | 39.06 | 28.01 | 39.06 |
| Llama-3.2-1B | FLoRIST-O | 20.68 (τ=0.95) | 37.59 | 30.29 (τ=0.99) | 18.10 | 28.29 (τ=0.95) | 38.80 |
| Llama-3.2-1B | FLoRIST-E | 19.95 (τ=0.82) | 64.93 | 29.66 (τ=0.80) | 94.30 | 27.18 (τ=0.82) | 87.70 |

Bold = Highest, Italic = Second-Highest

See the full table in the [paper] for all datasets and baseline comparisons.

License

This repository is released under the Apache License 2.0. See LICENSE for details.

Contributing

We welcome pull requests for reproducibility enhancements, dataset loaders, and benchmarking scripts. For major changes, please open an issue first to discuss what you would like to change.
