Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

This is the official repository for reproducing the experiments in the paper that proposed Sparse-IFT. The codebase is implemented in PyTorch.

Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie

The paper introduces a family of Sparse Iso-FLOP Transformations aimed at maximizing the training efficiency (test accuracy w.r.t. training FLOPs) of deep neural networks.

Figure: Different members of the Sparse-IFT family. Each transformation is parameterized by a single hyperparameter, the sparsity level ($s$). Black and white squares denote sparse and active weights, respectively. Green blocks indicate a non-linearity (e.g., BatchNorm, ReLU, LayerNorm). All transformations are derived with sparsity set to 50% as an example, are Iso-FLOP to the dense feedforward function $f_{θ_l}$, and hence can be used as a drop-in replacement of $f_{θ_l}$. See Section 2 of the paper for more details about each member.
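To make the Iso-FLOP property concrete, the following is a minimal PyTorch sketch of the Sparse-Wide member for a single linear layer. SparseWideLinear is a hypothetical illustration, not this codebase's API, and it uses a static random mask for simplicity. Widening both dimensions by k = 1/sqrt(1 - s) while keeping a fraction s of the weights inactive leaves the FLOP count unchanged, since (k · d_in)(k · d_out)(1 - s) = d_in · d_out.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseWideLinear(nn.Module):
    # Illustrative Sparse-Wide layer (hypothetical, not the repo's API).
    # Widening both dimensions by k = 1/sqrt(1 - s) keeps FLOPs equal to
    # the dense layer: (k*d_in) * (k*d_out) * (1 - s) = d_in * d_out.
    def __init__(self, d_in, d_out, sparsity=0.5):
        super().__init__()
        k = 1.0 / math.sqrt(1.0 - sparsity)
        self.linear = nn.Linear(round(k * d_in), round(k * d_out))
        # Static random mask for illustration; dynamic sparse training
        # methods (e.g., RigL) update this mask during training instead.
        self.register_buffer(
            "mask", (torch.rand_like(self.linear.weight) >= sparsity).float()
        )

    def forward(self, x):
        # Only the active (unmasked) weights contribute to the output.
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

In the full Sparse-Wide transformation only the hidden dimensions are widened (the network's input and output sizes stay fixed), and the mask is typically trained with a dynamic sparsity method such as RigL rather than held fixed.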

Prerequisites

Before running experiments or utilizing the code in this repository, please ensure the following prerequisites are met:

  1. Set the environment variable CODE_SOURCE_DIR to the path of the source code directory. This can be achieved using the following command:

export CODE_SOURCE_DIR=/path/to/your/source/code
# e.g., export CODE_SOURCE_DIR=/Users/$USER/Documents/Sparse-IFT/ComputerVision

  2. Create a conda environment using the provided environment configuration file:

cd Sparse-IFT
conda env create -f sparseift_env.yaml

Activate the environment:

conda activate sparseift_env

Quick Start: CIFAR-100 Experiments

To run CIFAR-100/ImageNet experiments, enter the ComputerVision directory:

cd ComputerVision/

ResNet-18 on CIFAR-100

Dense Baseline
python launch_utils/prepare_job_commands.py --job-name resnet18_cifar100_dense_baseline --base-dir /path/to/experiment/directory/ --base-cfg CIFAR/configs/resnet18/base.yaml --exp-cfg CIFAR/configs/resnet18/dense_baseline.yaml --run-cifar

Sparse-IFT ResNet-18 on CIFAR-100

This section provides instructions for running CIFAR-100 experiments with the ResNet-18 model and different members of the Sparse-IFT family, trained with the dynamic sparse training algorithm RigL. RigL periodically updates the sparsity mask during training by pruning low-magnitude weights and regrowing connections where gradient magnitudes are largest (see the sketch after the commands below).

Sparse-Wide + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_cifar100_sparsewide --base-dir /path/to/experiment/directory/ --base-cfg CIFAR/configs/resnet18/base.yaml --exp-cfg CIFAR/configs/resnet18/sparseift/sparsewide_rigl.yaml --run-cifar
Sparse-Parallel + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_cifar100_sparseparallel --base-dir /path/to/experiment/directory/ --base-cfg CIFAR/configs/resnet18/base.yaml --exp-cfg CIFAR/configs/resnet18/sparseift/sparseparallel_rigl.yaml --run-cifar
Sparse-Factorized + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_cifar100_sparsefactorized --base-dir /path/to/experiment/directory/ --base-cfg CIFAR/configs/resnet18/base.yaml --exp-cfg CIFAR/configs/resnet18/sparseift/sparsefactorized_rigl.yaml --run-cifar
Sparse-Doped + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_cifar100_sparsedoped --base-dir /path/to/experiment/directory/ --base-cfg CIFAR/configs/resnet18/base.yaml --exp-cfg CIFAR/configs/resnet18/sparseift/sparsedoped_rigl.yaml --run-cifar
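For context on what RigL does at each update step, here is a minimal sketch of its drop/grow rule, assuming a fixed per-layer sparsity. rigl_update is an illustrative helper, not a function from this repository:

import torch

def rigl_update(weight, grad, mask, update_frac=0.3):
    # One RigL-style mask update (illustrative):
    # 1) drop the update_frac fraction of active weights with the smallest
    #    magnitude; 2) regrow the same number of inactive connections where
    #    the dense gradient magnitude is largest.
    n_update = int(update_frac * mask.sum().item())

    # Drop: exclude inactive weights by scoring them +inf, then take the
    # n_update smallest-magnitude active weights.
    drop_scores = torch.where(mask.bool(), weight.abs(),
                              torch.full_like(weight, float("inf")))
    drop_idx = torch.topk(drop_scores.view(-1), n_update, largest=False).indices
    mask.view(-1)[drop_idx] = 0.0

    # Grow: exclude active weights by scoring them -inf, then activate the
    # n_update inactive positions with the largest gradient magnitude.
    grow_scores = torch.where(mask.bool(),
                              torch.full_like(grad, -float("inf")),
                              grad.abs())
    grow_idx = torch.topk(grow_scores.view(-1), n_update).indices
    mask.view(-1)[grow_idx] = 1.0
    weight.data.view(-1)[grow_idx] = 0.0  # newly grown weights start at zero
    return mask

In RigL the update fraction is decayed over the course of training, so the mask changes less and eventually stabilizes toward the end of the run.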

Quick Start: ImageNet Experiments

Dense Baseline
python launch_utils/prepare_job_commands.py --job-name resnet18_imagenet_dense_baseline --base-dir /path/to/experiment/directory/ --base-cfg ImageNet/configs/resnet18/base.yaml --exp-cfg ImageNet/configs/resnet18/dense_baseline.yaml 

Sparse-IFT ResNet-18 on ImageNet

This section provides instructions for running ImageNet experiments with the ResNet-18 model and different members of the Sparse-IFT family, again trained with the dynamic sparse training algorithm RigL.

Sparse-Wide + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_imagenet_sparsewide --base-dir /path/to/experiment/directory/ --base-cfg ImageNet/configs/resnet18/base.yaml --exp-cfg ImageNet/configs/resnet18/sparseift/sparsewide_rigl.yaml 
Sparse-Parallel + RigL
python launch_utils/prepare_job_commands.py --job-name resnet18_imagenet_sparseparallel --base-dir /path/to/experiment/directory/ --base-cfg ImageNet/configs/resnet18/base.yaml --exp-cfg ImageNet/configs/resnet18/sparseift/sparseparallel_rigl.yaml 

Citation

If you find this work helpful or use the provided code in your research, please consider citing our paper:

@InProceedings{pmlr-v235-thangarasa24a,
  title     = {Sparse-{IFT}: Sparse Iso-{FLOP} Transformations for Maximizing Training Efficiency},
  author    = {Thangarasa, Vithursan and Saxena, Shreyas and Gupta, Abhay and Lie, Sean},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  year      = {2024},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  url       = {https://proceedings.mlr.press/v235/thangarasa24a.html},
}

Feel free to adapt the paths, configurations, and commands based on your specific setup.
