EfficientViT is a new family of models for efficient high-resolution vision, especially semantic segmentation. Its core building block is a lightweight multi-scale attention module that achieves a global receptive field and multi-scale learning using only hardware-efficient operations.
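The global receptive field comes from replacing softmax attention with ReLU-based linear attention, which can be computed in linear time. Below is a minimal sketch of that primitive; the function name and tensor shapes are illustrative, not the repository's implementation:

```python
import torch
import torch.nn.functional as F

def relu_linear_attention(q, k, v, eps=1e-6):
    """Global attention in O(N) time and memory.

    Replaces softmax(QK^T)V with ReLU(Q) (ReLU(K)^T V) / normalizer,
    so the N x N attention map is never materialized.
    q, k: (batch, heads, N, dim); v: (batch, heads, N, dim_v).
    """
    q = F.relu(q)
    k = F.relu(k)
    # Associativity: compute K^T V first -> (batch, heads, dim, dim_v)
    kv = torch.einsum("bhnd,bhne->bhde", k, v)
    out = torch.einsum("bhnd,bhde->bhne", q, kv)
    # Normalizer mimics attention weights summing to 1 per query.
    z = torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps
    return out / z.unsqueeze(-1)
```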
Comparisons with prior state-of-the-art semantic segmentation models, along with EfficientViT's image classification results, are reported in the tables below.
```bash
conda create -n efficientvit python=3.8.5
conda activate efficientvit
conda install pytorch=1.13.1 torchvision=0.14.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install tqdm opencv-python
```
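To sanity-check the environment (this one-liner assumes a CUDA 11.7-capable driver is installed):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```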
- ImageNet: https://www.image-net.org/
- Cityscapes: https://www.cityscapes-dataset.com/
- ADE20K: https://groups.csail.mit.edu/vision/datasets/ADE20K/
Mobile latency is measured on a Qualcomm Snapdragon 8 Gen 1 with TensorFlow Lite, fp32, batch size 1.
Model | Resolution | ImageNet Top-1 Acc (%) | ImageNet Top-5 Acc (%) | Params | MACs | Mobile Latency | Checkpoint |
---|---|---|---|---|---|---|---|
EfficientViT-B1 | 224 | 79.4 | 94.3 | 9.1M | 0.52G | 19ms | link |
EfficientViT-B1 | 256 | 79.9 | 94.7 | 9.1M | 0.68G | 24ms | link |
EfficientViT-B1 | 288 | 80.4 | 95.0 | 9.1M | 0.86G | 31ms | link |
EfficientViT-B2 | 224 | 82.1 | 95.8 | 24M | 1.6G | 55ms | link |
EfficientViT-B2 | 256 | 82.7 | 96.1 | 24M | 2.1G | 72ms | link |
EfficientViT-B2 | 288 | 83.1 | 96.3 | 24M | 2.6G | 92ms | link |
EfficientViT-B3 | 224 | 83.5 | 96.4 | 49M | 4.0G | 140ms | link |
EfficientViT-B3 | 256 | 83.8 | 96.5 | 49M | 5.2G | 180ms | link |
EfficientViT-B3 | 288 | 84.2 | 96.7 | 49M | 6.5G | 228ms | link |
Model | Resolution | Cityscapes mIoU (%) | Params | MACs | Mobile Latency | Checkpoint |
---|---|---|---|---|---|---|
EfficientViT-B0 | 960x1920 | 75.5 | 0.7M | 3.9G | 0.20s | link |
EfficientViT-B1 | 896x1792 | 80.1 | 4.8M | 19G | 0.82s | link |
EfficientViT-B2 | 1024x2048 | 82.1 | 15M | 74G | 3.1s | link |
EfficientViT-B3 | 1184x2368 | 83.2 | 40M | 240G | 10s | link |
Model | Resolution | ADE20K mIoU (%) | Params | MACs | Mobile Latency | Checkpoint |
---|---|---|---|---|---|---|
EfficientViT-B1 | 480 | 42.7 | 4.8M | 2.7G | 0.10s | link |
EfficientViT-B2 | 416 | 45.1 | 15M | 6.0G | 0.21s | link |
EfficientViT-B3 | 512 | 49.0 | 39M | 22G | 0.8s | link |
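For reference, the batch-1 TensorFlow Lite protocol noted above can be reproduced off-device with a small harness like the following. It assumes a model already converted to a hypothetical `model.tflite`; desktop numbers will not match the Snapdragon results in the tables.

```python
import time
import numpy as np
import tensorflow as tf

# Load the converted model and bind input/output buffers.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.random.rand(*inp["shape"]).astype(np.float32))

# Warm up, then time batch-1 inference.
for _ in range(10):
    interpreter.invoke()
start = time.perf_counter()
for _ in range(50):
    interpreter.invoke()
print(f"{(time.perf_counter() - start) / 50 * 1000:.1f} ms / image")
```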
```python
from models.cls_model_zoo import create_cls_model

# EfficientViT-B3 classifier, loading the 288x288 ImageNet checkpoint.
model = create_cls_model(
    name="b3",
    pretrained=True,
    weight_url="assets/checkpoints/cls/b3-r288.pt",
)
```
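A minimal inference sketch for the classifier above, assuming standard ImageNet preprocessing; the normalization constants and the example image path are assumptions, not taken from the repo:

```python
import torch
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing at the checkpoint's 288x288 resolution.
preprocess = transforms.Compose([
    transforms.Resize(288),
    transforms.CenterCrop(288),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model.eval()
with torch.no_grad():
    x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
    pred = model(x).argmax(dim=-1).item()  # predicted ImageNet class index
```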
```python
from models.seg_model_zoo import create_seg_model

# EfficientViT-B3 segmentation model with the Cityscapes checkpoint (1184x2368).
model = create_seg_model(
    name="b3",
    dataset="cityscapes",
    pretrained=True,
    weight_url="assets/checkpoints/seg/cityscapes/b3-r1184.pt",
)
```
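A minimal inference sketch for the segmentation model above; the output stride and tensor shapes are assumptions, so the upsampling step may be unnecessary depending on the head:

```python
import torch
import torch.nn.functional as F

model.eval()
with torch.no_grad():
    # Dummy normalized input at the Cityscapes resolution from the table above.
    x = torch.randn(1, 3, 1184, 2368)
    logits = model(x)  # (1, num_classes, h, w)
    # If the head predicts at a reduced stride, upsample before the argmax.
    logits = F.interpolate(logits, size=x.shape[-2:], mode="bilinear", align_corners=False)
    pred = logits.argmax(dim=1)  # (1, H, W) per-pixel class indices
```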
```python
from models.seg_model_zoo import create_seg_model

# Same API for ADE20K: switch the dataset and the matching checkpoint (512x512).
model = create_seg_model(
    name="b3",
    dataset="ade20k",
    pretrained=True,
    weight_url="assets/checkpoints/seg/ade20k/b3-r512.pt",
)
```
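As a quick sanity check after loading, the parameter count should roughly match the 39M reported in the ADE20K table above:

```python
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```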
Please run `eval_cls_model.py` or `eval_seg_model.py` to evaluate our models; example invocations are sketched below.
Examples: classification, segmentation
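For example (the segmentation command matches the demo below; the classification flags are an assumption, so check each script's `--help` for the exact interface):

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1184 --model b3-r1184
python eval_cls_model.py --model b3-r288  # flag names assumed; see --help
```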
Please run `eval_seg_model.py` to visualize the outputs of our semantic segmentation models. Example:

```bash
python eval_seg_model.py --dataset cityscapes --crop_size 1184 --model b3-r1184 --save_path demo/cityscapes/b3-r1184/
```
Han Cai: hancai@mit.edu
If EfficientViT is useful or relevant to your research, please kindly recognize our contributions by citing our paper:
```bibtex
@article{cai2022efficientvit,
  title={Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition},
  author={Cai, Han and Gan, Chuang and Han, Song},
  journal={arXiv preprint arXiv:2205.14756},
  year={2022}
}
```