Awesome-MoE


Awesome list of Mixture-of-Experts (MoE) papers.

Kindly consider giving it a star if you find this list helpful. Thanks!

News

  • [2024-06-11] - Added papers from NeurIPS 2022, ICML 2022, and ICML 2023
  • [2024-06-10] - Added papers from NeurIPS 2023, ICCV 2023, CVPR 2023, and ICLR 2023
  • [2024-05-24] - Added papers from ICLR 2024, CVPR 2024, and ICCV 2023

Papers

Papers are sorted in descending chronological order, then by conference event date.

| Venue | Key Name | Title | Code |
| --- | --- | --- | --- |
| 2024 CVPR | TC-MoA | Task-Customized Mixture of Adapters for General Image Fusion | Link |
| 2024 CVPR | MLoRE | Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Link |
| 2024 CVPR | MoE-Adapters4CL | Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Link |
| 2024 CVPR | Omni-SMoLA | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | N/A |
| 2024 ICLR | LLMCarbon | LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models | Link |
| 2024 ICLR | Soft MoE | From Sparse to Soft Mixtures of Experts | Link |
| 2024 ICLR | MC-SMoE | Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Link |
| 2024 ICLR | Mowst | Mixture of Weak and Strong Experts on Graphs | Link |
| 2024 ICLR | MoV | Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning | Link |
| 2024 ICLR | PI-HC-MoE | Scaling physics-informed hard constraints with mixture-of-experts | Link |
| 2024 ICLR | MoLE | Mixture of LoRA Experts | Link |
| 2024 ICLR | TESTAM | TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Link |
| 2024 ICLR | MOORE | Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Link |
| 2024 ICLR | HSQ | Hybrid Sharing for Multi-Label Image Classification | Link |
| 2024 ICLR | FLAN-MOE | Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models | N/A |
| 2024 ICLR | Lingual-SMoE | Sparse MoE with Language Guided Routing for Multilingual Machine Translation | Link |
| 2023 NeurIPS | ShiftAddViT | ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer | Link |
| 2023 NeurIPS | RAPHAEL | RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | N/A |
| 2023 NeurIPS | DAMEX | DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Link |
| 2023 NeurIPS | MoE-IMP | Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | N/A |
| 2023 ICCV | AdaMV-MoE | AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | Link |
| 2023 ICCV | MoE-Fusion | Multi-Modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion | Link |
| 2023 ICCV | PnD | Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts | Link |
| 2023 ICCV | TaskExpert | TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Link |
| 2023 ICCV | GNT-MOVE | Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Link |
| 2023 ICCV | ADVMoE | Robust Mixture-of-Expert Training for Convolutional Neural Networks | Link |
| 2023 ICML | pMoE | Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Link |
| 2023 CVPR | ERNIE-ViLG 2.0 | ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model With Knowledge-Enhanced Mixture-of-Denoising-Experts | N/A |
| 2023 CVPR | Mod-Squad | Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners | Link |
| 2023 ICLR | GMoE | Sparse Mixture-of-Experts are Domain Generalizable Learners | Link |
| 2023 ICLR | SMoE-Dropout | Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Link |
| 2023 ICLR | KiC | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models | N/A |
| 2023 ICLR | MoCE | Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts | N/A |
| 2023 ICLR | SCoMoE | SCoMoE: Efficient Mixtures of Experts with Structured Communication | Link |
| 2023 ICLR | Switch-NeRF | Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields | Link |
| 2022 NeurIPS | N/A | Towards Understanding the Mixture-of-Experts Layer in Deep Learning | Link |
| 2022 NeurIPS | Uni-Perceiver-MoE | Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Link |
| 2022 NeurIPS | LIMoE | Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts | N/A |
| 2022 NeurIPS | TA-MoE | TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training | N/A |
| 2022 NeurIPS | Meta-DMoE | Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Link |
| 2022 NeurIPS | M³ViT | M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Link |
| 2022 NeurIPS | SMoE | Spatial Mixture-of-Experts | Link |
| 2022 NeurIPS | MoE-NPs | Learning Expressive Meta-Representations with Mixture of Expert Neural Processes | N/A |
| 2022 NeurIPS | VLMo | VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts | Link |
| 2022 NeurIPS | X-MoE | On the Representation Collapse of Sparse Mixture of Experts | N/A |
| 2022 ICML | NID | Neural Implicit Dictionary Learning via Mixture-of-Expert Training | Link |
| 2022 ICML | GLaM | GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | N/A |
| 2022 ICML | DeepSpeed-MoE | DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | Link |
  • Note: there are no MoE papers in ECCV 2022, CVPR 2022, or ICLR 2022.

Other

Please do not distribute this list without permission. Thank you.
