awesome-mixture-of-experts

MIT License

A collection of AWESOME things about mixture-of-experts

This repo is a collection of AWESOME things about mixture-of-experts (MoE), including papers, code, etc. Feel free to star and fork.

Contents

  • Papers
    • MoE Model
    • MoE System
  • Library

Papers

MoE Model

Publication

  • Go Wider Instead of Deeper [AAAI2022]
  • Hash layers for large sparse models [NeurIPS2021]
  • Scaling Vision with Sparse Mixture of Experts [NeurIPS2021]
  • BASE Layers: Simplifying Training of Large, Sparse Models [ICML2021]
  • Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer [ICLR2017] (see the routing sketch after the arXiv list below)
  • CPM-2: Large-scale Cost-effective Pre-trained Language Models [AI Open]

arXiv

  • Efficient Language Modeling with Sparse all-MLP [14 Mar 2022]
  • Designing Effective Sparse Expert Models [17 Feb 2022]
  • One Student Knows All Experts Know: From Sparse to Dense [26 Jan 2022]
  • Efficient Large Scale Language Modeling with Mixtures of Experts [20 Dec 2021]
  • GLaM: Efficient Scaling of Language Models with Mixture-of-Experts [13 Dec 2021]
  • MoEfication: Conditional Computation of Transformer Models for Efficient Inference [5 Oct 2021]
  • Cross-token Modeling with Conditional Computation [5 Sep 2021]
  • Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [11 Jan 2021]
  • Exploring Routing Strategies for Multilingual Mixture-of-Experts Models [28 Sept 2020]
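
Most of the MoE Model papers above, from the sparsely-gated MoE layer [ICLR2017] to Switch Transformers, share the same core mechanism: a learned router scores each token against a pool of expert feed-forward networks, only the top-k experts are actually evaluated for that token, and their outputs are combined with the renormalized router weights. The PyTorch snippet below is a minimal, non-optimized sketch of that top-k routing pattern; the class and argument names (TopKMoE, d_model, num_experts, k) are hypothetical and not taken from any listed paper or library, and it omits the load-balancing losses, capacity limits, and expert parallelism that those papers and the MoE System section address.

```python
# Minimal sketch of a sparsely-gated top-k MoE layer (illustrative only).
# Names (TopKMoE, d_model, d_hidden, num_experts, k) are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # Router: scores each token against each expert.
        self.router = nn.Linear(d_model, num_experts)
        # Experts: independent feed-forward networks.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(d_model, d_hidden),
                    nn.ReLU(),
                    nn.Linear(d_hidden, d_model),
                )
                for _ in range(num_experts)
            ]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model), one row per token.
        logits = self.router(x)                                # (tokens, experts)
        weights, indices = torch.topk(logits, self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)                   # renormalize over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Which tokens routed to expert e, and in which of their k slots.
            token_idx, slot_idx = torch.where(indices == e)
            if token_idx.numel() == 0:
                continue  # no token picked this expert
            out[token_idx] += weights[token_idx, slot_idx].unsqueeze(-1) * expert(x[token_idx])
        return out


if __name__ == "__main__":
    layer = TopKMoE(d_model=16, d_hidden=32, num_experts=4, k=2)
    tokens = torch.randn(10, 16)
    print(layer(tokens).shape)  # torch.Size([10, 16])
```

With k=1 this degenerates to Switch-style routing (one expert per token); per-token compute stays roughly constant as num_experts grows, which is the scaling argument these papers make.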

MoE System

Publication

  • GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding [ICLR2021]

arXiv

  • DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale [14 Jan 2022]
  • FastMoE: A Fast Mixture-of-Expert Training System [24 Mar 2021]

Library
