
Commit

Update README.md
XueFuzhao authored Jan 25, 2024
1 parent 37fb07d commit 86d6477
Showing 1 changed file with 6 additions and 3 deletions.
README.md: 9 changes (6 additions & 3 deletions)
```diff
@@ -16,9 +16,12 @@ This repo is a collection of AWESOME things about mixture-of-experts, including
 - [Library](#library)
 
 # Open Models
 
-- OpenMoE: Open Mixture-of-Experts Language Models [Link](https://github.com/XueFuzhao/OpenMoE)
-- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [Link](https://github.com/google-research/t5x/blob/main/docs/models.md)
+- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models [[Jan 2024]](https://github.com/deepseek-ai/DeepSeek-MoE) [Paper](https://arxiv.org/abs/2401.06066)
+- LLaMA-MoE [[Dec 2023]](https://github.com/pjlab-sys4nlp/llama-moe)
+- Mixtral of Experts [[Dec 2023]](https://mistral.ai/news/mixtral-of-experts/) [Paper](https://arxiv.org/abs/2401.04088)
+- OpenMoE: Open Mixture-of-Experts Language Models [[Aug 2023]](https://github.com/XueFuzhao/OpenMoE)
+- Efficient Large Scale Language Modeling with Mixtures of Experts [[Dec 2021]](https://github.com/facebookresearch/fairseq/tree/main/examples/moe_lm) [Paper](https://arxiv.org/abs/2112.10684)
+- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [[Feb 2021]](https://github.com/google-research/t5x/blob/main/docs/models.md) [Paper](https://arxiv.org/abs/2101.03961)
 
 
 # Papers
```
