The official implementation of the paper "GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory".
To install the required environment, run:

```bash
pip install -r requirements.txt
```
`transformers` contains the code we used to run the experiments on Switch Transformer.
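If requirements.txt does not already install the bundled library, an editable install of the local copy is one way to make the modified Switch Transformer code importable. This is only a sketch: whether this step is needed, and whether the code lives at ./transformers, are assumptions about the repository layout.

```bash
# Assumption: the modified Transformers code sits in ./transformers at the repository
# root and is not already installed by requirements.txt. Adjust the path if it differs.
pip install -e ./transformers
```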
You can choose a task from ./tasks and run the corresponding bash script to conduct experiments. Before executing the following commands, make sure to set the dataset and model names or paths in the corresponding script.
For text classification:

```bash
# Run GWMoE on Switch Transformer for text classification:
bash ./tasks/text-classification/run_glue.sh
```
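For reference, here is a sketch of the kind of invocation such a script typically wraps, using the flag names of the upstream Hugging Face GLUE example. The checkpoint (google/switch-base-8), the task (mnli), and the hyperparameters are placeholder choices; the script shipped in this repository may use different file names or additional GW-MoE-specific arguments, so check run_glue.sh itself before editing.

```bash
# Hypothetical shape of ./tasks/text-classification/run_glue.sh.
# Flag names follow the upstream Hugging Face GLUE example; values are placeholders.
python run_glue.py \
  --model_name_or_path google/switch-base-8 \
  --task_name mnli \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --learning_rate 3e-5 \
  --num_train_epochs 3 \
  --output_dir ./outputs/glue-mnli
```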
For summarization:

```bash
# Run GWMoE on Switch Transformer for summarization:
bash ./tasks/summarization/run_summarization.sh
```
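Likewise, a sketch of a typical summarization invocation with the flag names of the upstream Hugging Face summarization example; the dataset (cnn_dailymail), checkpoint, and hyperparameters are placeholders, and the actual run_summarization.sh may differ.

```bash
# Hypothetical shape of ./tasks/summarization/run_summarization.sh.
# Dataset, checkpoint, and hyperparameters are placeholders; check the shipped script.
python run_summarization.py \
  --model_name_or_path google/switch-base-8 \
  --dataset_name cnn_dailymail \
  --dataset_config_name "3.0.0" \
  --source_prefix "summarize: " \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 8 \
  --predict_with_generate \
  --output_dir ./outputs/cnn-dailymail
```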
For question-answering:

```bash
# Run GWMoE on Switch Transformer for question-answering:
bash ./tasks/question-answering/run_seq2seq_qa.sh
```
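And a sketch of a typical question-answering invocation in the style of the upstream Hugging Face seq2seq QA example; the dataset (squad), checkpoint, and hyperparameters are placeholders, and the actual run_seq2seq_qa.sh may differ.

```bash
# Hypothetical shape of ./tasks/question-answering/run_seq2seq_qa.sh.
# Dataset, checkpoint, and hyperparameters are placeholders; check the shipped script.
python run_seq2seq_qa.py \
  --model_name_or_path google/switch-base-8 \
  --dataset_name squad \
  --context_column context \
  --question_column question \
  --answer_column answers \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 8 \
  --predict_with_generate \
  --output_dir ./outputs/squad
```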
More code is still being organized, and we will update it later.

If you find our work helpful, please cite our paper:
```bibtex
@misc{wu2024gwmoe,
      title={GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory},
      author={Haoze Wu and Zihan Qiu and Zili Wang and Hang Zhao and Jie Fu},
      year={2024},
      eprint={2406.12375},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
Our codebase is built on Transformers and JetMoE.
This source code is released under the MIT license, which is included in this repository.