This repository implements commonsense question answering on CommonsenseQA using DeBERTaV3.
Requirements:
python=3.8.5
numpy=1.20.1
torch=1.9.1+cu102
transformers=4.10.0
tqdm=4.62.2
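
To confirm that an environment matches the versions listed above, a check along the following lines may help. This is a minimal sketch, not part of the repository: the `PINNED` table and the `check_versions` helper are hypothetical names introduced here, and the comparison is an exact string match against what the standard-library `importlib.metadata` reports.

```python
import sys
from importlib import metadata

# Versions copied from the list above (the Python version is reported separately).
PINNED = {
    "numpy": "1.20.1",
    "torch": "1.9.1+cu102",
    "transformers": "4.10.0",
    "tqdm": "4.62.2",
}


def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatched or missing package.

    `lookup` is injectable so the function can be exercised without the
    packages being installed; `found` is None when a package is missing.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    print("python", sys.version.split()[0], "(expected 3.8.5)")
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is installed at the listed version. Nearby versions may also work; the exact match only reflects the versions stated in this README.
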
The original DeBERTaV3 paper:
@misc{he2021debertav3,
title={DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing},
author={Pengcheng He and Jianfeng Gao and Weizhu Chen},
year={2021},
eprint={2111.09543},
archivePrefix={arXiv},
primaryClass={cs.CL}
}