DeBERTa pre-trained models
Pre-release
- `base` is the model with the same configuration as BERT-base, i.e. 12 layers, 12 heads, 768 hidden dimensions
- `base_mnli` is the base model fine-tuned on MNLI data
- `large` is the model with the same configuration as BERT-large, i.e. 24 layers, 16 heads, 1024 hidden dimensions
- `large_mnli` is the large model fine-tuned on MNLI data
- `xlarge` is the model with 48 layers, 16 heads, 1024 hidden dimensions
- `xlarge_mnli` is the xlarge model fine-tuned on MNLI data
- `bpe_encoder` is the GPT-2 vocabulary package
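The model configurations listed above can be summarized programmatically. The sketch below is purely illustrative and not part of the release: the `CONFIGS` table and `head_dim` helper are assumptions, and only the layer/head/hidden numbers come from the list above.

```python
# Illustrative lookup table of the released DeBERTa model configurations
# (values taken from the release notes; the structure itself is an assumption).
CONFIGS = {
    "base":   {"layers": 12, "heads": 12, "hidden": 768},
    "large":  {"layers": 24, "heads": 16, "hidden": 1024},
    "xlarge": {"layers": 48, "heads": 16, "hidden": 1024},
}

def head_dim(name: str) -> int:
    """Per-attention-head dimension: hidden size divided by number of heads."""
    cfg = CONFIGS[name]
    return cfg["hidden"] // cfg["heads"]

for name, cfg in CONFIGS.items():
    print(f'{name}: {cfg["layers"]} layers, head dim {head_dim(name)}')
```

Note that all three sizes keep a per-head dimension of 64 (768/12 and 1024/16), matching the original BERT convention.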