The papers below were implemented using a Korean corpus.
- Using the Naver sentiment movie corpus v1.0 (a.k.a. `nsmc`)
- Configuration
  - `conf/model/{type}.json` (e.g. `type = ["sencnn", "charcnn", ...]`)
  - `conf/dataset/nsmc.json`

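
The configuration layout above can be resolved from a model type roughly as follows. This is a minimal sketch, not the repository's own loader: the helper names `config_paths` and `load_json` are hypothetical, and it assumes the `conf` directory sits relative to the current working directory.

```python
import json
from pathlib import Path


def config_paths(model_type, dataset="nsmc", root="conf"):
    """Build the model and dataset config paths for the layout described above.

    `model_type` is a value such as "sencnn" or "charcnn"; `root` is assumed
    to be the `conf` directory (hypothetical default, adjust as needed).
    """
    base = Path(root)
    model_config = base / "model" / f"{model_type}.json"
    dataset_config = base / "dataset" / f"{dataset}.json"
    return model_config, dataset_config


def load_json(path):
    """Read a JSON config file into a dict (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


model_path, dataset_path = config_paths("sencnn")
print(model_path.as_posix())    # conf/model/sencnn.json
print(dataset_path.as_posix())  # conf/dataset/nsmc.json
```
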
Model \ Accuracy | Train (120,000) | Validation (30,000) | Test (50,000) | Date (YY/MM/DD) |
---|---|---|---|---|
SenCNN | 91.78% | 86.78% | 85.99% | 20/03/04 |
CharCNN | 85.26% | 81.34% | 81.07% | 20/03/04 |
ConvRec | 86.38% | 82.78% | 82.54% | 20/03/23 |
VDCNN | 83.85% | 82.23% | 81.61% | 20/03/24 |
SAN | 90.89% | 86.88% | 86.37% | 20/03/28 |
ETRIBERT | 91.13% | 89.18% | 88.88% | 19/10/27 |
SKTBERT | 92.39% | 88.98% | 88.98% | 19/11/10 |
- Convolutional Neural Networks for Sentence Classification (as SenCNN)
- Character-level Convolutional Networks for Text Classification (as CharCNN)
- Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers (as ConvRec)
- Very Deep Convolutional Networks for Text Classification (as VDCNN)
- A Structured Self-attentive Sentence Embedding (as SAN)
- BERT_single_sentence_classification (as ETRIBERT, SKTBERT)
- Dataset created from https://github.com/songys/Question_pair
- Hyper-parameters were selected arbitrarily (defined in `experiments/base_model/config.json`).

Model \ Accuracy | Train (6,136) | Validation (682) | Test (758) | Date (YY/MM/DD) |
---|---|---|---|---|
Siam | 93.30% | 83.57% | 84.16% | 19/10/28 |
SAN | 94.86% | 83.13% | 84.96% | 19/10/28 |
Stochastic | 88.70% | 81.67% | 81.92% | 19/11/06 |
ETRIBERT | 95.04% | 93.69% | 93.93% | 19/10/04 |
SKTBERT | 93.64% | 91.34% | 91.16% | 19/11/10 |
- A Structured Self-attentive Sentence Embedding (as SAN)
- Siamese recurrent architectures for learning sentence similarity (as Siam)
- Stochastic Answer Networks for Natural Language Inference (as Stochastic)
- BERT_pairwise_text_classification (as ETRIBERT, SKTBERT)