This repository is the official PyTorch implementation of the ACL 2021 paper: Comprehensive Study: How the Context Information of Different Granularity Affects Dialogue State Tracking? Puhai Yang, Heyan Huang, Xian-Ling Mao. ACL 2021 (Long paper) [arXiv]
Dialogue state tracking (DST) plays a key role in task-oriented dialogue systems by monitoring the user's goal. In general, there are two strategies for tracking a dialogue state: predicting it from scratch and updating it from the previous state. The scratch-based strategy obtains each slot value by consulting the entire dialogue history, while the previous-based strategy relies on the current turn to update the previous dialogue state. However, the scratch-based strategy struggles to track short-dependency dialogue states correctly because of noise, and the previous-based strategy is of limited use for long-dependency dialogue state tracking. Context information of different granularity therefore plays different roles in tracking different kinds of dialogue states. In this paper, we study how the context information of different granularity affects dialogue state tracking. First, we explore how strongly different granularities affect dialogue state tracking. Then, we discuss how to combine multiple granularities for dialogue state tracking. Finally, we apply the findings about context granularity to the few-shot learning scenario. All code is publicly released.
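To make the two strategies concrete, here is a minimal sketch of how a context granularity could select the model input. It assumes that granularity means the number of most recent turns read, together with the dialogue state that held just before those turns. The names `build_context`, `Turn`, `State` and the example slots are hypothetical and are not taken from the released code.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]   # (system utterance, user utterance); hypothetical format
State = Dict[str, str]   # slot name -> slot value


def build_context(turns: List[Turn], prev_states: List[State],
                  granularity: int) -> Tuple[List[Turn], State]:
    """Select the input for predicting the state at the latest turn.

    turns:        all turns so far, oldest first (the last one is the current turn).
    prev_states:  prev_states[i] is the state before turn i, so prev_states[0] is empty.
    granularity:  number of most recent turns to read (at least 1).
    """
    if granularity < 1:
        raise ValueError("granularity must be at least 1")
    if len(prev_states) != len(turns):
        raise ValueError("need exactly one preceding state per turn")
    # Clamp so that a granularity larger than the dialogue reads the full history.
    start = max(0, len(turns) - granularity)
    return turns[start:], prev_states[start]


turns = [
    ("", "i need a hotel in the north"),
    ("how many stars?", "four stars please"),
    ("any price range?", "something cheap"),
]
prev_states = [
    {},
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]

# Granularity 1: current turn plus the previous state (previous-based strategy).
ctx_prev, state_prev = build_context(turns, prev_states, granularity=1)
# Granularity equal to the dialogue length: full history and an empty
# starting state (scratch-based strategy).
ctx_scratch, state_scratch = build_context(turns, prev_states, granularity=len(turns))
```

Intermediate values, such as `granularity=2`, fall between the two extremes: the model reads the last two turns and starts from the state recorded before them.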
- python 3.6
- pytorch >= 1.0
MGL_SpanPtr, MGL_TRADE, MGL_BERTDST, MGL_SOMDST: baselines with granularity, which are reproduced based on the original papers [SpanPtr, TRADE, BERTDST, SOMDST] and the official pytorch implementation of SOMDST.
MGL_SUMBT: baseline with granularity, which is reproduced based on the original paper [SUMBT] and the official pytorch implementation of SUMBT.
- Corpus download
  - Sim-M and Sim-R: download
  - WOZ2.0: download
  - DSTC2: download
  - MultiWOZ2.1: download
- Data preprocessing
python create_data_DSTC2.py
python create_data_MultiWOZ.py
MGL_SpanPtr, MGL_TRADE, MGL_BERTDST, MGL_SOMDST: unzip the dataset.zip file and copy it to the corresponding MGL_* folder.
MGL_SUMBT: The processed data has been included in its data folder, and you can reprocess the data by yourself according to the instructions.
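The unzip-and-copy step for the four MGL_* baselines could be scripted roughly as follows. This is only a sketch: it assumes `dataset.zip` should be extracted directly inside each baseline folder, which the instructions above do not spell out, and `install_dataset` is a hypothetical helper rather than part of this repository.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List

# Folder names as listed in this README.
MGL_FOLDERS = ("MGL_SpanPtr", "MGL_TRADE", "MGL_BERTDST", "MGL_SOMDST")


def install_dataset(archive: Path, repo_root: Path,
                    folders: Iterable[str] = MGL_FOLDERS) -> List[Path]:
    """Extract `archive` into every listed folder that exists under `repo_root`."""
    installed = []
    with zipfile.ZipFile(archive) as zf:
        for name in folders:
            target = repo_root / name
            if not target.is_dir():
                continue  # skip folders that are missing from this checkout
            zf.extractall(target)
            installed.append(target)
    return installed

# Example call (both paths are assumptions; adjust them to your checkout):
#   install_dataset(Path("dataset.zip"), Path("."))
```

The function returns the folders it wrote to, so a missing baseline folder is easy to spot by comparing the result with `MGL_FOLDERS`.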
MGL_SpanPtr, MGL_TRADE, MGL_BERTDST, MGL_SOMDST:
# For example:
bash SOMDST_train_SG.sh # train SOMDST with single granularity
bash SOMDST_train.sh # train SOMDST with multiple granularities
MGL_SUMBT:
# For example:
bash run-multiwoz.sh # train SUMBT with multiple granularities on MultiWOZ2.1
Contact: Puhai Yang (phyang@bit.edu.cn), Heyan Huang (hhy63@bit.edu.cn), Xian-Ling Mao (maoxl@bit.edu.cn)