MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing

Implementation of the paper "MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing".

Abstract

Dialogue discourse parsing (DDP) aims to capture the relations between utterances in a dialogue. In everyday real-world scenarios, dialogues are typically multi-modal and cover open-domain topics. However, most widely used benchmark datasets for DDP contain only the textual modality and are domain-specific. Without multi-modal clues it is difficult to understand a dialogue accurately and comprehensively, and domain-specific data cannot capture the discourse structures of the more prevalent daily conversations. This paper proposes MODDP, the first multi-modal Chinese discourse parsing dataset derived from open-domain daily dialogues, consisting of 864 dialogues and 18,114 utterances, accompanied by 12.7 hours of video clips. We present a simple yet effective benchmark approach for multi-modal DDP and, through extensive experiments, report several benchmark results on MODDP. The significant performance improvement obtained by adding further modalities to the original text-only DDP model demonstrates the necessity of integrating multiple modalities into DDP.

Requirements

PyTorch >= 2.1.1

Transformers >= 4.18.0
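
To confirm that an environment satisfies these minimums before training, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu118` and pre-release tags). The package names `torch` and `transformers` are the PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the Requirements list above.
MINIMUMS = {"torch": "2.1.1", "transformers": "4.18.0"}


def parse_version(text):
    """Turn a string like '2.1.1+cu118' into (2, 1, 1)."""
    parts = []
    # Drop any local suffix after '+', then read leading digits of each piece.
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(minimums=MINIMUMS):
    """Return a dict mapping each package name to a short status string."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Running the file as a script prints one status line per package; any line other than `ok` indicates a package to install or upgrade.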

Data Preparation

You can directly load the text data from the dataset folder and download the image and audio features from all_features.pkl.

If the link is broken or you need the original video data, please contact iainlp.suda@gmail.com.
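
Once all_features.pkl has been downloaded, it can be opened with Python's standard `pickle` module. The layout of the file is not documented here, so the sketch below makes no assumption about it and only reports the top-level structure; `load_features` and `describe` are hypothetical helper names, and the file path is assumed to be the current directory. If the file stores PyTorch tensors, `torch` must be installed for unpickling to succeed. As with any pickle file, load it only if it comes from a source you trust.

```python
import pickle
from pathlib import Path


def load_features(path):
    """Unpickle the feature file and return whatever object it contains."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(obj, max_items=3):
    """Return a short, human-readable summary of the top-level structure."""
    if isinstance(obj, dict):
        keys = list(obj)[:max_items]
        return f"dict with {len(obj)} keys, first keys: {keys}"
    if isinstance(obj, (list, tuple)):
        return f"{type(obj).__name__} with {len(obj)} items"
    # Arrays and tensors expose a .shape attribute.
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"{type(obj).__name__} with shape {tuple(shape)}"
    return type(obj).__name__


if __name__ == "__main__":
    feature_path = Path("all_features.pkl")  # assumed download location
    if feature_path.exists():
        print(describe(load_features(feature_path)))
    else:
        print(f"{feature_path} not found; download it first")
```

Inspecting the structure first makes it easier to see how the image and audio features are keyed before wiring them into the data loader.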

Training

python main.py \
    --config_file ./config.cfg \
    --seed 42 \
    --postfix experiments/train \
    --text_plm_name_or_path /path/to/roberta \
    --vision_plm_name_or_path /path/to/vit \
    --audio_plm_name_or_path /path/to/wav2vec2 \
    --bert_path /path/to/bert

Or run the provided script directly:

bash run.sh

Prediction and Evaluation

python main.py \
    --config_file ./config.cfg \
    --seed 42 \
    --postfix experiments/predict \
    --text_plm_name_or_path /path/to/roberta \
    --vision_plm_name_or_path /path/to/vit \
    --audio_plm_name_or_path /path/to/wav2vec2 \
    --bert_path /path/to/bert \
    --ckpt_path /path/to/best/model \
    --train False \
    --predict True
