MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation

About the dataset

A Dialogue Case of MMDialog:

Statistics:

Dataset Folder Format:

File: conversations.json

Note:

The train set does not contain "negative_candidate_media_keys" or "negative_candidate_texts"; these fields exist only in the test and valid sets. Each "negative_candidate_xxx" field contains 999 negative candidates for the retrieval task.
Words like :smiling_face_with_smiling_eyes: and :raising_hands: are emotion tokens. We will share the mapping between these tokens and the real emotions with you.
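
The two notes above can be illustrated with a minimal sketch. Only the field names "negative_candidate_media_keys" and "negative_candidate_texts", the count of 999, and the two example tokens come from this README. The record layout (a flat dictionary with a "text" key), the helper names, and the token pattern are assumptions made for illustration, not the dataset's confirmed schema.

```python
import re

# Field names taken from the note above; the record layout is an assumption.
NEGATIVE_FIELDS = ("negative_candidate_media_keys", "negative_candidate_texts")
EXPECTED_NEGATIVES = 999

# Assumed token shape: a colon, then lowercase letters/digits/underscores, then a colon.
EMOTION_TOKEN = re.compile(r":([a-z0-9_]+):")


def negative_candidates(record):
    """Return the negative-candidate fields present in a record.

    Train records are expected to yield an empty dict.
    """
    return {name: record[name] for name in NEGATIVE_FIELDS if name in record}


def check_negative_counts(record):
    """Return True if every negative-candidate field present has 999 entries.

    A record without these fields (e.g. from the train set) passes trivially.
    """
    return all(
        len(values) == EXPECTED_NEGATIVES
        for values in negative_candidates(record).values()
    )


def emotion_tokens(text):
    """Return the names of all :token: style emotion tokens found in text."""
    return EMOTION_TOKEN.findall(text)


# Hypothetical records, for illustration only.
train_record = {"text": "Great game :raising_hands:"}
test_record = {
    "text": "So happy :smiling_face_with_smiling_eyes:",
    "negative_candidate_texts": ["distractor"] * 999,
    "negative_candidate_media_keys": ["media_key"] * 999,
}

print(negative_candidates(train_record))
print(check_negative_counts(test_record))
print(emotion_tokens(test_record["text"]))
```

The pattern is deliberately simple and may also match other colon-delimited text, such as the middle of a timestamp, so filtering against the shared token mapping is likely the safer choice once it is available.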

Who it's for: You are either a master’s student, doctoral candidate, post-doc, faculty, or research-focused employee at an academic institution or university.
Non-commercial use: You should only use this access for non-commercial purposes.
Clear plan: You have a clearly defined research objective and specific plans for how you intend to use and analyze this data in your research.
Sharing limitation: You promise not to share this dataset without our qualification review and permission.

If you do not meet all of the requirements above, we will not share the dataset with you.

Item	Description
Your Role	[master’s student / doctoral candidate / post-doc / faculty / research-focused employee / others]
Your study or work organization	e.g. Microsoft Research, DeepMind, Cornell University, ...
Your Academic Homepage	Your [Google Scholar] or [Homepage_URL running on your organization website (e.g. yourname.people.xxx.edu / yourname.xxx.people.msr.microsoft)]
Non-commercial Use	You [promise / cannot promise] that you will not apply the data to commercial scenarios or products.
Sharing Limitation	You [promise / cannot promise] that you will not share this dataset without our qualification review and permission.
Your Plan	(Describe your research plan and how you intend to use and analyze this data in your research, in at least 50 words.)
