MiDe-22 Dataset

This repository contains the dataset and related files for the study Not Good Times for Lies: Misinformation Detection on the Russia-Ukraine War, COVID-19, and Refugees.

Dataset

The dataset consists of 10,348 tweets: 5,284 in English and 5,064 in Turkish. The tweets cover four topic groups: the Russia-Ukraine war, the COVID-19 pandemic, refugees, and miscellaneous events. Each tweet carries one of three misinformation labels: True, False, or Other. To comply with Twitter's Terms and Conditions, we publish tweet IDs rather than the tweet content. The columns of the file are as follows:

| Column Name | Description |
| --- | --- |
| Topic | Topic of the tweet: Ukraine, Covid, Refugees, or Misc |
| Event | Event of the tweet: EN01-EN40 in English and TR01-TR40 in Turkish |
| Label | Label of the tweet: True, False, or Other |
| Tweet_id | Twitter ID of the tweet |
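
The columns above can be read with a few lines of Python. This is a minimal sketch, assuming the data is a comma-separated file with a header row containing the four column names; the delimiter, the helper names `load_rows` and `count_by_topic_and_label`, and the sample rows (including their placeholder tweet IDs) are illustrative assumptions, not taken from the repository. Values are kept as strings so that the long tweet IDs are never converted to floating-point numbers.

```python
import csv
import io
from collections import Counter

# Column names as described in the table above.
EXPECTED_COLUMNS = ["Topic", "Event", "Label", "Tweet_id"]


def load_rows(handle):
    """Read rows from an open CSV file object; every value stays a string."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def count_by_topic_and_label(rows):
    """Count tweets per (Topic, Label) pair."""
    return Counter((row["Topic"], row["Label"]) for row in rows)


# Made-up sample in the assumed layout; the tweet IDs are placeholders.
SAMPLE = """Topic,Event,Label,Tweet_id
Ukraine,EN01,True,1000000000000000001
Ukraine,EN01,False,1000000000000000002
Covid,EN12,Other,1000000000000000003
Ukraine,EN02,False,1000000000000000004
"""

rows = load_rows(io.StringIO(SAMPLE))
counts = count_by_topic_and_label(rows)
```

To use it on a real file, pass a file object opened with `open(path, newline="", encoding="utf-8")` to `load_rows`, where `path` is wherever the downloaded file was saved.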

The distribution of tweet counts in the dataset is as follows:

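| Lang | Topic | True | False | Other | Total |
| --- | --- | ---: | ---: | ---: | ---: |
| EN | Ukraine | 320 | 393 | 618 | 1,331 |
| EN | Covid | 167 | 514 | 663 | 1,344 |
| EN | Refugees | 94 | 328 | 796 | 1,218 |
| EN | Misc | 146 | 494 | 751 | 1,391 |
| EN | Total | 727 | 1,729 | 2,828 | 5,284 |
| TR | Ukraine | 129 | 338 | 477 | 944 |
| TR | Covid | 190 | 558 | 816 | 1,564 |
| TR | Refugees | 61 | 202 | 298 | 561 |
| TR | Misc | 289 | 634 | 1,072 | 1,995 |
| TR | Total | 669 | 1,732 | 2,663 | 5,064 |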
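
The counts above can serve as a sanity check after downloading the data. The sketch below copies the reported numbers and verifies that each topic row and each label column adds up to its stated total; the names `EN`, `TR`, and `totals_are_consistent` are illustrative and not part of the repository.

```python
# Reported counts per topic as (True, False, Other, Total), copied from the table.
EN = {
    "Ukraine": (320, 393, 618, 1331),
    "Covid": (167, 514, 663, 1344),
    "Refugees": (94, 328, 796, 1218),
    "Misc": (146, 494, 751, 1391),
}
EN_TOTAL = (727, 1729, 2828, 5284)

TR = {
    "Ukraine": (129, 338, 477, 944),
    "Covid": (190, 558, 816, 1564),
    "Refugees": (61, 202, 298, 561),
    "Misc": (289, 634, 1072, 1995),
}
TR_TOTAL = (669, 1732, 2663, 5064)


def totals_are_consistent(per_topic, reported_totals):
    """Check row sums (True + False + Other == Total) and column sums."""
    rows_ok = all(t + f + o == total for t, f, o, total in per_topic.values())
    column_sums = tuple(sum(column) for column in zip(*per_topic.values()))
    return rows_ok and column_sums == reported_totals


en_ok = totals_are_consistent(EN, EN_TOTAL)
tr_ok = totals_are_consistent(TR, TR_TOTAL)
dataset_size = EN_TOTAL[3] + TR_TOTAL[3]
```

The same function can be applied to counts computed from the downloaded file to confirm that nothing was lost along the way.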

Citation

If you make use of this dataset, please cite the following paper:

@misc{toraman2022good,
      title={Not Good Times for Lies: Misinformation Detection on the Russia-Ukraine War, COVID-19, and Refugees},
      author={Cagri Toraman and Oguzhan Ozcelik and Furkan Şahinuç and Fazli Can},
      year={2022},
      eprint={2210.05401},
      archivePrefix={arXiv},
      primaryClass={cs.SI}
}