This repository contains the dataset and related files for the study "Not Good Times for Lies: Misinformation Detection on the Russia-Ukraine War, COVID-19, and Refugees".
The dataset comprises 10,348 tweets: 5,284 in English and 5,064 in Turkish. The tweets cover several topics: the Russia-Ukraine war, the COVID-19 pandemic, refugees, and additional miscellaneous events. Each tweet is assigned one of three misinformation labels. To comply with Twitter's Terms and Conditions, we publish tweet IDs rather than the tweet content. The columns of the file are described below:
Column Name | Description |
---|---|
Topic | Topic of the tweet: Ukraine, Covid, Refugees or Misc |
Event | Event of the tweet: EN01-EN40 in English and TR01-TR40 in Turkish |
Label | Label of the tweet: True, False, or Other |
Tweet_id | Twitter ID of the tweet |
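
As a minimal sketch of how these columns might be read and sanity-checked, the Python snippet below parses comma-separated rows containing the four columns above. The CSV layout with a header row, the helper name `load_rows`, and the sample tweet IDs are assumptions made for illustration; this repository does not specify them.

```python
import csv
import io
import re

# Allowed values, taken from the column descriptions above.
TOPICS = {"Ukraine", "Covid", "Refugees", "Misc"}
LABELS = {"True", "False", "Other"}
# Event codes: EN01-EN40 or TR01-TR40.
EVENT_RE = re.compile(r"(EN|TR)(0[1-9]|[1-3][0-9]|40)")


def load_rows(handle):
    """Read rows from an open CSV handle and validate each field.

    Hypothetical helper: assumes a header row naming the four columns.
    Tweet IDs stay as strings so that large IDs are never rounded.
    """
    rows = []
    for row in csv.DictReader(handle):
        if row["Topic"] not in TOPICS:
            raise ValueError(f"unexpected topic: {row['Topic']!r}")
        if row["Label"] not in LABELS:
            raise ValueError(f"unexpected label: {row['Label']!r}")
        if not EVENT_RE.fullmatch(row["Event"]):
            raise ValueError(f"unexpected event: {row['Event']!r}")
        if not row["Tweet_id"].isdigit():
            raise ValueError(f"unexpected tweet id: {row['Tweet_id']!r}")
        rows.append(row)
    return rows


# Made-up sample rows; the IDs are placeholders, not real tweets.
sample = io.StringIO(
    "Topic,Event,Label,Tweet_id\n"
    "Ukraine,EN01,False,1000000000000000001\n"
    "Covid,TR12,True,1000000000000000002\n"
)
rows = load_rows(sample)
print(len(rows), rows[0]["Tweet_id"])
```

Keeping `Tweet_id` as a string is a deliberate choice here: tools that parse IDs as floating-point numbers can silently alter the last digits.
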
The distribution of tweet counts in the dataset is as follows:
Lang | Topic | True | False | Other | Total |
---|---|---|---|---|---|
EN | Ukraine | 320 | 393 | 618 | 1,331 |
EN | Covid | 167 | 514 | 663 | 1,344 |
EN | Refugees | 94 | 328 | 796 | 1,218 |
EN | Misc | 146 | 494 | 751 | 1,391 |
EN | Total | 727 | 1,729 | 2,828 | 5,284 |
TR | Ukraine | 129 | 338 | 477 | 944 |
TR | Covid | 190 | 558 | 816 | 1,564 |
TR | Refugees | 61 | 202 | 298 | 561 |
TR | Misc | 289 | 634 | 1,072 | 1,995 |
TR | Total | 669 | 1,732 | 2,663 | 5,064 |
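
One possible way to reproduce a breakdown like this from the published rows is to infer the language from the `Event` prefix and count by topic and label. The sketch below assumes rows are available as dictionaries keyed by the column names; the function name `count_by_lang_topic_label` and the sample rows are hypothetical.

```python
from collections import Counter


def count_by_lang_topic_label(rows):
    """Count rows per (language, topic, label).

    The language is inferred from the first two characters of Event
    (EN.. or TR..), following the column description.
    """
    counts = Counter()
    for row in rows:
        lang = row["Event"][:2]
        counts[(lang, row["Topic"], row["Label"])] += 1
    return counts


# Made-up rows for illustration only; not taken from the dataset.
rows = [
    {"Topic": "Ukraine", "Event": "EN01", "Label": "False", "Tweet_id": "1"},
    {"Topic": "Ukraine", "Event": "EN02", "Label": "False", "Tweet_id": "2"},
    {"Topic": "Covid", "Event": "TR05", "Label": "True", "Tweet_id": "3"},
]
counts = count_by_lang_topic_label(rows)
for (lang, topic, label), n in sorted(counts.items()):
    print(lang, topic, label, n)
```
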
If you make use of this dataset, please cite the following paper:
@misc{toraman2022good,
title={Not Good Times for Lies: Misinformation Detection on the Russia-Ukraine War, COVID-19, and Refugees},
author={Cagri Toraman and Oguzhan Ozcelik and Furkan Şahinuç and Fazli Can},
year={2022},
eprint={2210.05401},
archivePrefix={arXiv},
primaryClass={cs.SI}
}