This is the data story for CS-401: Applied Data Analysis at EPFL. It was made using Jekyll and the TeXt Theme.
The code used to create this data story can be found in the notebooks in this repo.
Gender equality is a long-fought battle, one in which great strides have already been made. But has it been enough? In this project we analyze whether there is still a fundamental difference in the way news sources handle quotes from men versus women. Beyond the simple representation of each gender, we want to study the content of the quotes. Is there a difference in the topics men and women are cited on? What is the overall sentiment (positive or negative) associated with the quotes? What kind of language is cited?

News sources have the power to influence society on a large scale, and biases can easily be propagated. Through the choice of quotes, whether by content or by sentiment, a source can project a (potentially distorted) image of a group of people. Is that happening?
We hope to answer the following questions:
- How are quotes distributed by gender, and how has that representation evolved over time?
- Which are the main topics in the quotes from different genders?
- What is the sentiment (negative, neutral, positive) associated with the quotes by gender, broken down by topic? *
- What is the complexity of the speech quoted?
Furthermore, we want to study a couple of very influential websites, some liberal and some conservative, and compare the sentiment and proportion of quotes by gender to determine which ones have a more egalitarian roster of quotes.
*While quotes themselves cannot be changed, the context in which they are used and the sentiment that predominates among them can reveal information about a source's predisposition towards the person quoted. Concretely, if a newspaper tends to select quotes with a negative sentiment for women but mainly neutral or positive ones for men, this could indicate an internal bias that is being propagated to the reader.
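
As a rough illustration of the comparison described in the note above, the following minimal Python sketch computes, for each gender, the share of quotes falling into each sentiment class. It is not taken from the project notebooks: the function name `sentiment_share_by_gender`, the `(gender, sentiment)` record layout, and the toy records are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def sentiment_share_by_gender(quotes):
    """Return {gender: {sentiment: fraction of that gender's quotes}}.

    `quotes` is an iterable of (gender, sentiment) pairs. This layout is a
    hypothetical simplification, not the project's actual data schema.
    """
    counts = defaultdict(Counter)
    for gender, sentiment in quotes:
        counts[gender][sentiment] += 1

    shares = {}
    for gender, counter in counts.items():
        total = sum(counter.values())
        shares[gender] = {label: n / total for label, n in counter.items()}
    return shares


# Toy, made-up records used only to exercise the function (not project data).
toy_quotes = [
    ("female", "negative"),
    ("female", "negative"),
    ("female", "positive"),
    ("female", "neutral"),
    ("male", "positive"),
    ("male", "neutral"),
    ("male", "neutral"),
    ("male", "negative"),
]

shares = sentiment_share_by_gender(toy_quotes)
for gender, by_sentiment in sorted(shares.items()):
    print(gender, by_sentiment)
```

Normalizing within each gender, rather than comparing raw counts, keeps the comparison meaningful when one gender is quoted far more often than the other. Running the same computation per news source would show whether the negative share differs between genders for that source.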
- André
- Medya
- Khanh
- Tomás