This project investigates how different learning formats—specifically academic papers and documentaries—affect participants’ knowledge retention and attitudes toward immigrants, refugees, and climate change. The study is implemented as a randomized controlled trial (RCT) with four treatment groups and one control group:
- Short Paper
- Documentary Video
- Paper followed by Video
- Video followed by Paper
- Control Group (no intervention)
The analysis involves multiple components:
- 📊 PCA (Principal Component Analysis): Used to create indices from survey responses.
- 🧠 Open-Ended Text Analysis: Responses to open-ended questions are analyzed using OpenAI's API and supplemented with machine learning techniques to classify themes and sentiment.
- 📈 Quantitative Analysis: Includes descriptive statistics, balance tables, consistency checks, and regressions to evaluate treatment effects.
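As a rough illustration of the PCA component above, the sketch below builds a single index from a handful of survey items by standardizing each item and projecting onto the first principal component. It is not the project's code: the actual PCA runs in R (experiment_pca.R), and the function names, the power-iteration approach, and the sample responses here are all assumptions chosen to keep the example dependency-free.

```python
import math
import random


def standardize(rows):
    """Z-score each column (mean 0, sample standard deviation 1).

    Assumes every column varies; a constant column would divide by zero.
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def first_component(z, iterations=500, seed=0):
    """Approximate the leading eigenvector of the correlation matrix
    of the standardized data using power iteration."""
    n, k = len(z), len(z[0])
    corr = [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(k)]
        for i in range(k)
    ]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(k)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The sign of an eigenvector is arbitrary; orient it so loadings sum >= 0.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def pca_index(rows):
    """Return one index score per respondent (first principal component)."""
    z = standardize(rows)
    loadings = first_component(z)
    return [sum(a * b for a, b in zip(row, loadings)) for row in z]


# Hypothetical 1-5 Likert responses: five respondents, three items.
responses = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [5, 4, 5], [3, 3, 3]]
index = pca_index(responses)
```

Because each item is standardized first, the resulting index is centered at zero, and respondents who score high on positively correlated items receive higher index values.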
A second stage of the study is currently underway to examine how responses change over time.
To reproduce the full analysis, execute the scripts in the following order:
1. experiment_cleaning.R: Cleans and preprocesses the raw data.
2. experiment_mapping.R: Maps treatments and sets up survey structure.
3. text_analysis.py: Applies ML-based classification to open-ended responses.
4. experiment_pca.R: Runs PCA and builds indices.
5. experiment_open_ended.py: Uses OpenAI's API to analyze open-ended survey responses.
6. experiment_clean_llm_results.R: Cleans and integrates LLM-generated analysis.
7. experiment_info.R: Final stage of data analysis and visualization.
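The script order listed above could be automated with a small runner along these lines. This is only a sketch, not part of the repository: it assumes the scripts sit in the current working directory, that `Rscript` is on the PATH, and that the Python scripts take no arguments. The names `PIPELINE`, `build_command`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script order as given in this README.
PIPELINE = [
    "experiment_cleaning.R",
    "experiment_mapping.R",
    "text_analysis.py",
    "experiment_pca.R",
    "experiment_open_ended.py",
    "experiment_clean_llm_results.R",
    "experiment_info.R",
]


def build_command(script):
    """Pick an interpreter from the file extension."""
    if script.endswith(".R"):
        return ["Rscript", script]  # assumes Rscript is on the PATH
    if script.endswith(".py"):
        return [sys.executable, script]
    raise ValueError(f"Unsupported script type: {script}")


def run_pipeline(scripts=PIPELINE, dry_run=False):
    """Run each script in order, stopping at the first failure.

    With dry_run=True the commands are printed but not executed.
    """
    commands = [build_command(s) for s in scripts]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_pipeline(dry_run=True)` first is a cheap way to confirm the order before spending API credits on the OpenAI step.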
The Regressions/ folder contains several regression models used for estimating treatment effects and conducting robustness checks.
- R Packages: readxl, dplyr, writexl, gtsummary, haven, httr, jsonlite, stringr, purrr, tidyr
- Python Packages (for text analysis): pandas, openai, scikit-learn, nltk, etc.
Make sure to set your OpenAI API key properly in the environment before running experiment_open_ended.py.
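A quick pre-flight check like the one below can catch a missing key before any API calls are made. It assumes the key is stored in the `OPENAI_API_KEY` environment variable, which is the name the `openai` Python client reads by default; whether experiment_open_ended.py uses that exact name is an assumption, and `require_api_key` is a hypothetical helper, not a function from the repository.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running "
            "experiment_open_ended.py."
        )
    return key
```

On macOS or Linux the variable is typically set with `export OPENAI_API_KEY=...` in the shell session that launches the script; avoid hard-coding the key in the source files.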