This repository contains the R code and documentation for a sentiment analysis project conducted as part of the "Text as Data and Automated Content Analysis" seminar at the University of Bern (Spring Semester 2023). The project analyzes the sentiment differences between opposition and governing parties in the German Bundestag over a 72-year period.
We analyzed parliamentary speeches from the German Bundestag (1949-2021) using the "Open Discourse" dataset, applying sentiment analysis to investigate whether opposition parties use more negative language than governing parties. Our analysis includes ~870,000 parliamentary speeches and employs both sentiment analysis and topic modeling techniques.
- Data preprocessing using the Open Discourse dataset from Harvard Dataverse
- Text preprocessing and corpus creation using quanteda
- Sentiment analysis using Rauh's German Political Sentiment Dictionary
- Statistical analysis with linear regression
- Topic modeling with Latent Dirichlet Allocation (LDA) for exploratory analysis
- Time series analysis of sentiment trends
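
The preprocessing and dictionary steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the two toy speeches, the `toy_dict` mini-dictionary (standing in for Rauh's dictionary), and the score formula `(positive - negative) / tokens` are all assumptions made for the example.

```r
library(quanteda)

# Toy speeches; the real project reads the Open Discourse data instead.
speeches <- data.frame(
  doc_id   = c("s1", "s2"),
  text     = c("Das ist ein guter und wichtiger Vorschlag.",
               "Diese Politik ist schlecht und falsch."),
  position = c("government", "opposition"),
  stringsAsFactors = FALSE
)

# Build a corpus, then tokenize and lowercase.
corp <- corpus(speeches, docid_field = "doc_id", text_field = "text")
toks <- tokens_tolower(tokens(corp, remove_punct = TRUE))

# Hypothetical mini-dictionary, used here in place of Rauh's dictionary.
toy_dict <- dictionary(list(
  positive = c("gut*", "wichtig*"),
  negative = c("schlecht*", "falsch*")
))

# Count dictionary hits per speech.
hits   <- dfm_lookup(dfm(toks), dictionary = toy_dict)
counts <- convert(hits, to = "data.frame")

# Assumed normalization: net positive hits per token.
counts$n_tokens  <- as.integer(ntoken(toks))
counts$sentiment <- (counts$positive - counts$negative) / counts$n_tokens
counts$position  <- docvars(corp, "position")

print(counts)
```

Normalizing by speech length keeps long and short speeches comparable; whether the seminar paper uses this exact normalization is not stated here.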
How does the sentiment score of parliamentary speeches in Germany relate to opposition and government politicians respectively?
Parties and politicians of the opposition use more negative language in their speeches in the German parliament.
- Opposition parties consistently show more negative sentiment than governing parties across most time periods
- Significant shift after 1990: Both opposition and governing parties became more positive after German reunification
- Statistical significance: Linear regression confirms a significant (p < 0.001) but small effect (coefficient = 0.2)
- COVID-19 analysis: LDA topic modeling revealed surprisingly neutral sentiment in pandemic-related discussions during the 19th legislative period (2017-2021)
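
To illustrate the kind of regression behind the significance finding, here is a small sketch on simulated data. It does not reproduce the project's result: the data are randomly generated, the variable names `government` and `sentiment` are hypothetical, and the simulated effect of 0.2 is chosen only to echo the reported magnitude. How the predictor is coded in the actual script is an assumption.

```r
set.seed(42)

# Simulated speeches: 1 = governing party, 0 = opposition (assumed coding).
n   <- 5000
sim <- data.frame(government = rbinom(n, size = 1, prob = 0.5))

# Simulated sentiment score with a small positive shift for governing parties.
sim$sentiment <- 0.2 * sim$government + rnorm(n, mean = 0, sd = 1)

# Linear regression of sentiment on party status.
fit <- lm(sentiment ~ government, data = sim)

# Estimate, standard error, t value, and p value for each term.
print(summary(fit)$coefficients)
```

With several hundred thousand speeches, even a small difference between groups yields a very low p value, which is why the effect size should be read alongside the significance level.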
- Dataset: Open Discourse (Richter et al., 2020) - German Bundestag speeches 1949-2021
- Sentiment Dictionary: Rauh's German Political Sentiment Dictionary (validated for German political texts)
- Analysis Tools: R with quanteda, seededlda, and statistical modeling packages
- Validation: Manual review of ~50 text excerpts to verify dictionary accuracy
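
For the exploratory topic modeling, a minimal sketch with `seededlda` could look like this. The four toy documents and the choice of `k = 2` topics are assumptions for illustration; the project's actual preprocessing and number of topics are not shown here.

```r
library(quanteda)
library(seededlda)

# Toy documents with two obvious themes (pandemic vs. budget).
texts <- c(
  d1 = "Pandemie Impfung Virus Maske Impfung Pandemie",
  d2 = "Virus Maske Pandemie Impfung Virus",
  d3 = "Haushalt Steuer Schulden Haushalt Steuer",
  d4 = "Steuer Schulden Haushalt Schulden Steuer"
)

dfmat <- dfm(tokens(texts))

# Seed set with the aim of making the run repeatable.
set.seed(1)

# Fit an unseeded LDA model with two topics (assumed value).
lda <- textmodel_lda(dfmat, k = 2)

# Top three terms per topic and the most likely topic per document.
print(terms(lda, 3))
print(topics(lda))
```

The per-document topic labels can then be joined back to the sentiment scores to compare sentiment within a topic such as pandemic-related speeches.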
| File / Folder | Description |
|---|---|
| `Final_assignment_Simon_Bernhard.R` | Complete R script for data processing, sentiment analysis, and visualization |
| `Final_assignment_seminar_text_as_data.pdf` | Complete seminar paper with methodology and results |
| `README.md` | Project overview and documentation |
- Sentiment Score Over Time: Longitudinal analysis showing opposition vs. government sentiment trends
- COVID-19 Topic Analysis: LDA topic modeling of pandemic-related speeches
- Regression Analysis: Statistical relationship between party status and sentiment
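
The longitudinal comparison can be sketched as a yearly mean per group. The data frame below contains invented values purely for illustration, and the column names `year`, `position`, and `sentiment` are hypothetical.

```r
# Invented per-speech sentiment scores (not project data).
scores <- data.frame(
  year      = c(1989, 1989, 1989, 1991, 1991, 1991),
  position  = c("government", "opposition", "opposition",
                "government", "government", "opposition"),
  sentiment = c(0.10, -0.05, -0.15, 0.20, 0.30, 0.05)
)

# Mean sentiment per year and group.
yearly <- aggregate(sentiment ~ year + position, data = scores, FUN = mean)
yearly <- yearly[order(yearly$year, yearly$position), ]

print(yearly)
```

Plotting `sentiment` against `year` with one line per `position` then gives the opposition vs. government trend.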
- Richter, F., et al. (2020). Open Discourse. Harvard Dataverse.
- Rauh, C. (2018). Validating a sentiment dictionary for German political language. Journal of Information Technology & Politics.
- Thomas, M., Pang, B., & Lee, L. (2006). Get out the vote: Determining support or opposition from Congressional floor-debate transcripts. Proceedings of EMNLP 2006.
- Lou Monnier | lou.monnier@students.unibe.ch
- Simon Bernhard | simon.bernhard@students.unibe.ch
- Kevin Jan Schläpfer | kevin.schlaepfer@students.unibe.ch
University of Bern, Institute of Communication and Media Studies
Spring Semester 2023