David Garcia, 2023
Welcome to the online materials for Social Data Science.
Social Data Science is an emerging field that studies human behavior and social interaction through digital traces. The revolution in measurement brought by our digital society gives us data at global scales, very high frequencies, and unprecedented levels of depth and resolution.
This course covers both the fundamentals and the applications of Data Science in the Social Sciences, including technologies for data retrieval. Students of Social Data Science learn how to plan, execute, and interpret complete Data Science projects to address questions about human behavior. After this course, students will know how to gather data from social media, search trends, and other online and offline sources, how to process and store that data, and how to combine, analyze, and visualize data to address specific questions. The course places special emphasis on the interpretation and critique of Data Science in the Social Sciences, aiming at an interdisciplinary approach that can inform students from various disciplines.
I am the Professor for Social and Behavioral Data Science at the University of Konstanz. You can find more about my research group here: http://dgarcia.eu. My background is in Computer Science, but I have worked my whole career with psychologists, sociologists, and physicists to learn new ways to understand human behavior. I received my PhD from ETH Zurich in 2012 and my habilitation in 2018. I started working as a full professor at TU Graz in 2020 and then at the University of Konstanz in 2022. To learn more about my research, check my publications.
The course is organized as a block course over five days, each covering several topics. There is an R crash course and four exercises for you to apply what you learned in the blocks. In the exercises, you collect your own data to answer Social Data Science questions. The online materials do not contain the solutions to the exercises, but if you are stuck or want an easier starting point, the GitHub folder of each exercise contains a version with hints in the form of parts of the solution code.
- Introduction to Social Data Science
1.1. What is Social Data Science? -- [Slides]
1.2. SDS Story: Google Flu Trends -- [Slides]
1.3. Measuring temporal orientation with Google Trends -- [Slides]
1.4. Measuring correlation -- [Slides]
1.5. R crash course -- [R crash course materials]
1.6. Accessing the World Development Indicators from R -- [Tutorial files]
1.7. Google Trends data in R -- [Tutorial files]
Exercise 1: Future orientation and economic development -- [Exercise materials]
- Social dynamics
2.1. Social Impact Theory -- [Slides]
2.2. The Simmel Effect -- [Slides]
2.3. SDS Story: Baby name trends -- [Slides]
2.4. Linear regression -- [Slides]
2.5. Bootstrapping -- [Tutorial files]
2.6. Data wrangling with dplyr -- [Tutorial files]
2.7. The Twitter API in R -- [Tutorial files]
Exercise 2: Division of impact on Twitter -- [Exercise materials]
- Computational Affective Science
3.1. Measuring emotions -- [Slides]
3.2. Unsupervised sentiment analysis -- [Slides]
3.3. SDS Story: Emotions in pagers after 9/11 -- [Slides]
3.4. Supervised sentiment analysis -- [Slides]
3.5. The Semantic Differential -- [Slides]
3.6. Running unsupervised sentiment analysis in R -- [Tutorial files]
Exercise 3: Evaluating sentiment analysis methods -- [Exercise materials]
- Social network analysis
4.1. Introduction to social networks -- [Slides]
4.2. The Friendship paradox -- [Slides]
4.3. SDS Story: Sampling opinions on Twitter -- [Slides]
4.4. Centrality in social networks -- [Slides]
4.5. Handling network data in R -- [Tutorial files]
4.6. Twitter network data -- [Tutorial files]
Exercise 4: Assortativity among Swiss politicians on Twitter -- [Exercise materials]
- Social network phenomena
5.1. Social resilience -- [Slides]
5.2. SDS Story: The death of social networks -- [Slides]
5.3. Structural holes and communities -- [Slides]
5.4. Assortativity -- [Slides]
5.5. Permutation tests -- [Tutorial materials]
5.6. Network analysis in R -- [Tutorial materials]
- Handouts, code, and data can be found in the GitHub repository of the course.
- Students at ETH Zurich can access the course Moodle to get information about the evaluation criteria for the course and to participate in online quizzes.