This repo contains resources for the Digital Literacy project 2018 @ Aarhus University:
- smurf extractor
- web scraper and Twitter scraper (for historical data)
- language detection and translation
- lemmatizer, POS-tagger and dependency parser for contemporary Danish
- POS-tagger for (contemporary) Scandinavian languages
- CBOW in TensorFlow for training word embeddings (requires GPU acceleration)
- keyword in context for n files (collocations for concordance)
- topic/content analysis tool (LDA and guided LDA)
- DMI-TCAT (to get started, see the introduction video and read the paper by DMI on analyzing the social in Twitter data)
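
To illustrate what the keyword-in-context tool in the list above does, here is a minimal sketch in plain Python. It is not the code in this repo: the function names `kwic` and `kwic_files`, the `window` parameter, and the simple regex tokenizer are all assumptions made for illustration.

```python
import re


def kwic(text, keyword, window=3):
    """Return (left, match, right) tuples for each occurrence of keyword.

    Hypothetical helper: tokenizes on word characters and lowercases
    everything, which is a simplification of real concordance tools.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            # Up to `window` tokens on each side of the match
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def kwic_files(paths, keyword, window=3):
    """Run kwic over several UTF-8 text files; returns {path: hits}."""
    results = {}
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            results[path] = kwic(fh.read(), keyword, window)
    return results
```

For example, `kwic("The cat sat on the mat. The dog saw the cat run.", "cat", window=2)` yields one tuple per occurrence of "cat", each with up to two words of context on either side. Because `\w` is Unicode-aware for Python 3 strings, Danish letters such as æ, ø and å are kept inside tokens.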