text_mining_spanish

Text mining and sentiment analysis with R using the tidytext package - coursework (in Spanish)

Coursework for the Master's Degree in Big Data and Business Analytics at U.N.E.D. Since the syllabus is in Spanish, all the code and comments are left in Spanish.

The coursework consists of an R Markdown script that generates an HTML document. The document contains:

An introduction explaining the tidy philosophy behind the NLP package tidytext, which is a tidy (as per H. Wickham's tidyverse) alternative to another popular package.
A script that reads a Kaggle dataset containing the top 25 headlines for each of 1,989 dates (from 2008 to 2017) on Reddit's r/worldnews.
After cleaning and wrangling the data, I carried out a simple "static" sentiment analysis, i.e. one that does not analyse how the overall sentiment of the headlines changes through time. Analysing that temporal variation could be a future line of work to extract knowledge from the headlines (seasonality of negative sentiment due to natural disasters, wars, general elections, financial crises, etc.).
An additional line of future work is inspired by the Two Sigma Kaggle competition "Using News to Predict Stock Movements": https://www.kaggle.com/c/two-sigma-financial-news.
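
To make the "static" analysis described above more concrete, here is a minimal sketch of a tidy sentiment count. It is not the code from index.Rmd: the toy `headlines` table, its `date` and `headline` column names, and the choice of the Bing lexicon are all assumptions made for illustration.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the Kaggle data: one row per date and headline.
# The real dataset and its column names may differ.
headlines <- tibble(
  date = as.Date(c("2008-08-08", "2008-08-08", "2008-08-11")),
  headline = c(
    "Peace talks bring hope after war",
    "Earthquake disaster kills hundreds",
    "Markets recover as crisis fears fade"
  )
)

# Tidy text format: one token (word) per row. unnest_tokens() lower-cases
# the text and keeps the other columns (here, date).
tidy_words <- headlines %>%
  unnest_tokens(word, headline) %>%
  anti_join(stop_words, by = "word")  # drop common English stop words

# "Static" sentiment: join against the Bing lexicon (an assumed choice)
# and count positive and negative words over the whole corpus,
# ignoring the date column.
static_sentiment <- tidy_words %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(sentiment)

print(static_sentiment)
```

Because `date` survives tokenisation, replacing the last step with `count(date, sentiment)` would be one way to start on the temporal variant mentioned as future work.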

a-valvaq-2086/text_mining_spanish

Folders and files

Latest commit

History

Repository files navigation

text_mining_spanish

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages