andrinr/uzh-data-science

 
 

Newspaper article categorization

A data science project

In this project we analyzed a HuffPost dataset of roughly 200,000 newspaper articles, each with a headline, description, category, and publication date. Using the words of the headline and/or description, we tried to predict the human-labeled category with word embeddings and common machine learning models.
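
The general idea of predicting a category from headline words via word embeddings can be sketched as follows. This is a minimal illustration, not the code from the project notebooks: the three-dimensional vectors in `TOY_EMBEDDINGS`, the example headlines, and the labels are invented for the example, and a simple nearest-centroid rule stands in for the machine learning models mentioned above. In practice one would likely load pretrained embeddings and train a proper classifier instead.

```python
import math
from collections import defaultdict

# Toy 3-dimensional "embeddings", invented for illustration only.
# Real word embeddings are pretrained and have far more dimensions.
TOY_EMBEDDINGS = {
    "election": (0.9, 0.1, 0.0),
    "senate": (0.8, 0.2, 0.1),
    "vote": (0.9, 0.0, 0.1),
    "match": (0.1, 0.9, 0.0),
    "team": (0.0, 0.8, 0.2),
    "coach": (0.1, 0.9, 0.1),
    "recipe": (0.0, 0.1, 0.9),
    "dinner": (0.1, 0.0, 0.8),
    "chef": (0.1, 0.2, 0.9),
}


def embed(text):
    """Average the vectors of all known words; return None if none is known."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return None
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit_centroids(headlines, labels):
    """Compute one mean vector (centroid) per category."""
    grouped = defaultdict(list)
    for headline, label in zip(headlines, labels):
        vector = embed(headline)
        if vector is not None:
            grouped[label].append(vector)
    return {
        label: tuple(sum(dim) / len(vectors) for dim in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(headline, centroids):
    """Return the category whose centroid is most similar to the headline."""
    vector = embed(headline)
    if vector is None:
        return None  # no known word, so no prediction
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical training examples (not taken from the dataset).
train_headlines = [
    "senate vote delayed after election",
    "coach praises team after match",
    "chef shares dinner recipe",
]
train_labels = ["politics", "sports", "food"]

centroids = fit_centroids(train_headlines, train_labels)
print(predict("team wins match", centroids))
```

Averaging word vectors discards word order, which keeps the sketch short; whether that simplification is acceptable depends on the data and the model.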


This project was created in collaboration with Michael Hodel (https://github.com/michaelhodel) for the lecture Introduction to Data Science at the University of Zürich.

The dataset is available on Kaggle: https://www.kaggle.com/rmisra/news-category-dataset.

About

Introduction to Data Science final project documentation, notebooks and presentation.
