Mehrab-Kalantari/News-Popularity-Prediction


News Popularity Prediction

Dataset on Kaggle

Contents

Data understanding and EDA

  • Histogram plot
  • Data queries
  • Box plot
  • Correlation matrix
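
The EDA steps listed above could be sketched roughly as follows with pandas and NumPy. This is a minimal illustration on synthetic data, not the repository's notebook: the column names (`word_count`, `image_count`, `shares`) are hypothetical placeholders, and the plotting calls are only mentioned in comments so the sketch runs without a display.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the news dataset; column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "word_count": rng.integers(100, 2000, size=200),
    "image_count": rng.integers(0, 10, size=200),
})
df["shares"] = 2 * df["word_count"] + rng.normal(0, 100, size=200)

# Histogram: bin counts and bin edges (df["shares"].hist() would draw it).
counts, edges = np.histogram(df["shares"], bins=10)

# Data query: articles with more than five images, most shared first.
top = df.query("image_count > 5").sort_values("shares", ascending=False)

# Box plot statistics: quartiles and interquartile range
# (df.boxplot(column="shares") would draw them).
q1, median, q3 = df["shares"].quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Correlation matrix (Pearson coefficients by default).
corr = df.corr()
print(corr.round(2))
```

With the real dataset, the synthetic frame would presumably be replaced by a `pd.read_csv` call on the Kaggle file.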

Hypothesis tests for a better understanding of the data

  • Pearson correlation test
  • Spearman correlation test
  • Kendall-tau correlation test
  • t-test
  • z-test
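
As a rough sketch of how the tests listed above might be run, the example below uses `scipy.stats` on synthetic samples. The choice of SciPy, the Welch variant of the t-test, and the hand-written helper `two_sample_z_test` are assumptions made for illustration; the README does not say which library or test variants the project uses.

```python
import numpy as np
from scipy import stats

# Synthetic, linearly related samples (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = 0.8 * x + rng.normal(scale=0.5, size=300)

# Correlation tests: each returns a (statistic, p-value) pair.
pearson_r, pearson_p = stats.pearsonr(x, y)        # linear association
spearman_rho, spearman_p = stats.spearmanr(x, y)   # rank (monotonic) association
kendall_tau, kendall_p = stats.kendalltau(x, y)    # concordant vs. discordant pairs

# Two-sample t-test (Welch's variant: no equal-variance assumption).
group_a = rng.normal(loc=0.0, size=150)
group_b = rng.normal(loc=1.0, size=150)
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)


def two_sample_z_test(a, b):
    """Hypothetical helper: two-sided z-test for a difference in means.

    Uses sample variances in place of known population variances,
    which is a reasonable approximation for large samples.
    """
    se = np.sqrt(np.var(a, ddof=1) / len(a) + np.var(b, ddof=1) / len(b))
    z = (np.mean(a) - np.mean(b)) / se
    p = 2 * stats.norm.sf(abs(z))  # two-sided p-value from the normal tail
    return z, p


z_stat, z_p = two_sample_z_test(group_a, group_b)
print(f"pearson r={pearson_r:.2f}, t={t_stat:.2f}, z={z_stat:.2f}")
```

The z statistic here has the same form as Welch's t statistic; the two tests differ only in the reference distribution used for the p-value, so they agree closely for large samples.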

Data preprocessing and feature selection

  • Missing data values
  • Categorical to numerical
    • OHE
  • Outlier detection
    • K-sigma method
  • Feature scaling
    • Standard scaling
    • Min-max normalization
    • Robust scaling
  • Feature selection
    • Forward selection
    • Backward selection
  • Feature extraction
    • PCA
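
The preprocessing steps listed above might be chained roughly as in the sketch below. It assumes pandas and scikit-learn, neither of which the README names, and runs on synthetic data with hypothetical column names. The choices of median imputation, `k = 3`, and two selected features or components are illustrative, not taken from the repository.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Synthetic stand-in data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "word_count": rng.normal(800, 200, size=n),
    "image_count": rng.normal(5, 2, size=n),
    "link_count": rng.normal(10, 3, size=n),
    "channel": rng.choice(["tech", "world", "sport"], size=n),
})
df["shares"] = (3 * df["word_count"] + 50 * df["image_count"]
                + rng.normal(0, 50, size=n))
df.loc[[0, 1, 2], "link_count"] = np.nan  # simulate missing values
df.loc[3, "link_count"] = 1000.0          # simulate an outlier

# Missing data values: fill numeric gaps with the column median.
df["link_count"] = df["link_count"].fillna(df["link_count"].median())

# Categorical to numerical: one-hot encoding (OHE).
df = pd.get_dummies(df, columns=["channel"], dtype=float)

# Outlier detection, k-sigma: drop rows more than k standard
# deviations from the mean in any numeric feature.
k = 3
numeric = ["word_count", "image_count", "link_count"]
z = (df[numeric] - df[numeric].mean()) / df[numeric].std()
df = df[(z.abs() <= k).all(axis=1)]

X = df.drop(columns="shares")
y = df["shares"]

# Feature scaling: three alternatives; normally only one is used.
X_std = StandardScaler().fit_transform(X)    # zero mean, unit variance
X_minmax = MinMaxScaler().fit_transform(X)   # rescaled to [0, 1]
X_robust = RobustScaler().fit_transform(X)   # median-centered, IQR-scaled

# Feature selection: greedy forward and backward selection, scored by
# cross-validated R^2 of a linear model.
forward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2, direction="forward", cv=5
).fit(X_std, y)
backward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2, direction="backward", cv=5
).fit(X_std, y)
forward_cols = list(X.columns[forward.get_support()])
backward_cols = list(X.columns[backward.get_support()])

# Feature extraction: project onto the first two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)
print(forward_cols, backward_cols, pca.explained_variance_ratio_.round(2))
```

In a real workflow the scalers, selectors, and PCA would be fitted on the training split only, to avoid leaking information from the test data.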

Modeling (Regression)

  • Linear regression
  • Polynomial regression
  • Ridge regression
  • Lasso regression
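
For illustration, the four regression models listed above could be fitted and compared as in the sketch below. It assumes scikit-learn and uses synthetic data with a known quadratic term. The polynomial degree and the `alpha` values are arbitrary choices for the example, not the repository's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: the target depends linearly on two features and
# quadratically on the first; the third feature is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 0] ** 2
     + rng.normal(scale=0.1, size=400))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # Ordinary least squares on the raw features.
    "linear": LinearRegression(),
    # Least squares on degree-2 polynomial features.
    "polynomial": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    # L2-penalized least squares: shrinks coefficients toward zero.
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    # L1-penalized least squares: can set coefficients exactly to zero.
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on held-out data
    print(f"{name}: R2 = {scores[name]:.3f}")
```

Because the synthetic target contains a squared term, the polynomial model should score highest here; on real data the ranking depends on the dataset.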

Evaluation

  • R² score
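
The coefficient of determination is commonly defined as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand with NumPy to make the definition concrete. The function name `r2_score_manual` is hypothetical, and the project may simply rely on a library implementation such as scikit-learn's `r2_score`.

```python
import numpy as np


def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot for 1-D targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
r2 = r2_score_manual(y_true, y_pred)
print(round(r2, 3))  # 0.949
```

A score of 1 means perfect predictions, 0 matches always predicting the mean of the targets, and negative values indicate a model worse than that baseline.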