Regression Project

This case study works through a regression predictive modeling problem in Python, covering each step of the applied machine learning process. Some of the steps covered:

How to use data transforms to improve model performance.
How to use algorithm tuning to improve model performance.
How to use ensemble methods and tuning of ensemble methods to improve model performance.
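
To illustrate why a data transform can help, here is a minimal sketch of standardization (rescaling each attribute to zero mean and unit variance) in plain Python. The toy values and the helper names `standardize` and `euclidean` are assumptions for illustration and are not taken from the notebook. Distance-based models such as KNN are sensitive to attribute scale, so the nearest neighbor of a row can change once the attributes are standardized.

```python
import math

def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0  # guard constant columns
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up rows with two attributes on very different scales:
# the first is in the hundreds, the second lies between 0 and 1.
rows = [
    [300.0, 0.40],
    [310.0, 0.90],
    [340.0, 0.42],
    [700.0, 0.10],
]

candidates = (1, 2, 3)
# On raw values the large-scale attribute dominates the distance.
raw_nearest = min(candidates, key=lambda i: euclidean(rows[0], rows[i]))
# After standardization both attributes contribute comparably.
scaled = standardize(rows)
scaled_nearest = min(candidates, key=lambda i: euclidean(scaled[0], scaled[i]))
print(raw_nearest, scaled_nearest)
```

In this toy case the nearest neighbor of the first row differs before and after scaling, which is the kind of effect that can make a standardized KNN behave differently from an unscaled one.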

Problem Definition

For this project the Boston House Price dataset was investigated. Each record in the dataset describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970.

The attributes are defined as follows (taken from the UCI Machine Learning Repository: http://lib.stat.cmu.edu/datasets/boston):

CRIM: per capita crime rate by town
ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS: proportion of non-retail business acres per town
CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
NOX: nitric oxides concentration (parts per 10 million)
RM: average number of rooms per dwelling
AGE: proportion of owner-occupied units built prior to 1940
DIS: weighted distances to five Boston employment centers
RAD: index of accessibility to radial highways
TAX: full-value property-tax rate per $10,000
PTRATIO: pupil-teacher ratio by town
B: 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town
LSTAT: % lower status of the population
MEDV: Median value of owner-occupied homes in $1000s
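
As a small illustration of the attribute list above, the sketch below maps one whitespace-separated record onto those 14 names and separates the inputs from the value to predict. The helpers `parse_record` and `split_features_target` are hypothetical, the sample row is included only for illustration, and the code assumes that each record sits on a single line and that MEDV is the prediction target, as is conventional for this dataset.

```python
# Column names in the order given in the attribute list.
COLUMNS = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV",
]

def parse_record(line):
    """Turn one whitespace-separated record into a {name: float} dict."""
    values = [float(token) for token in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))

def split_features_target(record):
    """Separate the 13 input attributes from the MEDV target (assumed target)."""
    features = [record[name] for name in COLUMNS[:-1]]
    return features, record["MEDV"]

# Illustrative row only (assumed sample values, one record per line).
sample = "0.00632 18.0 2.31 0 0.538 6.575 65.2 4.09 1 296.0 15.3 396.9 4.98 24.0"
record = parse_record(sample)
features, target = split_features_target(record)
print(len(features), target)
```

Keeping the names in one list means the same order can be reused when reading the full file, for example as the column names of a data frame.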

Summary of the Jupyter Notebook

Problem Definition (Boston house price data).
Loading the Dataset.
Analyze Data (some skewed distributions and correlated attributes).
Evaluate Algorithms (Linear Regression looked good).
Evaluate Algorithms with Standardization (KNN looked good).
Algorithm Tuning (K=3 for KNN was best).
Ensemble Methods (Bagging and Boosting, Gradient Boosting looked good).
Tuning Ensemble Methods (getting the most from Gradient Boosting).
Finalize Model (use all training data and confirm using validation dataset).
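
The tuning and finalization steps in the summary above can be sketched roughly as follows. This is not the notebook's code: it assumes scikit-learn (which this README does not name), uses synthetic stand-in data from `make_regression` instead of the Boston data, and the parameter grids, fold count, and random seed are hypothetical choices made only for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 13 inputs, like the attribute list.
X, y = make_regression(n_samples=200, n_features=13, noise=10.0, random_state=7)

# Hold back a validation set that is never used during tuning.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=7
)
kfold = KFold(n_splits=5, shuffle=True, random_state=7)

# Standardization sits inside the pipeline so it is refit on each training fold.
knn = Pipeline([("scaler", StandardScaler()), ("knn", KNeighborsRegressor())])
knn_search = GridSearchCV(
    knn,
    param_grid={"knn__n_neighbors": [1, 3, 5, 7, 9]},
    scoring="neg_mean_squared_error",
    cv=kfold,
)
knn_search.fit(X_train, y_train)

# Tune the number of boosting stages for a gradient boosting ensemble.
gbm_search = GridSearchCV(
    GradientBoostingRegressor(random_state=7),
    param_grid={"n_estimators": [50, 100, 200]},
    scoring="neg_mean_squared_error",
    cv=kfold,
)
gbm_search.fit(X_train, y_train)

# Keep the search with the better (less negative) cross-validated score.
best = max((knn_search, gbm_search), key=lambda s: s.best_score_)

# GridSearchCV refits the best configuration on all training data by default,
# so `best` can be checked directly against the held-out validation set.
val_mse = mean_squared_error(y_val, best.predict(X_val))
print(best.best_params_, round(val_mse, 2))
```

Which model and settings come out ahead depends on the data, so results on the synthetic set say nothing about the findings listed in the summary.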

MariaClaraMendes/Regression-Project
