Data Preprocessing, Logistic Regression and AdaBoost from scratch
In this assignment, we had to implement logistic regression and AdaBoost from scratch and train them on three different datasets. The datasets were preprocessed, and the models were then trained and tested on them. The tasks were to:
Preprocess the data
Remove outliers
Handle missing values
Handle imbalanced data
Scale the data
Encode the categorical data
Implement logistic regression and AdaBoost from scratch
Train and test the models on the datasets
With Logistic Regression
With AdaBoost (Ensemble of weak Logistic Regression models)
Compare the results of the Logistic Regression and AdaBoost models
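The two models in the task list can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the function names, learning rate, epoch counts, and number of boosting rounds are all assumptions, and the weak learners here use a weighted gradient rather than resampling the data, which is only one of several ways to feed AdaBoost's sample weights to logistic regression.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Prepend a column of ones so the first weight acts as the intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, sample_weight=None, lr=0.1, epochs=500):
    """Batch gradient descent on the (weighted) log loss; y must be 0/1."""
    n = X.shape[0]
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    if sample_weight is None:
        sw = np.full(n, 1.0 / n)
    else:
        sw = sample_weight / sample_weight.sum()
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * (Xb.T @ (sw * (p - y)))  # weighted gradient of the log loss
    return w

def predict_logistic(w, X):
    return (sigmoid(add_bias(X) @ w) >= 0.5).astype(int)

def fit_adaboost(X, y, n_rounds=5, epochs=50):
    """AdaBoost with briefly trained (weak) logistic regression learners."""
    n = X.shape[0]
    D = np.full(n, 1.0 / n)          # sample weights, start uniform
    learners, alphas = [], []
    for _ in range(n_rounds):
        w = fit_logistic(X, y, sample_weight=D, epochs=epochs)
        pred = predict_logistic(w, X)
        err = np.sum(D * (pred != y))
        if err >= 0.5:               # no better than chance: stop boosting
            break
        err = max(err, 1e-10)        # avoid division by zero for a perfect learner
        alpha = 0.5 * np.log((1.0 - err) / err)
        # Map labels and predictions to -1/+1, then up-weight the mistakes.
        D = D * np.exp(-alpha * (2 * y - 1) * (2 * pred - 1))
        D = D / D.sum()
        learners.append(w)
        alphas.append(alpha)
    return learners, alphas

def predict_adaboost(learners, alphas, X):
    score = np.zeros(X.shape[0])
    for w, a in zip(learners, alphas):
        score += a * (2 * predict_logistic(w, X) - 1)  # weighted vote in -1/+1
    return (score >= 0).astype(int)
```

Keeping `epochs` small for the boosted learners is what makes them "weak" in this sketch; the standalone logistic regression model uses the same `fit_logistic` with more iterations.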
When to use min-max scaling and when to use standard scaling
When to use one-hot encoding and when to use label encoding
If the test data does not contain a categorical value that is present in the training data, one-hot encoding the two sets separately will fail, because the encoded test set ends up missing a column,
as is the case with {'native-country_Holand-Netherlands'} in the adult dataset
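One way to work around the missing-column problem described above is to encode both sets and then align the test columns to the training columns. The sketch below assumes pandas is used for encoding (the `native-country_Holand-Netherlands` column name matches the naming of `pd.get_dummies`); the three-row frames are toy stand-ins, not rows from the adult dataset.

```python
import pandas as pd

# Toy data: the training set contains a category the test set lacks.
train = pd.DataFrame({"native-country": ["United-States", "Holand-Netherlands", "India"]})
test = pd.DataFrame({"native-country": ["India", "United-States"]})

train_enc = pd.get_dummies(train, columns=["native-country"], dtype=int)
test_enc = pd.get_dummies(test, columns=["native-country"], dtype=int)

# Columns that exist after encoding the training set but not the test set.
missing = set(train_enc.columns) - set(test_enc.columns)
print(missing)

# Align the test set to the training columns; absent categories become all-zero columns.
test_aligned = test_enc.reindex(columns=train_enc.columns, fill_value=0)
print(list(test_aligned.columns) == list(train_enc.columns))
```

Reindexing against the training columns also drops any category that appears only in the test set, which keeps the feature layout identical to what the model was trained on.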
Removing outliers and handling imbalanced data
The outliers in the Credit Card dataset should not be removed, because they are most likely the fraudulent transactions that we are trying to detect