Logistic Regression model built using a housing dataset to predict whether a house is high-priced or not based on features like area, number of bedrooms, bathrooms, and stories. The project includes a comparison between models trained on unscaled vs. scaled data, demonstrating the effect of feature scaling on model performance.

vinaykumar2331/housing-logistic-regression

🏠 Housing Price Classification using Logistic Regression

This project applies Logistic Regression to a housing dataset to classify whether a house is high-priced, using four features: area, number of bedrooms, number of bathrooms, and number of stories.
The model is trained once on unscaled data and once on standardized data so that the effect of feature scaling on classification performance can be compared directly.


📁 Dataset

  • File: Housing.csv
  • Target Variable: HighPrice (binary: 1 = high-priced, 0 = not high-priced)
  • Features Used:
    • area
    • bedrooms
    • bathrooms
    • stories

If the dataset does not already contain a binary HighPrice column, it is derived from price: houses priced strictly above the median are labeled 1, all others 0:

df['HighPrice'] = (df['price'] > df['price'].median()).astype(int)
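
As a small illustration of the labeling rule above, the sketch below applies it to a toy DataFrame. The prices are made up for the example and are not taken from Housing.csv. Because the comparison is strict, a house priced exactly at the median is labeled 0.

```python
import pandas as pd

# Toy prices standing in for the 'price' column (hypothetical values).
df = pd.DataFrame({"price": [3_000_000, 4_500_000, 5_000_000, 5_000_000, 9_000_000]})

median_price = df["price"].median()

# Same rule as in the README: 1 if strictly above the median, else 0.
df["HighPrice"] = (df["price"] > median_price).astype(int)

print(median_price)               # 5000000.0
print(df["HighPrice"].tolist())   # [0, 0, 0, 0, 1]
```

With ties at the median, fewer than half of the rows end up in Class 1, so the two classes are not guaranteed to be perfectly balanced.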

🧠 Machine Learning Model

  • Model: Logistic Regression (scikit-learn)
  • Scaler Used: StandardScaler (for scaled version)
  • Data Split: 80% training, 20% testing
  • Evaluation:
    • Accuracy score
    • Classification report (Precision, Recall, F1-Score)
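
The setup listed above can be sketched roughly as follows. This is not the notebook's actual code: the data is synthetic so the example runs without Housing.csv, and the price formula, `random_state=42`, and `max_iter=1000` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for Housing.csv (assumption: real data has these columns).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "area": rng.integers(1500, 12000, n),
    "bedrooms": rng.integers(1, 6, n),
    "bathrooms": rng.integers(1, 4, n),
    "stories": rng.integers(1, 5, n),
})
# Made-up price formula plus noise, only to give the label some signal.
df["price"] = (
    400 * df["area"]
    + 300_000 * df["bedrooms"]
    + 800_000 * df["bathrooms"]
    + 400_000 * df["stories"]
    + rng.normal(0, 500_000, n)
)
df["HighPrice"] = (df["price"] > df["price"].median()).astype(int)

X = df[["area", "bedrooms", "bathrooms", "stories"]]
y = df["HighPrice"]

# 80% training, 20% testing; random_state is an assumption for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Model 1: raw (unscaled) features.
unscaled_model = LogisticRegression(max_iter=1000)
unscaled_model.fit(X_train, y_train)
acc_unscaled = accuracy_score(y_test, unscaled_model.predict(X_test))

# Model 2: standardized features. The scaler is fitted on the training
# split only, then applied to the test split, to avoid leaking test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

scaled_model = LogisticRegression(max_iter=1000)
scaled_model.fit(X_train_scaled, y_train)
pred_scaled = scaled_model.predict(X_test_scaled)
acc_scaled = accuracy_score(y_test, pred_scaled)

print(f"Unscaled accuracy: {acc_unscaled:.2f}")
print(f"Scaled accuracy:   {acc_scaled:.2f}")
print(classification_report(y_test, pred_scaled))
```

Scores from this synthetic data will not match the figures reported for the real dataset. The point is only the structure: one split, two models, and a scaler fitted on the training portion alone.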

⚖️ Model Performance: Scaled vs. Unscaled

| Metric              | Unscaled Model | Scaled Model |
|---------------------|----------------|--------------|
| Accuracy            | 0.76           | 0.77         |
| Precision (Class 0) | 0.74           | 0.75         |
| Recall (Class 0)    | 0.92           | 0.91         |
| F1-Score (Class 0)  | 0.82           | 0.82         |
| Precision (Class 1) | 0.83           | 0.81         |
| Recall (Class 1)    | 0.53           | 0.58         |
| F1-Score (Class 1)  | 0.65           | 0.68         |
| Macro Avg F1        | 0.73           | 0.75         |
| Weighted Avg F1     | 0.75           | 0.76         |

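✅ Observation: Scaling slightly improved overall performance: accuracy rose from 0.76 to 0.77 and macro-average F1 from 0.73 to 0.75. The main gain is in recall for high-priced properties (Class 1), which rose from 0.53 to 0.58, at the cost of a small drop in Class 1 precision (0.83 to 0.81).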
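
The per-class F1 scores in the table can be cross-checked from the reported precision and recall, since F1 is their harmonic mean. The sketch below does that arithmetic using only the rounded figures from the table. The `f1` helper and the `reported` dictionary are names introduced for this example.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1), copied from the table.
reported = {
    ("unscaled", 0): (0.74, 0.92, 0.82),
    ("unscaled", 1): (0.83, 0.53, 0.65),
    ("scaled", 0): (0.75, 0.91, 0.82),
    ("scaled", 1): (0.81, 0.58, 0.68),
}

for (model, cls), (p, r, f1_reported) in reported.items():
    recomputed = round(f1(p, r), 2)
    print(f"{model:8s} class {cls}: recomputed {recomputed:.2f}, reported {f1_reported:.2f}")
```

All four recomputed values agree with the table. The macro and weighted averages are normally computed from unrounded per-class scores, so recomputing them from the rounded figures may differ by 0.01.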


💻 How to Use This Notebook

  1. Open the Jupyter Notebook: housing_model.ipynb
  2. Run each cell in order to:
    • Load and preprocess the data
    • Train logistic regression models (scaled and unscaled)
    • Evaluate and compare the results

📌 You can run the notebook in:

  • Jupyter Lab
  • Jupyter Notebook
  • VS Code (with Python + Jupyter extensions)

📦 Requirements

Install required Python libraries before running the notebook:

  • pandas
  • numpy
  • scikit-learn

🏷️ Tags

logistic-regression housing-data machine-learning binary-classification scikit-learn feature-scaling standardscaler data-analysis predictive-modeling python real-estate classification-model
