This project predicts house prices using Machine Learning (Linear Regression and Random Forest).
It demonstrates a complete Data Science workflow — from data analysis to model evaluation.
The goal is to build a predictive model that estimates house prices from features such as:
- Area (sq. ft)
- Number of Bedrooms
- Number of Bathrooms
- City
- Furnishing type
- Parking availability
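
As a rough illustration of how these features might be laid out, here is a minimal sketch. The column names, cities, and all values are made up for the example and are not the project's actual dataset. It also shows one common way to handle the categorical columns (city and furnishing type) by one-hot encoding them with `pandas.get_dummies`; the project itself may encode them differently.

```python
import pandas as pd

# Hypothetical sample rows: names and values are illustrative only,
# not taken from the project's dataset.
df = pd.DataFrame({
    "area": [1200, 850, 2000, 1500],          # sq. ft
    "bedrooms": [3, 2, 4, 3],
    "bathrooms": [2, 1, 3, 2],
    "city": ["Pune", "Delhi", "Mumbai", "Pune"],
    "furnishing": ["furnished", "unfurnished", "semi-furnished", "furnished"],
    "parking": [1, 0, 1, 1],                  # 1 = available, 0 = not available
    "price": [250_000, 180_000, 520_000, 310_000],
})

# Separate the target, then one-hot encode the categorical columns
# so that regression models receive purely numeric inputs.
X = pd.get_dummies(
    df.drop(columns="price"),
    columns=["city", "furnishing"],
    dtype=int,
)
y = df["price"]

print(X.columns.tolist())
```
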
- Importing Libraries – Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
- Data Creation/Loading – Sample dataset with features
- Data Cleaning & Understanding – info(), describe(), missing values
- EDA (Exploratory Data Analysis) – Heatmap, Pairplot, Scatterplots
- Data Splitting – Train & Test sets
- Feature Scaling – StandardScaler
- Model Training – Linear Regression & Random Forest
- Model Evaluation – R² Score, MSE, RMSE comparison
- Feature Importance – Which features influence price most
- Model Saving – Export trained model using joblib
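
The steps above, from splitting through saving, can be sketched roughly as follows. This is not the project's notebook code: the data is synthetic (generated so that price depends mostly on area), the hyperparameters are arbitrary, and the output filename `house_price_model.joblib` is a placeholder. Because the data is synthetic, the scores it prints will not match the results reported for the real dataset.

```python
import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): price is driven mostly by area.
rng = np.random.default_rng(42)
n = 200
area = rng.uniform(500, 3000, n)
bedrooms = rng.integers(1, 6, n)
bathrooms = rng.integers(1, 4, n)
feature_names = ["area", "bedrooms", "bathrooms"]
X = np.column_stack([area, bedrooms, bathrooms])
y = 100 * area + 20_000 * bedrooms + 10_000 * bathrooms + rng.normal(0, 10_000, n)

# Data splitting: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Feature scaling: fit on the training set only, then apply to both sets.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Model training and evaluation.
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}
results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    mse = mean_squared_error(y_test, pred)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
    }
    print(f"{name}: R2={results[name]['r2']:.4f}, MSE={mse:,.0f}")

# Feature importance from the fitted Random Forest (values sum to 1).
importances = dict(zip(feature_names, models["Random Forest"].feature_importances_))
print(importances)

# Model saving: placeholder filename.
model_path = "house_price_model.joblib"
joblib.dump(models["Random Forest"], model_path)
```

A saved model can be read back with `joblib.load(model_path)`. New inputs must go through the same fitted scaler before calling `predict`, so in practice the scaler is usually saved alongside the model.
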
| Model | R² Score | MSE |
|---|---|---|
| Linear Regression | 0.9538 | 275,793,442 |
| Random Forest | 0.9834 | 98,688,840 |
✅ Random Forest performed best
🎯 Top Features: area, bedrooms
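
The table reports MSE only. RMSE is the square root of MSE and is expressed in the same units as price, which makes the error easier to interpret. A small sketch deriving it from the MSE values in the table:

```python
import math

# MSE values copied from the results table.
mse = {
    "Linear Regression": 275_793_442,
    "Random Forest": 98_688_840,
}

# RMSE is the square root of MSE.
rmse = {name: math.sqrt(value) for name, value in mse.items()}

for name, value in rmse.items():
    print(f"{name}: RMSE = {value:,.0f}")
# Linear Regression: RMSE = 16,607
# Random Forest: RMSE = 9,934
```
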
- Understanding regression algorithms
- Data preprocessing and scaling
- Model evaluation metrics (R², MSE)
- Comparing ML models
- Visualizing feature importance
- Python 🐍
- Pandas, NumPy
- Matplotlib, Seaborn
- Scikit-learn
- Joblib
- Clone this repo: `git clone https://github.com/<your-username>/House-Price-Prediction.git`