Discover the hidden patterns behind what drives house prices!
This project performs an in-depth exploratory data analysis (EDA) on the Kaggle House Prices - Advanced Regression Techniques dataset using Python and popular data science libraries.
- Rows: 1,460 | Columns: 81
- Source: Kaggle Competition
- Target Variable: SalePrice
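As a quick sanity check on the figures above, the dataset can be loaded and its shape compared with the expected 1,460 rows and 81 columns. The sketch below is only an illustration: the file name `train.csv` and the helper names `load_house_prices` and `summarize` are assumptions, since this README only states that the data lives in the input/ folder.

```python
from pathlib import Path

import pandas as pd

# Assumed location and file name; adjust to match the file you downloaded.
DATA_PATH = Path("input") / "train.csv"


def load_house_prices(path=DATA_PATH):
    """Read the CSV and make sure the target column is present."""
    df = pd.read_csv(path)
    if "SalePrice" not in df.columns:
        raise ValueError("expected a SalePrice column (the target variable)")
    return df


def summarize(df):
    """Return the row and column counts alongside the target name."""
    return {"rows": df.shape[0], "columns": df.shape[1], "target": "SalePrice"}


if __name__ == "__main__":
    # Only runs when the data file is actually present.
    if DATA_PATH.exists():
        print(summarize(load_house_prices()))
```

If the printed counts differ from the ones listed above, the file in input/ is probably not the training split of the competition data.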
- Clone the repository: git clone https://github.com/Rivu5555/House-Regression.git && cd house-prices-eda
- Install dependencies: pip install -r requirements.txt
- Run the Notebook: jupyter notebook house_prices_eda.ipynb
If the data file is not present, download it from the Kaggle competition page and place it in the input/ folder.
- Missing Value Analysis: Bar plots to visualize missing data columns.
- Key Feature Relationships:
  - GarageArea vs SalePrice
  - OverallQual vs SalePrice
  - SaleType impact
- Visual Insights:
  - Histograms, KDE plots, Boxplots
  - Scatter plots for feature relationships
- Major Findings:
  - Higher OverallQual and GarageArea often predict higher SalePrice
  - Certain SaleTypes are linked to price outliers
  - Notable missing data in some categorical features
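The analyses listed above can be approximated with a few pandas calls. This is a minimal sketch, not the notebook's actual code: the function names are hypothetical, and it assumes a DataFrame that contains the OverallQual, GarageArea, SaleType, and SalePrice columns named in this README.

```python
import pandas as pd


def missing_value_counts(df):
    """Columns with at least one missing value, most affected first."""
    counts = df.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


def correlation_with_price(df, features=("OverallQual", "GarageArea")):
    """Pearson correlation of each numeric feature with SalePrice."""
    return df[list(features)].corrwith(df["SalePrice"])


def median_price_by(df, column="SaleType"):
    """Median SalePrice per category, highest first."""
    return df.groupby(column)["SalePrice"].median().sort_values(ascending=False)


def plot_missing(counts):
    """Bar plot of the missing-value counts (needs matplotlib installed)."""
    import matplotlib.pyplot as plt  # imported lazily so the helpers above work without it

    ax = counts.plot(kind="bar", figsize=(10, 4))
    ax.set_ylabel("Missing values")
    ax.set_title("Missing values per column")
    plt.tight_layout()
    return ax
```

A typical session might call `plot_missing(missing_value_counts(df))` first, then inspect `correlation_with_price(df)` and `median_price_by(df)` to see whether the relationships described in the findings show up in your copy of the data.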
- Python 3.8+
- pandas
- numpy
- matplotlib
- seaborn
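Before opening the notebook, it can help to confirm that the environment meets the requirements above. The following is a rough sketch using only the standard library; `check_environment` is a hypothetical helper, not part of this repository.

```python
import sys
from importlib import metadata

# Package names taken from the requirements list above.
REQUIRED = ("pandas", "numpy", "matplotlib", "seaborn")


def check_environment(packages=REQUIRED, min_python=(3, 8)):
    """Report the Python version check and each package's installed version.

    A package that is not installed is reported as None.
    """
    report = {"python_ok": sys.version_info >= min_python}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Any entry printed as None points to a package that still needs to be installed, for example by rerunning pip install -r requirements.txt.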
Pull requests are welcome! For major changes, open an issue first to discuss what you would like to change.
Happy Analyzing!
