GitHub - mayankbaluni/CaliHousingPredictor: Efficient Linear Regression Analysis for California Housing Market Prediction

California Housing Price Prediction with Linear Regression

This repository contains a simple and efficient implementation of a Linear Regression model for predicting housing prices in California using the popular California Housing dataset.

Overview

The code performs the following steps:

Imports necessary libraries: numpy, sklearn.datasets, sklearn.model_selection, sklearn.linear_model, sklearn.metrics
Fetches the California Housing dataset: Using fetch_california_housing from sklearn.datasets.
Splits the data: Uses train_test_split from sklearn.model_selection to hold out 20% of the data for testing and train on the remaining 80%, with random_state=42 for reproducibility.
Initializes and trains a Linear Regression model: Uses LinearRegression from sklearn.linear_model to create a model and fits it on the training data.
Predicts housing prices: Predicts the housing prices for the testing data using the trained model.
Evaluates model performance: Calculates R2 Score and Mean Squared Error (MSE) using r2_score and mean_squared_error from sklearn.metrics to assess the model's accuracy and error.
Prints results: Displays the R2 Score and MSE values.
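
The steps above can be sketched roughly as follows. This is a minimal illustration written from the description in this README, not a copy of california_housing_lr.py; the function names train_and_evaluate and main are assumptions introduced here.

```python
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(X, y, test_size=0.2, random_state=42):
    """Fit a linear regression on a train split and score it on the held-out split.

    Hypothetical helper: returns (r2, mse) computed on the test split.
    """
    # 80/20 split with a fixed seed so the split is the same on every run.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return r2_score(y_test, y_pred), mean_squared_error(y_test, y_pred)


def main():
    # Downloads the dataset on first use and caches it locally.
    X, y = fetch_california_housing(return_X_y=True)
    r2, mse = train_and_evaluate(X, y)
    print(f"R2 Score: {r2:.4f}")
    print(f"Mean Squared Error: {mse:.4f}")
```

Calling main() runs the sketch on the real dataset. Keeping the split and fit in a separate function makes it easy to try the same procedure on other data.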

Running the code

Clone this repository.
Open a terminal in the project directory.
Install the dependencies with the following command:

pip install -r requirements.txt

Run the script with the following command:

python california_housing_lr.py

This will print the R2 Score and Mean Squared Error for the model.

Results

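Because the split uses a fixed random state of 42, the R2 Score and Mean Squared Error should be the same from run to run, although small differences can arise across library versions. You can expect to see an R2 Score of about 0.58 and an MSE of about 0.56. The target is the median house value in units of $100,000, so the MSE is expressed in squared units of $100,000.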

Interesting Inferences

The Linear Regression model explains a little over half of the variance in house values (R2 ~ 0.58), so there is still room for improvement. Exploring other models or feature engineering could further enhance prediction accuracy.
An MSE of about 0.56 corresponds to a root mean squared error of about 0.75 in the target's units of $100,000, that is, a typical prediction error of roughly $75,000. This may be acceptable for some applications, but for others, a more precise model might be necessary.
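
To read an MSE as a dollar figure, take its square root (the RMSE) and scale by the target's unit. The California Housing target in scikit-learn is the median house value in units of $100,000. The short sketch below shows the arithmetic; the helper name mse_to_rmse_dollars and the example value 0.56 are illustrative assumptions, not output from this repository's script.

```python
import math

# The dataset's target is expressed in units of $100,000.
TARGET_UNIT_DOLLARS = 100_000


def mse_to_rmse_dollars(mse):
    """Convert an MSE in squared target units to an RMSE in dollars."""
    return math.sqrt(mse) * TARGET_UNIT_DOLLARS


# Illustrative MSE value; substitute the number printed by the script.
print(round(mse_to_rmse_dollars(0.56)))
```

The RMSE is easier to interpret than the MSE because it is in the same units as the prices themselves.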

Further exploration

Try different machine learning models like Random Forest or Gradient Boosting and compare their performance to the Linear Regression model.
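Experiment with feature engineering techniques like scaling or adding interaction terms to improve the model's ability to capture complex relationships between features. Note that scaling alone does not change the predictions of an ordinary least-squares model, but it makes coefficients comparable and matters for regularized models.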
Analyze the model's coefficients to understand which features have the strongest impact on housing prices.
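
As a starting point for these experiments, the sketch below compares Linear Regression with a Random Forest on the same split and ranks features by the magnitude of their standardized coefficients. It is one possible approach, not part of this repository; the function names compare_models and rank_standardized_coefficients, and the choice of 50 trees, are assumptions made for illustration.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, random_state=42):
    """Return a dict of test-set R2 scores for each candidate model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    models = {
        "linear_regression": LinearRegression(),
        # n_estimators=50 is an arbitrary choice to keep the sketch fast.
        "random_forest": RandomForestRegressor(
            n_estimators=50, random_state=random_state
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


def rank_standardized_coefficients(X, y, feature_names):
    """Rank features by absolute coefficient after standardizing the inputs.

    Standardizing puts all features on the same scale, so the coefficient
    magnitudes can be compared with each other.
    """
    pipeline = make_pipeline(StandardScaler(), LinearRegression())
    pipeline.fit(X, y)
    coefs = pipeline[-1].coef_  # the fitted LinearRegression step
    return sorted(
        zip(feature_names, coefs), key=lambda pair: abs(pair[1]), reverse=True
    )
```

For the real dataset, the object returned by fetch_california_housing() exposes data, target, and feature_names, which can be passed to these helpers directly.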

This project serves as a starting point for exploring housing price prediction with machine learning. Feel free to experiment further and contribute your findings!

License

This code is licensed under the MIT License. See the LICENSE file for details.
