manishkolla/Zillow-Home-Value-Prediction

Zillow-Home-Value-Prediction

CSC 4780/CSC 6780 Final Class Project

Dr. Rafal A. Angryk

Made in collaboration with

1. Manish Kolla

2. Hemanth Gorrepati

3. Sumanth Chinnamudiam

Data Source: https://www.zillow.com/research/data/

Google Drive folder for running the Colab or .ipynb file: https://drive.google.com/drive/folders/1lAp-y5FHtPeJnoGu4oDddtiJq9kcamgI?usp=drive_link

Abstract:

The year-over-year increase in house prices affects several factors of the economy, such as inflation, higher interest rates, and the rising cost of raw materials. We decided to build a machine learning model that is resistant to most market trends, using careful imputation and modeling to reduce the prediction error as much as possible. The models we experimented with include Random Forest and Linear Regression. For data imputation, we experimented with several methods: replacing missing values with the median of the state prices, KNN imputation, and forward/backward filling.
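Two of the imputation strategies named above can be illustrated with a minimal, dependency-free sketch. The function names, the sample prices, and the `(state, price)` row layout are assumptions made for illustration and are not taken from the project notebook; in practice pandas (`ffill`/`bfill`) and scikit-learn (`KNNImputer`) provide library equivalents, and KNN imputation is left out here for brevity.

```python
from statistics import median


def fill_forward_backward(series):
    """Fill None gaps with the previous known value, then the next known value."""
    filled = list(series)
    # Forward pass: carry the last observed value into later gaps.
    last = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last
        else:
            last = value
    # Backward pass: fill any remaining leading gaps from the next observed value.
    nxt = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled


def fill_with_state_median(rows):
    """Replace missing prices with the median of observed prices in the same state."""
    observed = {}
    for state, price in rows:
        if price is not None:
            observed.setdefault(state, []).append(price)
    medians = {state: median(prices) for state, prices in observed.items()}
    # States with no observed price at all stay None.
    return [
        (state, price if price is not None else medians.get(state))
        for state, price in rows
    ]


# Hypothetical monthly prices for one region, with gaps.
monthly_prices = [None, 200_000, None, None, 215_000, None]
print(fill_forward_backward(monthly_prices))

# Hypothetical (state, price) rows for a single month.
regions = [("GA", 250_000), ("GA", None), ("GA", 310_000), ("TX", 280_000), ("TX", None)]
print(fill_with_state_median(regions))
```

Forward/backward filling suits time-ordered columns, where neighboring months are informative, while the state median borrows information from other regions in the same state at the same point in time.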

Introduction:

The rapidly rising cost of housing in the United States has significant implications for the economy, driving inflation, higher interest rates, and increased expenses across various sectors. To better understand these dynamics and predict future trends, we applied data science methods to a comprehensive dataset from Zillow, a leading real estate platform. Through data exploration and visualization, we identified intriguing patterns, including unexpected house price drops in specific states during certain periods. These patterns motivated us to look deeper and develop a model for house price prediction. Under the guidance of Professor Angryk, we implemented careful data processing techniques, including several strategies for replacing null values, to improve the accuracy and reliability of our analysis. We hope this work contributes useful insight into the complex dynamics of the US housing market and supports informed decision-making by individuals, businesses, and policymakers.

Throughout the project we followed the CRISP-DM life cycle, which consists of the following phases:

  1. Business Understanding (understanding the final deliverables)
  2. Data Understanding (data availability, feature understanding and selection)
  3. Data Processing (handling null values, molding the data, and merging the datasets)
  4. Modeling (implementing several ML algorithms on the data)
  5. Evaluation (measuring MAE, R², and MSE)
  6. Deployment (implementing a prediction function with the help of ensembling)
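The Evaluation and Deployment phases above can be sketched as follows. This is a minimal illustration under stated assumptions: the README does not say how the ensemble combines models, so simple averaging is assumed here, and the prediction values are made up. The metric functions follow the standard definitions of MAE, MSE, and R²; scikit-learn's `sklearn.metrics` module offers equivalents.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """R^2: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def ensemble_predict(predictions_per_model):
    """Assumed ensembling rule: average the predictions of all models per sample."""
    return [sum(column) / len(column) for column in zip(*predictions_per_model)]


# Hypothetical home values (in thousands of dollars) and model outputs.
actual = [200.0, 250.0, 300.0]
forest_pred = [210.0, 240.0, 310.0]   # stand-in for Random Forest predictions
linear_pred = [190.0, 250.0, 290.0]   # stand-in for Linear Regression predictions

combined = ensemble_predict([forest_pred, linear_pred])
print("MAE:", mean_absolute_error(actual, combined))
print("MSE:", mean_squared_error(actual, combined))
print("R2: ", r2_score(actual, combined))
```

Averaging is only one possible choice; a weighted average or a stacked model would fit the same `ensemble_predict` interface.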