Author: Vivek Pandey (EJ) Contact: anton.503.overload@gmail.com
LilHomie is an early-stage project designed to predict the value of houses in the New York Tri-State Area (NY, NJ, CT). It brings together tools like web scraping, data science, machine learning, and web development to create a system that can estimate housing prices automatically.
This is a rapid prototype, meaning it’s a rough but working version built quickly to test ideas.
This repo contains all the code and tools used in the project:
- 🕷️ Web Crawler: A custom-built tool that collects housing data from the internet (specifically from Trulia).
- 📊 Notebooks: Jupyter notebooks that:
  - Clean and prepare the data
  - Analyze trends
  - Train machine learning models to predict house prices
- 🧠 Machine Learning Models: Trained models that are saved and ready to use for making predictions.
- 🌐 Serverless API: A lightweight backend system that serves predictions from the ML models on-demand.
- 💻 Web App: A basic web interface where users can input property details and get a price estimate.
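
To illustrate how the Serverless API and Web App might fit together, here is a minimal sketch of a prediction endpoint. It is not the project's actual code. The Lambda-style `handler(event, context)` signature, the feature names (`sqft`, `bedrooms`, `bathrooms`), and the fixed coefficients are all assumptions made for illustration; the real API would load one of the saved models instead of the stand-in formula.

```python
import json

# Hypothetical inputs and coefficients, for illustration only.
# The real service would load a trained model rather than use fixed numbers.
FEATURES = ("sqft", "bedrooms", "bathrooms")
COEFFICIENTS = {"sqft": 300.0, "bedrooms": 10000.0, "bathrooms": 15000.0}
INTERCEPT = 50000.0


def predict_price(details):
    """Stand-in for a trained model: a fixed linear formula over the inputs."""
    return INTERCEPT + sum(COEFFICIENTS[name] * float(details[name]) for name in FEATURES)


def _response(status, payload):
    """Build an HTTP-style response with a JSON body."""
    return {"statusCode": status, "body": json.dumps(payload)}


def handler(event, context=None):
    """Lambda-style entry point (assumed): parse, validate, predict, respond."""
    try:
        details = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "body is not valid JSON"})
    if not isinstance(details, dict):
        return _response(400, {"error": "body must be a JSON object"})
    missing = [name for name in FEATURES if name not in details]
    if missing:
        return _response(400, {"error": "missing fields", "fields": missing})
    try:
        price = predict_price(details)
    except (TypeError, ValueError):
        return _response(400, {"error": "fields must be numeric"})
    return _response(200, {"estimated_price": round(price, 2)})


if __name__ == "__main__":
    request = {"body": json.dumps({"sqft": 1000, "bedrooms": 2, "bathrooms": 1})}
    print(handler(request))
```

The web app would send the property details as a JSON body and display the `estimated_price` field from the response.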
Here are some improvements planned for the next versions:
- 🏗️ Add support for 3 more page formats on Trulia
- 🏡 Add support to scrape housing data from Zillow
- ⚡ Make the web crawler faster using distributed crawling (parallel spiders)
- 🌎 Expand predictions beyond the NY tri-state area to cover the entire US (after improving the crawler)
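
As a rough idea of what the planned crawler speed-up could look like, the sketch below fetches several pages concurrently. It uses a single-machine thread pool as a simple stand-in for distributed spiders, not a distributed setup. The `fetch_listing` function and the example URLs are hypothetical placeholders; no real requests are made.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_listing(url):
    """Placeholder for a real page fetch and parse; returns a fake record."""
    return {"url": url, "status": "fetched"}


def crawl_parallel(urls, max_workers=4):
    """Fetch many listing pages concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_listing, urls))


if __name__ == "__main__":
    pages = [f"https://example.com/listing/{i}" for i in range(3)]
    for record in crawl_parallel(pages):
        print(record["url"], record["status"])
```

Threads suit this kind of work because crawling is mostly waiting on the network; spreading spiders across machines would additionally need a shared queue of URLs, which is outside this sketch.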
Feel free to reach out at: anton.503.overload@gmail.com