This web scraping project extracts real estate property data for a given location using the API endpoint method.
All the retrieved data is exported to an Excel workbook and inserted into a previously created database.
The API endpoint method makes a request directly to the endpoint and receives the JSON data the server sends back.
I used Insomnia to inspect the query parameters needed for the request and to preview the response.
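As an illustration, a request of this kind can be reproduced with the requests library. The endpoint URL and parameter names below are placeholders, not the project's real values (those are the ones inspected with Insomnia):

```python
import requests

# Hypothetical endpoint and query parameters, for illustration only.
BASE_URL = "https://www.example-realestate.com/api/search"
params = {
    "locationId": "0-EU-ES-28",  # example location identifier
    "operation": "sale",
    "page": 1,
}

response = requests.get(BASE_URL, params=params, timeout=10)
response.raise_for_status()
data = response.json()  # the server answers with JSON
print(list(data))        # top-level keys of the response
```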
The project is divided into the following files:
- `API_endpoint_method.py`: the main file. All the query parameters of the chosen locations are stored here; to scrape another location, its parameters have to be added (a sketch of the idea follows).
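A minimal sketch of how per-location query parameters could be stored in the main file; the location names and parameter keys are hypothetical, not the project's actual structure:

```python
# Hypothetical mapping of location name -> query parameters.
LOCATIONS = {
    "madrid": {"locationId": "0-EU-ES-28", "operation": "sale"},
    "barcelona": {"locationId": "0-EU-ES-08", "operation": "sale"},
}

# Scraping a new location is just a matter of adding another entry:
LOCATIONS["valencia"] = {"locationId": "0-EU-ES-46", "operation": "sale"}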
- Makes a GET request for the given location (using its query parameters) and returns a JSON object for each result page, as sketched below.
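A possible shape for that paginated request; the `page` parameter name and the fixed page limit are assumptions about the target API:

```python
import requests


def fetch_pages(base_url, params, max_pages=5):
    """Yield the JSON body of each result page for the given location.

    base_url, params and the 'page' key are placeholders; the real names
    depend on the API being scraped.
    """
    for page in range(1, max_pages + 1):
        page_params = {**params, "page": page}
        response = requests.get(base_url, params=page_params, timeout=10)
        response.raise_for_status()
        yield response.json()
```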
- Consists of two methods, sketched below:
  - open_json: opens and loads a JSON file from the raw response exported from Insomnia instead of making a request. It exists for testing purposes, to avoid bombarding the site with requests.
  - open_json_request: makes the request and loads the JSON response.
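A minimal sketch of what these two methods might look like; the exact signatures and parameter names are assumptions, not the project's code:

```python
import json

import requests


def open_json(path):
    """Load a JSON file previously exported from Insomnia (offline testing)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def open_json_request(url, params):
    """Make the request and parse the JSON body of the response."""
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response.json()
```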
- Extracts the following fields from the JSON data (see the sketch after this list):
  - URL of the property
  - Mobile phone number
  - Real estate agency, if applicable
  - Type ID (to know whether an agency is linked)
  - Date
  - Real estate ID
  - Price
  - Transaction type ID (to know whether the property is for sale or for rent)
  - Location
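A sketch of such an extraction step; the JSON key names used below (`elementList`, `url`, `phone`, and so on) are placeholders, since the real response structure is not shown here:

```python
def extract_properties(json_data):
    """Pull the listed fields out of one page of JSON results.

    The key names are illustrative; the real ones depend on the API response.
    """
    rows = []
    for item in json_data.get("elementList", []):
        rows.append({
            "url": item.get("url"),
            "mobile": item.get("phone"),
            "agency": item.get("agencyName"),
            "type_id": item.get("typeId"),
            "date": item.get("date"),
            "real_estate_id": item.get("realEstateId"),
            "price": item.get("price"),
            "transaction_type_id": item.get("transactionTypeId"),
            "location": item.get("location"),
        })
    return rows
```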
- Consists of two export methods, sketched below:
  - export_csv: exports the retrieved data to a .csv file.
  - export_excel: exports the retrieved data to a .xlsx file.
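A sketch of the two export methods, assuming pandas is used to write the files (the project may equally rely on the csv module and openpyxl directly):

```python
import pandas as pd  # writing .xlsx also needs openpyxl installed


def export_csv(rows, path="properties.csv"):
    """Write the extracted rows to a .csv file."""
    pd.DataFrame(rows).to_csv(path, index=False)


def export_excel(rows, path="properties.xlsx"):
    """Write the extracted rows to a .xlsx workbook."""
    pd.DataFrame(rows).to_excel(path, index=False)
```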
- Creates a connection to the SQLite database specified by a database file, creates the table if it does not already exist, and inserts the extracted properties (see the sketch below).
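A sketch of that database step using Python's built-in sqlite3 module; the table and column names are illustrative only:

```python
import sqlite3


def save_to_db(rows, db_file="properties.db"):
    """Create the table if needed and insert the extracted rows.

    The schema below is a placeholder, not the project's actual one.
    """
    conn = sqlite3.connect(db_file)
    with conn:  # commits on success, rolls back on error
        conn.execute("""
            CREATE TABLE IF NOT EXISTS properties (
                real_estate_id TEXT,
                url TEXT,
                price REAL,
                location TEXT
            )
        """)
        conn.executemany(
            "INSERT INTO properties VALUES (?, ?, ?, ?)",
            [(r["real_estate_id"], r["url"], r["price"], r["location"])
             for r in rows],
        )
    conn.close()
```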
To run the project:
- Go to the project folder
`cd WebScraping`
- Install the dependencies
`pip install -r requirements.txt`
- Run the script
`python3 API_endpoint_method.py`