A simple data mining Python notebook for the 2019 AirBnB dataset

Nikoletos-K/AirBnB-Data-Analysis

AirBnB Data Analysis

✔️ Open this notebook with the Jupyter online viewer, nbviewer ✔️

Language: Python > 3.0

Data: The data for this project come from AirBnB. Specifically, a directory named data contains the subdirectories April, March and February. These hold information from the AirBnB platform such as id, zipcode, transit, Bedrooms, Beds, Review_scores_rating, Number_of_reviews, Neighbourhood, etc.

This project consists of 2 main queries.

1st Query | Data exploration

Uses simple, common functions such as groupby, sum and count, and plots the results in a variety of charts in order to make assumptions and draw conclusions. Query 1 also consists of 11 subqueries, which are written as comments in the notebook.
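As an illustration of this kind of exploration, here is a minimal sketch using pandas. The toy table, the lower-cased column names and the neighbourhood labels A, B, C are assumptions made for the example; they are not taken from the notebook or the real files.

```python
import pandas as pd

# Hypothetical toy listings. The column names echo those mentioned in the
# README (id, Neighbourhood, Beds, Number_of_reviews), lower-cased here as
# an assumption about how the real files look.
listings = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "neighbourhood": ["A", "A", "B", "B", "C"],
    "beds": [2, 1, 3, 2, 1],
    "number_of_reviews": [10, 4, 25, 0, 7],
})

# Group by neighbourhood, then count listings, sum reviews and average beds.
per_neighbourhood = (
    listings.groupby("neighbourhood")
    .agg(
        n_listings=("id", "count"),
        total_reviews=("number_of_reviews", "sum"),
        mean_beds=("beds", "mean"),
    )
    .sort_values("total_reviews", ascending=False)
)

print(per_neighbourhood)
```

With matplotlib installed, a call such as `per_neighbourhood["total_reviews"].plot(kind="bar")` would turn the same summary into a bar chart.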

2nd Query | Recommendation System

Experiments with vectorization (TF-IDF, BoW) and cosine similarity. The columns id, name and description are isolated from the given AirBnB files and their text is processed. More specifically, Q.2 consists of:

Subqueries:

  1. TF-IDF, BoW vectorization: vectorization of the three isolated columns
  2. Cosine similarity: calculating the similarity between every element of the TF-IDF matrix and all the others
  3. A function that finds and returns the most similar entries given an id
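The three steps above can be sketched roughly as follows with scikit-learn. This is not the notebook's code: the sample listings, the choice to join name and description into a single text field, the English stop-word filter and the function name `recommend` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical sample with the three isolated columns: id, name, description.
listings = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "name": [
        "Sunny flat near metro",
        "Quiet studio with garden",
        "Sunny apartment close to metro",
        "Garden house",
    ],
    "description": [
        "Bright sunny flat two minutes from the metro station",
        "Small quiet studio with a private garden",
        "Sunny apartment with balcony near the metro station",
        "Family house with a large garden and parking",
    ],
})

# Assumption: name and description are joined into one text per listing.
text = (listings["name"] + " " + listings["description"]).str.lower()

# Step 1: TF-IDF and bag-of-words (BoW) vectorization of the text.
tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(text)
bow_matrix = CountVectorizer(stop_words="english").fit_transform(text)

# Step 2: cosine similarity between every TF-IDF row and all the others.
similarity = cosine_similarity(tfidf_matrix)


def recommend(listing_id, top_n=2):
    """Return (id, score) pairs for the top_n listings most similar to listing_id."""
    ids = listings["id"].to_numpy()
    matches = np.flatnonzero(ids == listing_id)
    if matches.size == 0:
        raise KeyError(f"unknown listing id: {listing_id}")
    row = matches[0]
    # Sort by descending similarity and drop the listing itself.
    order = np.argsort(-similarity[row])
    order = order[order != row][:top_n]
    return [(int(ids[i]), float(similarity[row, i])) for i in order]


# Step 3: look up the entries most similar to a given id.
print(recommend(101))
```

The similarity matrix here is built from the TF-IDF vectors; passing `bow_matrix` to `cosine_similarity` instead would give the bag-of-words variant for comparison.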

© Konstantinos Nikoletos & Myrto Iglezou
