This simple project scrapes product listing data from e-commerce sites and visualises it using Tableau.
Web scraping is the process of collecting information from a website by downloading its HTML content and then parsing it to extract the data of interest. Its main use cases include information gathering and market research. Web scraping is used by individuals and businesses who want to use publicly available web data to make better-informed decisions.
- Import the selenium, pandas, time, and re modules.
- Define the path of the chromedriver.exe
- Open Chrome (or another browser) using the webdriver imported from Selenium.
- Go to the webpage https://www.lazada.com.my/shop-mobiles/apple/
- Find the xpath of the title and price of each listing. Also, find the xpath of the "next page" button.
- Loop the web scraping process based on the total number of pages displayed on the website.
- Convert the newly created list of listing titles and prices into a DataFrame using pandas.
- Define a function to remove emojis from the listing titles and apply it to the DataFrame.
- Export the DataFrame to a CSV file.
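
The clean-up step in the list above could look roughly like the sketch below. It is not the project's actual code: the Unicode ranges are an assumption that covers the most common emoji blocks rather than every emoji, and `parse_price` is a hypothetical helper added here on the assumption that scraped prices arrive as text such as `RM4,699.00`.

```python
import re

# Assumed set of Unicode ranges covering the most common emoji.
# This is an approximation, not an exhaustive emoji definition.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters used for flags
    "\U00002600-\U000027BF"  # miscellaneous symbols and dingbats
    "\U0000FE0F"             # variation selector-16
    "\U0000200D"             # zero-width joiner
    "]+"
)


def remove_emojis(title: str) -> str:
    """Strip emoji characters and collapse any leftover whitespace."""
    cleaned = EMOJI_PATTERN.sub("", title)
    return " ".join(cleaned.split())


def parse_price(text: str) -> float:
    """Hypothetical helper: turn a price string such as 'RM4,699.00' into a float."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group().replace(",", ""))
```

With pandas, a function like this would typically be applied to a column with `Series.apply`, for example `df["title"].apply(remove_emojis)`, where `df` and the `title` column name are placeholders rather than names taken from the project.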
Click Here To View The Dashboard
- There are a total of 124 Apple smartphone listings on Lazada.
- The iPhone 14 has the most listings (35).
- The cheapest Apple phone listed is "Iphone 5S - 16GB (Demo)" and the most expensive is "Apple Iphone 13 Pro Max 1 TB".
- The median price for the iPhone 14, regardless of storage capacity, is RM 4,699. The lowest listed price for this model is RM 3,899 and the highest is RM 5,799.
- The median price for the iPhone 13, regardless of storage capacity, is RM 3,899. The lowest listed price for this model is RM 3,199 and the highest is RM 7,269.
- There is only one iPhone XS listing and one iPhone 5S listing on Lazada.
- Two iPhone 12 listings fall outside the whiskers of the boxplot. This marks them as outliers, and they are highly likely to be mispriced.
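
The outlier observation can be reproduced outside Tableau with a short check. This is a minimal sketch that assumes the boxplot uses the common convention of whiskers reaching to 1.5 times the interquartile range (IQR) beyond the quartiles. The prices in `sample_prices` are made-up illustrative values, not data from the scrape, and quartile values can differ slightly between tools depending on the interpolation method.

```python
from statistics import quantiles


def whisker_bounds(prices):
    """Return (lower, upper) limits at 1.5 * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(prices):
    """Return the prices that fall outside the whisker limits, in input order."""
    low, high = whisker_bounds(prices)
    return [p for p in prices if p < low or p > high]


# Hypothetical prices in RM, for illustration only (not scraped data).
sample_prices = [2999, 3099, 3199, 3299, 3399, 3499, 899, 6999]
print(find_outliers(sample_prices))
```

Listings flagged this way are only candidates for review. A price far from the rest can also be legitimate, for example a bundle or a different storage size.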


