Extracting Movie Reviews from IMDb
This Python script extracts movie reviews from IMDb. It uses the requests and bs4 packages to fetch and parse the data.
In this project you'll learn about HTTP requests and how to send them with the requests package. You'll also learn how to extract the data you need from HTML pages using a few simple functions from the beautifulsoup module. Sentiment analysis is a very popular task in machine learning, so I wrote this Python script to collect the data for you, which you can then use for this and similar NLP tasks.
The purpose of each package in this project
- requests: sends HTTP requests and receives the responses in order to fetch data from IMDb.
- bs4: extracts HTML elements from the fetched pages.
- json: a helper used to save the list of movies and their links.
- pandas: creates dataframes and stores them in .csv format.
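As a rough illustration of how the first three packages fit together, the sketch below fetches a page with requests, pulls movie titles and links out of the HTML with bs4, and saves them with json. The function names, the sample HTML, the "movie" class, and the file path argument are assumptions made for this example. They are not taken from scrapy_data.py, and IMDb's real markup will differ.

```python
import json

import requests
from bs4 import BeautifulSoup

# Hypothetical sample markup; IMDb's real pages use different tags and classes.
SAMPLE_HTML = """
<ul>
  <li class="movie"><a href="/title/tt0000001/">First Movie</a></li>
  <li class="movie"><a href="/title/tt0000002/">Second Movie</a></li>
</ul>
"""


def fetch_html(url):
    """Send a GET request and return the page body as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text


def extract_movies(html):
    """Return a list of {"title", "link"} dicts found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    movies = []
    for item in soup.find_all("li", class_="movie"):
        anchor = item.find("a")
        if anchor is None:
            continue  # skip list items without a link
        movies.append({
            "title": anchor.get_text(strip=True),
            "link": anchor.get("href", ""),
        })
    return movies


def save_movies(movies, path):
    """Write the movie list to a JSON file at the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(movies, handle, indent=2)


# Parse the sample markup; with a real page, pass fetch_html(url) instead.
movies = extract_movies(SAMPLE_HTML)
```

Keeping the fetch step separate from the parsing step means the parser can be tried on a saved HTML snippet without sending any requests.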
- Import the packages mentioned above.
- Extract the movies and their links.
- Extract the reviews along with their ratings.
- Save the data in .csv format.
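The last two steps can be sketched roughly as follows: review text and ratings are pulled from the HTML, collected into a pandas dataframe, and written out as CSV. The sample HTML, the class names ("review", "rating", "text"), and the column names are assumptions for this example, not the selectors the actual script uses.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical review markup; the real IMDb page structure will differ.
SAMPLE_REVIEWS_HTML = """
<div class="review"><span class="rating">8</span><p class="text">Great pacing and a strong cast.</p></div>
<div class="review"><span class="rating">3</span><p class="text">Too long and predictable.</p></div>
"""


def extract_reviews(html):
    """Return one {"review", "rating"} dict per review block in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="review"):
        rating = block.find("span", class_="rating")
        text = block.find("p", class_="text")
        rows.append({
            "review": text.get_text(strip=True) if text else "",
            # Some reviews may carry no rating, so fall back to None.
            "rating": int(rating.get_text(strip=True)) if rating else None,
        })
    return rows


reviews = extract_reviews(SAMPLE_REVIEWS_HTML)
frame = pd.DataFrame(reviews, columns=["review", "rating"])

# With no path argument, to_csv returns the CSV as a string.
# Pass a file name (for example "reviews.csv", a hypothetical name) to write to disk.
csv_text = frame.to_csv(index=False)
```

Passing index=False leaves the dataframe's row index out of the file, so the CSV contains only the review and rating columns.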
pip install requests
pip install pandas
pip install bs4
The json module is part of the Python standard library, so it does not need to be installed with pip.
Open a terminal and run the command:
python3 scrapy_data.py
The script will do the rest of the work.