MOVIE REVIEW SCRAPING

AIM

To extract movie reviews from IMDb.

DESCRIPTION

This Python script extracts movie reviews from IMDb. It uses the requests and bs4 packages to fetch and parse the data.

PURPOSE

In this project you’ll learn about HTTP requests and how to send them with the requests package, and how to extract the data you need from HTML pages using a few simple functions of the beautifulsoup module. Sentiment analysis is a very popular task in machine learning, so I have written a Python script that collects the data for you to use in this and other NLP tasks.
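The two ideas above, sending an HTTP request and pulling elements out of the returned HTML, can be sketched as follows. This is a minimal illustration and not the code in scrapy_data.py: the function names, the User-Agent header, and the class names "review", "rating", and "text" are hypothetical placeholders. IMDb's real markup differs and changes over time, so the selectors would need to be adapted to the live page.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Send a GET request and return the page body as text."""
    # Assumption: a browser-like User-Agent, since some sites reject the default one.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response.text


def parse_reviews(html):
    """Collect rating and review text from each review block.

    The tag and class names here are hypothetical, not IMDb's actual markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    reviews = []
    for block in soup.find_all("div", class_="review"):
        rating = block.find("span", class_="rating")
        text = block.find("div", class_="text")
        reviews.append({
            "rating": rating.get_text(strip=True) if rating else None,
            "review": text.get_text(strip=True) if text else "",
        })
    return reviews


# Inline sample so the parsing step can be tried without any network access.
SAMPLE_HTML = """
<div class="review"><span class="rating">8</span><div class="text">Great pacing.</div></div>
<div class="review"><div class="text">No rating given.</div></div>
"""

if __name__ == "__main__":
    for row in parse_reviews(SAMPLE_HTML):
        print(row)
```

Keeping the request and the parsing in separate functions lets the parsing logic be checked against saved HTML, which helps because some reviews carry no rating and a failed lookup should not stop the run.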

PACKAGES USED

What each package does in this project:

  • requests - Sends HTTP requests to IMDb and receives the responses in order to fetch the data.
  • bs4 - Extracts HTML elements from the fetched pages.
  • json - Used as a helper to save the list of movies and their links.
  • pandas - Creates dataframes and stores them in .csv format.

WORKFLOW

  • Import the packages listed above.
  • Extract the movies and their links.
  • Extract the reviews along with their ratings.
  • Save the data in .csv format.
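The saving steps of this workflow could look roughly like the sketch below. It assumes the movie list is a dictionary mapping titles to links and that each review is a dictionary with "movie", "rating", and "review" keys. Those shapes, the function names, the file names, and the sample data are illustrative assumptions, not taken from scrapy_data.py.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def save_movie_links(movies, path):
    """Write a {title: link} mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(movies, fh, indent=2, ensure_ascii=False)


def load_movie_links(path):
    """Read the {title: link} mapping back from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_reviews_csv(rows, path):
    """Store review rows in a CSV file and return the DataFrame that was written."""
    frame = pd.DataFrame(rows, columns=["movie", "rating", "review"])
    frame.to_csv(path, index=False, encoding="utf-8")
    return frame


if __name__ == "__main__":
    # Placeholder data: the title and link are made up for illustration.
    movies = {"Example Movie": "https://example.com/title/1"}
    rows = [
        {"movie": "Example Movie", "rating": 8, "review": "Great pacing."},
        {"movie": "Example Movie", "rating": 3, "review": "Too long."},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        save_movie_links(movies, Path(tmp) / "movies.json")
        frame = save_reviews_csv(rows, Path(tmp) / "reviews.csv")
        print(load_movie_links(Path(tmp) / "movies.json"))
        print(frame)
```

Saving the movie links to JSON first means the review extraction can be rerun later without collecting the links again.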

SETUP PACKAGES

  • pip install requests
  • pip install pandas
  • pip install bs4
  • json is part of the Python standard library, so it does not need to be installed.
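Before running the script, it can help to confirm that the packages are importable in the current environment. The snippet below is an optional sketch that uses only the standard library. The helper name missing_packages and the REQUIRED list are illustrative and not part of the project.

```python
import importlib.util


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the project; json ships with Python itself.
REQUIRED = ["requests", "bs4", "pandas", "json"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages are available.")
```

Any name reported as missing can then be installed with the matching pip command.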

COMPILATION STEPS

Open a terminal.

Run the command: python3 scrapy_data.py

The script will do the rest.

SOURCE

IMDb

OUTPUT

Screenshots: VS Code terminal and script output.

AUTHOR


NAME : DEV KUMAR

GitHub : devkumar24