NOTE: This scraper is meant for educational use only. For commercial use, please use the YouTube Data API.
Web scraping is a core technique in data mining and data visualization. With the web full of publicly available data, there is a need to extract that data in a presentable form; that is where scraping comes in. This repo provides a sample web scraper for YouTube pages, written in Scrapy. Given a YouTube video page, the scraper yields the following information (see the item sketch after this list):
- Video Title
- Views
- Likes
- Dislikes
- Publish Date
- Channel Name
- Channel Subscriber count
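These fields map naturally onto a Scrapy Item. Below is a minimal sketch of what such a definition might look like; the field names are illustrative and not necessarily the ones used in this repo's YoutubeScraper/items.py:

```python
# items.py -- illustrative item definition; the repo's actual
# field names may differ
import scrapy

class YoutubeVideoItem(scrapy.Item):
    title = scrapy.Field()             # Video Title
    views = scrapy.Field()             # Views
    likes = scrapy.Field()             # Likes
    dislikes = scrapy.Field()          # Dislikes
    publish_date = scrapy.Field()      # Publish Date
    channel_name = scrapy.Field()      # Channel Name
    subscriber_count = scrapy.Field()  # Channel Subscriber count
```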
After running the scraper, a CSV file containing all the data is generated. Besides the CSV, a text file with the extracted YouTube video links is created. The scraper also falls back to default values for fields that cannot be located on some pages.
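That fallback behavior is typically implemented with Scrapy's `extract_first(default=...)`. A hedged sketch follows; the CSS selectors are placeholders, since YouTube's markup changes over time and the repo's actual selectors may differ:

```python
def parse(self, response):
    # extract_first() returns the default when the selector matches
    # nothing, so a missing field never breaks the CSV row
    yield {
        'title': response.css('h1.title::text').extract_first(default='N/A'),
        'views': response.css('.view-count::text').extract_first(default='0'),
    }
```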
Install Python 2.7 and Scrapy. Then download the repo and open a terminal in its root folder (the one containing scrapy.cfg and the YoutubeScraper folder). Input can be supplied in either of the following two ways:
- input.txt
- command line
For input.txt, add all the URLs to the text file, one URL per line. Enter the command "scrapy crawl YoutubeSpider" and all the data will be extracted and saved to the data.csv file.
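Under the hood, a spider can feed itself from input.txt by overriding start_requests. A minimal sketch of that pattern (not necessarily the exact code in this repo):

```python
import scrapy

class YoutubeSpider(scrapy.Spider):
    name = 'YoutubeSpider'

    def start_requests(self):
        # one URL per line; blank lines are skipped
        with open('input.txt') as f:
            for line in f:
                url = line.strip()
                if url:
                    yield scrapy.Request(url, callback=self.parse)
```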
For the command line, use "scrapy crawl YoutubeSpider -a url=ANY_YOUTUBE_URL". An example of CLI usage: "scrapy crawl YoutubeSpider -a url=https://www.youtube.com/watch?v=yy76z4u6WqE"
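The -a flag passes url as a keyword argument to the spider's constructor. A sketch of how a spider typically consumes it (illustrative; the repo's own handling may differ):

```python
import scrapy

class YoutubeSpider(scrapy.Spider):
    name = 'YoutubeSpider'

    def __init__(self, url=None, *args, **kwargs):
        super(YoutubeSpider, self).__init__(*args, **kwargs)
        # a URL given via "-a url=..." takes precedence; otherwise the
        # spider can fall back to reading input.txt in start_requests
        if url:
            self.start_urls = [url]
```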
The scraper also provides a login utility to extract links based on the account's query history. For now, the login credentials need to be hardcoded in the spider file.
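Form-based logins in Scrapy are usually done with FormRequest.from_response. The sketch below only illustrates that general pattern with hardcoded placeholder credentials; Google's actual sign-in flow is multi-step, and the login URL and form field names here are assumptions rather than the repo's verified values:

```python
import scrapy
from scrapy.http import FormRequest

class YoutubeSpider(scrapy.Spider):
    name = 'YoutubeSpider'
    # hardcoded credentials, as described above (placeholders only)
    login_email = 'user@example.com'
    login_password = 'changeme'

    def start_requests(self):
        # assumed sign-in entry point; the real flow may redirect further
        yield scrapy.Request('https://accounts.google.com/ServiceLogin',
                             callback=self.login)

    def login(self, response):
        # the field names 'Email'/'Passwd' are assumptions about the form
        return FormRequest.from_response(
            response,
            formdata={'Email': self.login_email,
                      'Passwd': self.login_password},
            callback=self.after_login)

    def after_login(self, response):
        # continue crawling history pages once authenticated
        pass
```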