Youtube_scrape

This project has been inactive for more than 2 years. I am archiving it for multiple reasons. However, I will create a separate private repo for it and keep adding new metrics and visualizations, although at a slow pace. If that turns out well, I will merge it with this repo.

Scrape data about an entire Channel or just a Playlist, using Youtube API. No OAuth is required.
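
As a rough illustration of why no OAuth flow is needed, the sketch below builds a plain REST URL for a `channels.list` request that is authenticated only by an API key passed as the `key` query parameter. The helper name `build_channel_url` and the placeholder values are assumptions made for this example; the project itself uses google-api-python-client and may request different `part` values.

```python
from urllib.parse import urlencode

# Base endpoint of the YouTube Data API v3.
API_BASE = "https://www.googleapis.com/youtube/v3"


def build_channel_url(channel_id: str, api_key: str) -> str:
    """Return the URL of a channels.list request authenticated by API key only.

    Hypothetical helper, not taken from the project code.
    """
    params = {
        "part": "snippet,statistics",  # which resource parts to return
        "id": channel_id,              # the channel to look up
        "key": api_key,                # API key; no OAuth token involved
    }
    return f"{API_BASE}/channels?{urlencode(params)}"


# Placeholder values; substitute a real channel ID and your own key.
url = build_channel_url("CHANNEL_ID", "YOUR_API_KEY")
print(url)
```

Opening such a URL with a valid key returns JSON for public data; private data would require OAuth, which this project does not use.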

✔️ Features

The following features are available:

create_new :
1. It creates a SQLite database to store all data.
2. The database is placed in the same folder as the project file and is named 'youtube.db'.
3. It has 4 tables - tb_channel, tb_playlist, tb_videos, video_history.
4. You can use a lightweight program such as DB Browser to view the database.
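
A minimal sketch of what this step could look like with Python's built-in `sqlite3` module is shown here. Only the file name and the four table names come from this README; every column except `Worth` and `Is_Downloaded` (which are mentioned further down) is a hypothetical placeholder, and the function name `create_database` is an assumption, not the project's actual code.

```python
import sqlite3

# Table names are from the README; the column lists are illustrative guesses.
SCHEMA = {
    "tb_channel": "Channel_ID TEXT PRIMARY KEY, Channel_Title TEXT",
    "tb_playlist": "Playlist_ID TEXT PRIMARY KEY, Channel_ID TEXT, "
                   "Playlist_Duration INTEGER",
    "tb_videos": "Video_ID TEXT PRIMARY KEY, Channel_ID TEXT, "
                 "Worth INTEGER DEFAULT 0, Is_Downloaded INTEGER DEFAULT 0",
    "video_history": "Video_ID TEXT, Watched_At TEXT",
}


def create_database(db_path: str = "youtube.db") -> sqlite3.Connection:
    """Create the four tables if they do not exist yet and return the connection."""
    conn = sqlite3.connect(db_path)
    for table, columns in SCHEMA.items():
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    conn.commit()
    return conn


# Use an in-memory database here so the example leaves no file behind.
conn = create_database(":memory:")
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```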
Oldest Video on A Topic :
1. It is a standalone program that can be run independently.
2. It doesn't depend on main code or any database.
Scrape A Channel:
1. Allows you to scrape channel details and its playlists.
2. It can also scrape details for each video of that channel.
  1. If this option is not chosen, the playlist table won't have Playlist Duration.
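
The dependency between video details and Playlist Duration can be pictured as follows: the YouTube API reports each video's length as an ISO 8601 duration string (for example `PT4M13S`), so a playlist total can only be computed once the per-video details have been fetched. The sketch below parses such strings and sums them. It assumes that the playlist duration is a simple sum of video durations; the function names are hypothetical and not taken from the project.

```python
import re

# Matches ISO 8601 durations of the form P#DT#H#M#S, where every part is optional.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(iso_duration: str) -> int:
    """Convert a duration such as 'PT1H2M3S' to a number of seconds."""
    match = _DURATION_RE.match(iso_duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {iso_duration!r}")
    parts = {name: int(value) if value else 0
             for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


def playlist_duration(video_durations) -> int:
    """Sum the durations of all videos in a playlist, in seconds."""
    return sum(duration_to_seconds(d) for d in video_durations)


total = playlist_duration(["PT4M13S", "PT1H"])
print(total)
```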
Scrape A Single Playlist:
1. Allows you to scrape info about a single playlist and details about all its videos.
Load Your History:
1. Make sure you have downloaded the Google Takeout data for your account to the present working directory.
2. Make sure you have the following path: './takeout/history/watch-history.html'
3. You can keep the videos from your history in a separate table or integrate them with the main table tb_videos.
  1. In order to use the next features, you have to integrate them.
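
As an illustration of the loading step, the sketch below pulls video IDs out of the Takeout HTML with a regular expression. It assumes that the history file contains ordinary `https://www.youtube.com/watch?v=...` links and that video IDs are 11 characters long; the function names are hypothetical, and the project may parse the file differently.

```python
import re
from pathlib import Path

# Path taken from the README.
HISTORY_PATH = Path("./takeout/history/watch-history.html")

# Assumption: each watched video appears as a regular watch link.
_WATCH_LINK_RE = re.compile(r"youtube\.com/watch\?v=([A-Za-z0-9_-]{11})")


def extract_video_ids(html_text: str) -> list:
    """Return every video ID found in the HTML, in order, duplicates included."""
    return _WATCH_LINK_RE.findall(html_text)


def load_history(path: Path = HISTORY_PATH) -> list:
    """Read the Takeout file from disk and extract its video IDs."""
    return extract_video_ids(path.read_text(encoding="utf-8"))


# Small inline sample instead of a real Takeout export.
sample = (
    '<a href="https://www.youtube.com/watch?v=abcdefghijk">One</a>'
    '<a href="https://www.youtube.com/watch?v=ABCDEFG_-12">Two</a>'
    '<a href="https://www.youtube.com/watch?v=abcdefghijk">One</a>'
)
ids = extract_video_ids(sample)
print(ids)
```

Duplicates are kept on purpose, since repeated entries are what a watch count would be built from.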
Most Watched Video:
1. You can list your most watched 'n' videos
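
One way such a listing could be produced is a `GROUP BY` query over the history rows, sketched below against an in-memory database. The table layout and the column names `Video_ID` and `Watched_At` are hypothetical, and the function `most_watched` is an assumption for this example rather than the project's implementation.

```python
import sqlite3


def most_watched(conn: sqlite3.Connection, n: int) -> list:
    """Return the n most watched videos as (video_id, times_watched) tuples."""
    query = """
        SELECT Video_ID, COUNT(*) AS times_watched
        FROM video_history
        GROUP BY Video_ID
        ORDER BY times_watched DESC, Video_ID ASC
        LIMIT ?
    """
    return conn.execute(query, (n,)).fetchall()


# Build a tiny in-memory history table with hypothetical columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE video_history (Video_ID TEXT, Watched_At TEXT)")
conn.executemany(
    "INSERT INTO video_history VALUES (?, ?)",
    [("aaa", "t1"), ("bbb", "t2"), ("aaa", "t3"),
     ("ccc", "t4"), ("aaa", "t5"), ("bbb", "t6")],
)
top_two = most_watched(conn, 2)
print(top_two)
```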
Early Viewed:
1. You can list the 'n' videos that you watched soonest after they were uploaded.
2. There are some discrepancies, as many videos are re-uploaded after you have seen them.
  1. The program ignores those.
3. It currently only works if you watched the videos in IST.
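
The time zone restriction can be illustrated with a small sketch: upload times from the API are in UTC, while watch times from the history are local, so the two must be brought onto the same clock before subtracting. The code below assumes that IST means Indian Standard Time (UTC+5:30) and that watch times are available as `YYYY-MM-DD HH:MM:SS` strings; both are assumptions, as are the function names. Videos with a negative delay (watched before the recorded upload, as with re-uploads) are skipped.

```python
from datetime import datetime, timedelta, timezone

# Assumption: IST is Indian Standard Time, i.e. UTC+5:30.
IST = timezone(timedelta(hours=5, minutes=30))


def watch_delay(published_at_utc: str, watched_at_ist: str) -> timedelta:
    """Time between upload (UTC, e.g. '2020-01-01T00:00:00Z') and watch (IST)."""
    published = datetime.strptime(
        published_at_utc, "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    watched = datetime.strptime(
        watched_at_ist, "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=IST)
    return watched - published


def early_viewed(rows, n: int) -> list:
    """Return the n videos with the smallest non-negative upload-to-watch delay."""
    delays = []
    for video_id, published, watched in rows:
        delay = watch_delay(published, watched)
        if delay < timedelta(0):
            continue  # watched before upload: treated as a re-upload and ignored
        delays.append((delay, video_id))
    return [(video_id, delay) for delay, video_id in sorted(delays)[:n]]


rows = [
    ("a", "2020-01-01T00:00:00Z", "2020-01-01 06:30:00"),  # 1 hour after upload
    ("b", "2020-01-01T00:00:00Z", "2020-01-01 05:00:00"),  # before upload, skipped
    ("c", "2020-01-01T00:00:00Z", "2020-01-01 05:45:00"),  # 15 minutes after upload
]
earliest = early_viewed(rows, 2)
print(earliest)
```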
Generate Download List:
1. This creates a text file listing Youtube URLs that can be downloaded with Youtube-DL, IDM, etc.
2. It selects videos that are marked 'Worth = 1' in the database.
  1. This marking has to be done by the user directly on the database (using DB Browser or similar).
3. There is an option to list videos from a single channel or from the entire database.
4. Caution: Once a video is processed by this function, it is marked 'Is_Downloaded = 1'. The next time this function is run, only new video IDs are considered.
  1. Hence, make sure all videos in download_list.txt have been downloaded before the file is rewritten.
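
The select-then-mark behaviour described in the caution can be sketched as below. The flags `Worth` and `Is_Downloaded`, the table `tb_videos`, and the file name `download_list.txt` come from this README; the `Video_ID` column and both function names are hypothetical. This is an illustration of the idea, not the project's code.

```python
import sqlite3


def select_download_urls(conn: sqlite3.Connection) -> list:
    """Return URLs of videos with Worth = 1 not yet processed, then mark them."""
    rows = conn.execute(
        "SELECT Video_ID FROM tb_videos "
        "WHERE Worth = 1 AND Is_Downloaded = 0 ORDER BY Video_ID"
    ).fetchall()
    video_ids = [row[0] for row in rows]
    # Mark the selected videos so the next run only picks up new ones.
    conn.executemany(
        "UPDATE tb_videos SET Is_Downloaded = 1 WHERE Video_ID = ?",
        [(video_id,) for video_id in video_ids],
    )
    conn.commit()
    return [f"https://www.youtube.com/watch?v={vid}" for vid in video_ids]


def write_download_list(urls, path: str = "download_list.txt") -> None:
    """Overwrite the list file with one URL per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.writelines(url + "\n" for url in urls)


# In-memory demonstration with a hypothetical minimal table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_videos (Video_ID TEXT, Worth INTEGER, Is_Downloaded INTEGER)"
)
conn.executemany(
    "INSERT INTO tb_videos VALUES (?, ?, ?)",
    [("vid1", 1, 0), ("vid2", 0, 0), ("vid3", 1, 1), ("vid4", 1, 0)],
)
first_run = select_download_urls(conn)
second_run = select_download_urls(conn)
print(first_run, second_run)
```

The second run returns nothing because the first run already marked its videos, which is why the previous list should be fully downloaded before it is overwritten.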

💻 Setup Guide

Below is a detailed guide on setting up the environment.

Youtube API

First, you need to have your Youtube API key. Below is a link to a video that will guide you. Watch from 0:00 to 5:30.

Note - The Youtube API has a default quota of 10,000 units per day.
You can view your quotas here - console
The cost of each operation is described here - Youtube API docs
The code has been optimized to decrease quota usage. You can easily work with 50000 videos/day. For more, please check your quota limit.
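
To make the quota arithmetic concrete: a `videos.list` call costs 1 quota unit and accepts up to 50 video IDs at once, so batching IDs keeps the cost low. The sketch below estimates the units needed for a given number of videos. Whether the project batches requests exactly this way is an assumption, the estimate covers only `videos.list` calls (listing playlist items adds further units), and the function names are hypothetical.

```python
from math import ceil

DAILY_QUOTA_UNITS = 10_000  # default daily quota
MAX_IDS_PER_CALL = 50       # videos.list accepts up to 50 IDs per request
VIDEOS_LIST_COST = 1        # quota units per videos.list call


def chunked(video_ids, size: int = MAX_IDS_PER_CALL):
    """Yield successive batches of at most `size` video IDs."""
    for start in range(0, len(video_ids), size):
        yield video_ids[start:start + size]


def estimated_units(n_videos: int) -> int:
    """Quota units needed to fetch details for n_videos in batched calls."""
    return ceil(n_videos / MAX_IDS_PER_CALL) * VIDEOS_LIST_COST


units = estimated_units(50_000)
batches = list(chunked([f"id{i}" for i in range(120)]))
print(units, [len(batch) for batch in batches])
```

Under these assumptions, 50000 videos need about 1000 units for the detail lookups, well inside the default daily quota.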

Installation

You need to install google-api-python-client to run this project (github API link). Install this library in a virtualenv using pip.

Mac/Linux

pip3 install virtualenv
virtualenv venv
. venv/bin/activate
pip3 install -r requirements.txt

Windows

pip3 install virtualenv
virtualenv venv
venv\Scripts\activate
pip3 install -r requirements.txt

Working Guide

Get your Youtube API key as shown in the video above.
Install the dependencies from requirements.txt using pip.
Run the program YT_Scrape.py.

The script will ask for the required data on the command line and is fairly self-explanatory once it runs.

View Samples

♥️ Contributing

There are several ways to help.

Spread the word: More users means more people testing and contributing to the app, which in turn means better stability and possibly more and better features. You can share it on LinkedIn. Every little bit helps!
Make a feature or improvement request: Can something be done better? Is something essential missing? Let us know!
Report bugs
Contribute: You don't have to be a programmer to help.
1. Treat me to a coffee instead: Paypal

Pull Requests

Pull requests are of course very welcome! Please make sure to also include the issue number in your commit message, if you're fixing a particular issue (e.g.: feat: add nice feature with the number #31).
