Commit 16aaa9c

Merge pull request prathimacode-hub#985 from BMaster123/main
Added YouTube Trending Videos Scraper
2 parents 7f17677 + 80d06bb commit 16aaa9c

4 files changed: +91 -0 lines changed

@@ -0,0 +1,43 @@
# YouTube Trending Videos Scraper

## Aim

The aim of this project is to build a web scraper that collects video titles from YouTube's trending pages.

## Purpose

The purpose of this project is to get the top 10 trending videos in each of YouTube's trending categories: Now, Music, Gaming, and Movies.

## Short description of package/script

- The program opens Chrome, navigates to each of YouTube's 4 trending categories, collects the titles of the top 10 trending videos in each, and stores them in an Excel file.
- Imported libraries
  - BeautifulSoup
  - Selenium
  - Pandas
  - Requests

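
As a rough illustration of the navigation step, the sketch below pairs each category with the `bp` query token the script uses and builds the full trending-page URL. The names `CATEGORY_TOKENS`, `BASE_URL`, and `build_trending_urls` are hypothetical and not part of the script; the category order (Now, Music, Gaming, Movies) is assumed from the order in which the script groups its results.

```python
# Hypothetical mapping of category name to the `bp` query token used by the script.
# The pairing of names to tokens is an assumption based on the script's ordering.
CATEGORY_TOKENS = {
    "Now": "6gQJRkVleHBsb3Jl",
    "Music": "4gINGgt5dG1hX2NoYXJ0cw%3D%3D",
    "Gaming": "4gIcGhpnYW1pbmdfY29ycHVzX21vc3RfcG9wdWxhcg%3D%3D",
    "Movies": "4gIKGgh0cmFpbGVycw%3D%3D",
}

BASE_URL = "https://www.youtube.com/feed/trending"


def build_trending_urls(tokens=CATEGORY_TOKENS):
    """Return a dict mapping each category name to its full trending-page URL."""
    return {name: f"{BASE_URL}?bp={token}" for name, token in tokens.items()}


if __name__ == "__main__":
    for name, url in build_trending_urls().items():
        print(f"{name}: {url}")
```

Keeping the category name next to its token makes it harder for the scraped titles and their column labels to drift out of step.
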
## Workflow of the Project

1. Import libraries
2. Make the list of URLs to scrape
3. Scrape each URL for the titles of its top 10 videos
4. Store the results in a pandas DataFrame
5. Write the DataFrame to an Excel file

29+
## Setup instructions
30+
31+
- Make sure BeautifulSoup, Selenium, and Pandas are installed
32+
- Make sure Chrome webdriver is installed
33+
- Run the program and an excel file will be created with the scraped information
34+
35+
36+
## Output
37+
38+
![image](WebScrapingScripts\YouTube Trending Videos Scraper\Images\Output.PNG)
39+
40+
41+
## Author(s)
42+
43+
Bhavesh Mandalapu
@@ -0,0 +1 @@
Libraries used: BeautifulSoup, Selenium, Pandas, and Requests
@@ -0,0 +1,47 @@
from bs4 import BeautifulSoup
import requests
from selenium import webdriver
import pandas as pd

# Query strings of the trending pages, one per category
# (in order: Now, Music, Gaming, Movies)
urls = [
    "bp=6gQJRkVleHBsb3Jl",
    "bp=4gINGgt5dG1hX2NoYXJ0cw%3D%3D",
    "bp=4gIcGhpnYW1pbmdfY29ycHVzX21vc3RfcG9wdWxhcg%3D%3D",
    "bp=4gIKGgh0cmFpbGVycw%3D%3D",
]

# Opens Chrome
driver = webdriver.Chrome()

# List of scraped video titles (named video_titles so it does not shadow the built-in list)
video_titles = []

# Goes to each of the URLs in the urls list and gets the video titles of the top 10 trending videos
for url in urls:
    driver.get(f"https://www.youtube.com/feed/trending?{url}")

    content = driver.page_source.encode("utf8").strip()
    soup = BeautifulSoup(content, "lxml")
    titles = soup.find_all("a", id="video-title")

    for i in range(10):
        print(titles[i].text)
        video_titles.append(titles[i].text)

# Closes Chrome once every page has been scraped
driver.quit()

# Makes a dictionary containing the 4 categories as the keys and the values as the top 10
# trending videos in the section (each slice covers exactly 10 titles, with no overlap)
trending_dict = {
    "Now": video_titles[:10],
    "Music": video_titles[10:20],
    "Gaming": video_titles[20:30],
    "Movies": video_titles[30:40],
}

# Makes a dataframe of trending_dict so that it can be made into an excel file
df = pd.DataFrame(trending_dict)

# Makes the excel file; the context manager saves and closes the workbook on exit
with pd.ExcelWriter("YouTube-Trending.xlsx") as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)
