Commit b889629

Copilot and sgbaird committed
Simplify Playwright downloader to use YouTube Studio interface
Co-authored-by: sgbaird <45469701+sgbaird@users.noreply.github.com>
1 parent 00c4eba commit b889629

5 files changed: +163 -293 lines changed

src/ac_training_lab/video_editing/README_playwright.md

Lines changed: 30 additions & 36 deletions
@@ -1,16 +1,15 @@
11
# Playwright YouTube Downloader
22

3-
This module provides an alternative method for downloading YouTube videos using Playwright browser automation. This is particularly useful for downloading private or unlisted videos from owned channels that may not be accessible via traditional methods like yt-dlp.
3+
This module provides a lean method for downloading YouTube videos using Playwright browser automation and the YouTube Studio interface. This is particularly useful for downloading private or unlisted videos from owned channels that may not be accessible via traditional methods like yt-dlp.
44

55
## Features
66

77
- **Browser Automation**: Uses Playwright to automate a real browser session
88
- **Google Account Login**: Automatically logs into a Google account to access owned videos
9-
- **Native YouTube Interface**: Uses YouTube's built-in download functionality
10-
- **Quality Selection**: Supports selecting video quality (720p, 1080p, etc.)
9+
- **YouTube Studio Interface**: Uses the three-dot (vertical ellipsis) menu in YouTube Studio for downloads
10+
- **Simple Configuration**: Minimal environment variables needed
1111
- **Multiple Videos**: Can download multiple videos in sequence
1212
- **Integration**: Integrates with existing yt-dlp functionality
13-
- **Flexible Configuration**: Environment variable based configuration
1413

1514
## Installation
1615

@@ -23,16 +22,14 @@ playwright install chromium
2322

2423
## Configuration
2524

26-
Set up your credentials and preferences using environment variables:
25+
Set up your credentials using environment variables:
2726

2827
```bash
2928
# Required credentials
3029
export GOOGLE_EMAIL="your-email@gmail.com"
3130
export GOOGLE_PASSWORD="your-app-password"
3231

3332
# Optional settings
34-
export YT_DOWNLOAD_DIR="./downloads"
35-
export YT_DEFAULT_QUALITY="720p"
3633
export YT_HEADLESS="true"
3734
export YT_PAGE_TIMEOUT="30000"
3835
export YT_DOWNLOAD_TIMEOUT="300"
@@ -52,13 +49,12 @@ export YT_CHANNEL_ID="UCHBzCfYpGwoqygH9YNh9A6g"
5249
```python
5350
from ac_training_lab.video_editing.playwright_yt_downloader import download_youtube_video_with_playwright
5451

55-
# Download a single video
52+
# Download a video from YouTube Studio
5653
downloaded_file = download_youtube_video_with_playwright(
57-
video_id="dQw4w9WgXcQ",
54+
video_id="cIQkfIUeuSM", # Example video ID from ac-hardware-streams
5855
email="your-email@gmail.com",
5956
password="your-app-password",
60-
download_dir="./downloads",
61-
quality="720p",
57+
channel_id="UCHBzCfYpGwoqygH9YNh9A6g", # ac-hardware-streams channel
6258
headless=True
6359
)
6460

@@ -75,7 +71,6 @@ from ac_training_lab.video_editing.playwright_yt_downloader import YouTubePlaywr
7571
with YouTubePlaywrightDownloader(
7672
email="your-email@gmail.com",
7773
password="your-app-password",
78-
download_dir="./downloads",
7974
headless=False # Show browser for debugging
8075
) as downloader:
8176

@@ -84,12 +79,13 @@ with YouTubePlaywrightDownloader(
8479
downloader.navigate_to_youtube()
8580

8681
# Download multiple videos
87-
video_ids = ["video1", "video2", "video3"]
88-
results = downloader.download_videos_from_list(video_ids, quality="1080p")
82+
video_ids = ["cIQkfIUeuSM", "another_video_id"]
83+
channel_id = "UCHBzCfYpGwoqygH9YNh9A6g" # ac-hardware-streams
8984

90-
for video_id, file_path in results.items():
91-
if file_path:
92-
print(f"{video_id}: {file_path}")
85+
for video_id in video_ids:
86+
result = downloader.download_video(video_id, channel_id)
87+
if result:
88+
print(f"{video_id}: {result}")
9389
else:
9490
print(f"{video_id}: Failed")
9591
```
@@ -108,8 +104,7 @@ manager = YouTubeDownloadManager(use_playwright=True)
108104
result = manager.download_latest_from_channel(
109105
channel_id="UCHBzCfYpGwoqygH9YNh9A6g",
110106
device_name="Opentrons OT-2",
111-
method="playwright", # or "ytdlp"
112-
quality="720p"
107+
method="playwright" # or "ytdlp"
113108
)
114109

115110
if result['success']:
@@ -123,9 +118,9 @@ else:
123118
```bash
124119
# Download specific video with Playwright
125120
python -m ac_training_lab.video_editing.integrated_downloader \
126-
--video-id dQw4w9WgXcQ \
127-
--method playwright \
128-
--quality 720p
121+
--video-id cIQkfIUeuSM \
122+
--channel-id UCHBzCfYpGwoqygH9YNh9A6g \
123+
--method playwright
129124

130125
# Download latest from channel with yt-dlp
131126
python -m ac_training_lab.video_editing.integrated_downloader \
@@ -143,26 +138,25 @@ python -m ac_training_lab.video_editing.integrated_downloader \
143138

144139
1. **Browser Launch**: Starts a Chromium browser instance with download settings
145140
2. **Google Login**: Navigates to Google sign-in and enters credentials
146-
3. **YouTube Navigation**: Goes to YouTube and verifies login status
147-
4. **Video Access**: Navigates to specific video pages
148-
5. **Download Trigger**: Finds and clicks the download button in YouTube's interface
149-
6. **Quality Selection**: Chooses the preferred video quality
150-
7. **Download Monitoring**: Waits for download completion and returns file path
141+
3. **YouTube Studio Navigation**: Goes to YouTube Studio for the specific video
142+
4. **Three-Dot Menu**: Finds and clicks the vertical ellipsis (⋮) button
143+
5. **Download Option**: Selects the "Download" option from the dropdown menu
144+
6. **Download Monitoring**: Waits for download completion and returns file path
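
Steps 3 to 6 above can be sketched roughly as follows. This is a minimal illustration rather than the module's actual implementation: it assumes an already signed-in Playwright `page`, and the Studio URL pattern, the helper name `download_from_studio`, and the `save_dir` parameter are all assumptions made for the example. The two selectors are taken from the Browser Selectors section.

```python
from pathlib import Path


def download_from_studio(page, video_id: str, save_dir: str = ".") -> str:
    """Hypothetical helper: run steps 3-6 on a signed-in Playwright page."""
    # Step 3: open the video in YouTube Studio (URL pattern is an assumption).
    page.goto(f"https://studio.youtube.com/video/{video_id}/edit")

    # Step 4: open the three-dot menu.
    page.click('button[aria-label*="More"]')

    # Steps 5-6: click "Download" and wait for the browser's download event.
    with page.expect_download() as download_info:
        page.click('text="Download"')
    download = download_info.value

    # Save under the filename the browser suggests and return the final path.
    target = Path(save_dir) / download.suggested_filename
    download.save_as(str(target))
    return str(target)
```

Taking `page` as a parameter keeps the sketch independent of how the browser was launched, so the same function could be exercised with a stub object in tests.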
151145

152146
## Browser Selectors
153147

154-
The downloader uses multiple fallback selectors to find YouTube's download interface elements, as these can change over time:
148+
The downloader uses multiple fallback selectors to find YouTube Studio's interface elements, as these can change over time:
155149

156-
- Download buttons: `button[aria-label*="Download"]`, `button:has-text("Download")`, etc.
157-
- Three-dot menus: `button[aria-label*="More actions"]`, `yt-icon-button[aria-label*="More"]`, etc.
158-
- Quality options: Text-based and aria-label selectors
150+
- **Three-dot ellipsis menus**: `button[aria-label*="More"]`, `button:has-text("⋮")`, etc.
151+
- **Download options**: `text="Download"`, `button:has-text("Download")`, etc.
152+
- **Studio pages**: `[data-testid="video-editor"]` for page load verification
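
The fallback idea can be illustrated with a small helper that tries each selector in order and returns the first one that matches. This is only a sketch under assumptions: the function name `first_matching_selector` and the `DOWNLOAD_SELECTORS` constant are hypothetical, and the downloader's real lookup logic may differ.

```python
from typing import Iterable, Optional

# Download-option selectors from the list above, in priority order.
DOWNLOAD_SELECTORS = ['text="Download"', 'button:has-text("Download")']


def first_matching_selector(page, selectors: Iterable[str]) -> Optional[str]:
    """Hypothetical helper: return the first selector with at least one match."""
    for selector in selectors:
        # Locator.count() reports how many elements currently match.
        if page.locator(selector).count() > 0:
            return selector
    # No selector matched; the caller decides how to report the failure.
    return None
```

A caller would then click the returned selector, or treat `None` as the "element not found" case.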
159153

160154
## Error Handling
161155

162156
The system includes comprehensive error handling for:
163157

164158
- **Authentication failures**: Invalid credentials, 2FA requirements
165-
- **Network timeouts**: Configurable timeout values
159+
- **Network timeouts**: Configurable timeout values
166160
- **Element not found**: Multiple selector fallbacks
167161
- **Download failures**: File system and browser download issues
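
One way to surface these failures to callers is to wrap each download attempt and return a result dictionary. The sketch below is an assumption-laden illustration, not the module's code: `attempt_download` and the `error` key are hypothetical, while `success` and `file_path` mirror the keys used by `download_video` in `integrated_downloader.py`.

```python
from typing import Any, Callable, Dict, Optional


def attempt_download(
    download_fn: Callable[[str], Optional[str]], video_id: str
) -> Dict[str, Any]:
    """Hypothetical wrapper: turn exceptions and empty results into a dict."""
    result: Dict[str, Any] = {
        "video_id": video_id,
        "success": False,
        "file_path": None,
        "error": None,  # assumed key, not confirmed by the source
    }
    try:
        file_path = download_fn(video_id)
    except Exception as exc:
        # Covers login failures, timeouts, missing elements, and I/O errors.
        result["error"] = f"{type(exc).__name__}: {exc}"
        return result

    if file_path:
        result["success"] = True
        result["file_path"] = file_path
    else:
        result["error"] = "download returned no file"
    return result
```

Keeping the exception type in the message makes it easier to tell a timeout from an authentication problem when reading logs.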
168162

@@ -175,15 +169,15 @@ The system includes comprehensive error handling for:
175169
- Use an App Password for accounts with 2FA enabled
176170
- Verify account access to target videos
177171

178-
2. **Download Button Not Found**
172+
2. **Three-Dot Menu Not Found**
179173
- Video may not have a download option
180-
- Account may not have permission
181-
- YouTube interface may have changed
174+
- Account may not have permission to access the video
175+
- YouTube Studio interface may have changed
182176

183177
3. **Download Timeout**
184178
- Increase `YT_DOWNLOAD_TIMEOUT`
185179
- Check network connection
186-
- Try lower quality setting
180+
- Ensure sufficient disk space
187181

188182
4. **Browser Issues**
189183
- Run `playwright install chromium`

src/ac_training_lab/video_editing/integrated_downloader.py

Lines changed: 15 additions & 22 deletions
@@ -88,26 +88,23 @@ def download_video_ytdlp(self, video_id: str) -> bool:
8888

8989
def download_video_playwright(self,
9090
video_id: str,
91-
quality: Optional[str] = None) -> Optional[str]:
91+
channel_id: Optional[str] = None) -> Optional[str]:
9292
"""
93-
Download video using Playwright method.
93+
Download video using Playwright method with YouTube Studio.
9494
9595
Args:
9696
video_id: YouTube video ID
97-
quality: Video quality preference
97+
channel_id: YouTube channel ID (optional, helps with navigation)
9898
9999
Returns:
100100
Optional[str]: Path to downloaded file or None if failed
101101
"""
102-
try:
103-
quality = quality or self.config.default_quality
104-
102+
try:
105103
return download_youtube_video_with_playwright(
106104
video_id=video_id,
107105
email=self.config.google_email,
108106
password=self.config.google_password,
109-
download_dir=self.config.download_dir,
110-
quality=quality,
107+
channel_id=channel_id,
111108
headless=self.config.headless
112109
)
113110
except Exception as e:
@@ -117,14 +114,14 @@ def download_video_playwright(self,
117114
def download_video(self,
118115
video_id: str,
119116
method: Optional[str] = None,
120-
quality: Optional[str] = None) -> Dict[str, Any]:
117+
channel_id: Optional[str] = None) -> Dict[str, Any]:
121118
"""
122119
Download video using specified or default method.
123120
124121
Args:
125122
video_id: YouTube video ID
126123
method: Download method ('ytdlp' or 'playwright'), uses default if None
127-
quality: Video quality preference (only for Playwright)
124+
channel_id: YouTube channel ID (only for Playwright method)
128125
129126
Returns:
130127
Dict[str, Any]: Download result with status and file path
@@ -142,7 +139,7 @@ def download_video(self,
142139

143140
try:
144141
if use_playwright:
145-
file_path = self.download_video_playwright(video_id, quality)
142+
file_path = self.download_video_playwright(video_id, channel_id)
146143
if file_path:
147144
result['success'] = True
148145
result['file_path'] = file_path
@@ -164,8 +161,7 @@ def download_latest_from_channel(self,
164161
channel_id: Optional[str] = None,
165162
device_name: Optional[str] = None,
166163
playlist_id: Optional[str] = None,
167-
method: Optional[str] = None,
168-
quality: Optional[str] = None) -> Dict[str, Any]:
164+
method: Optional[str] = None) -> Dict[str, Any]:
169165
"""
170166
Download the latest video from a channel.
171167
@@ -174,7 +170,6 @@ def download_latest_from_channel(self,
174170
device_name: Device name to filter playlists
175171
playlist_id: Specific playlist ID
176172
method: Download method ('ytdlp' or 'playwright')
177-
quality: Video quality preference
178173
179174
Returns:
180175
Dict[str, Any]: Download result
@@ -194,19 +189,19 @@ def download_latest_from_channel(self,
194189
}
195190

196191
# Download the video
197-
return self.download_video(video_id, method, quality)
192+
return self.download_video(video_id, method, channel_id)
198193

199194
def download_multiple_videos(self,
200195
video_ids: List[str],
201196
method: Optional[str] = None,
202-
quality: Optional[str] = None) -> Dict[str, Dict[str, Any]]:
197+
channel_id: Optional[str] = None) -> Dict[str, Dict[str, Any]]:
203198
"""
204199
Download multiple videos.
205200
206201
Args:
207202
video_ids: List of YouTube video IDs
208203
method: Download method ('ytdlp' or 'playwright')
209-
quality: Video quality preference
204+
channel_id: YouTube channel ID (for Playwright method)
210205
211206
Returns:
212207
Dict[str, Dict[str, Any]]: Results for each video
@@ -215,7 +210,7 @@ def download_multiple_videos(self,
215210

216211
for video_id in video_ids:
217212
logger.info(f"Downloading video {video_id} ({len(results)+1}/{len(video_ids)})")
218-
results[video_id] = self.download_video(video_id, method, quality)
213+
results[video_id] = self.download_video(video_id, method, channel_id)
219214

220215
return results
221216

@@ -231,7 +226,6 @@ def main():
231226
parser.add_argument('--playlist-id', help='Specific playlist ID')
232227
parser.add_argument('--method', choices=['ytdlp', 'playwright'],
233228
help='Download method (default: ytdlp)')
234-
parser.add_argument('--quality', default='720p', help='Video quality for Playwright (default: 720p)')
235229
parser.add_argument('--use-playwright', action='store_true',
236230
help='Use Playwright by default')
237231

@@ -251,16 +245,15 @@ def main():
251245
result = manager.download_video(
252246
video_id=args.video_id,
253247
method=args.method,
254-
quality=args.quality
248+
channel_id=args.channel_id
255249
)
256250
else:
257251
# Download latest from channel
258252
result = manager.download_latest_from_channel(
259253
channel_id=args.channel_id,
260254
device_name=args.device_name,
261255
playlist_id=args.playlist_id,
262-
method=args.method,
263-
quality=args.quality
256+
method=args.method
264257
)
265258

266259
# Print result

src/ac_training_lab/video_editing/playwright_config.py

Lines changed: 6 additions & 12 deletions
@@ -1,7 +1,7 @@
11
"""
22
Configuration for Playwright YouTube downloader.
33
4-
This file contains example configuration and credential management
4+
This file contains lean configuration and credential management
55
for the Playwright YouTube downloader.
66
"""
77

@@ -10,7 +10,7 @@
1010

1111

1212
class PlaywrightYTConfig:
13-
"""Configuration class for Playwright YouTube downloader."""
13+
"""Simplified configuration class for Playwright YouTube downloader."""
1414

1515
def __init__(self):
1616
"""Initialize configuration with environment variables and defaults."""
@@ -19,19 +19,17 @@ def __init__(self):
1919
self.google_email = os.getenv("GOOGLE_EMAIL")
2020
self.google_password = os.getenv("GOOGLE_PASSWORD")
2121

22-
# Download settings
23-
self.download_dir = os.getenv("YT_DOWNLOAD_DIR", "./downloads")
24-
self.default_quality = os.getenv("YT_DEFAULT_QUALITY", "720p")
22+
# Browser settings
2523
self.headless = os.getenv("YT_HEADLESS", "true").lower() == "true"
2624

2725
# Timeout settings (page timeout in milliseconds; download timeout in seconds)
2826
self.page_timeout = int(os.getenv("YT_PAGE_TIMEOUT", "30000"))
2927
self.download_timeout = int(os.getenv("YT_DOWNLOAD_TIMEOUT", "300")) # seconds
3028

31-
# Channel and playlist settings
29+
# Channel settings
3230
self.default_channel_id = os.getenv("YT_CHANNEL_ID", "UCHBzCfYpGwoqygH9YNh9A6g")
3331

34-
# Browser settings
32+
# Browser user agent
3533
self.user_agent = os.getenv("YT_USER_AGENT",
3634
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
3735
"(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
@@ -62,8 +60,6 @@ def to_dict(self) -> Dict[str, Any]:
6260
Dict[str, Any]: Configuration as dictionary (excluding sensitive data)
6361
"""
6462
return {
65-
"download_dir": self.download_dir,
66-
"default_quality": self.default_quality,
6763
"headless": self.headless,
6864
"page_timeout": self.page_timeout,
6965
"download_timeout": self.download_timeout,
@@ -73,7 +69,7 @@ def to_dict(self) -> Dict[str, Any]:
7369
}
7470

7571

76-
# Example environment variables setup
72+
# Example environment variables setup (simplified)
7773
EXAMPLE_ENV_VARS = """
7874
# Copy these to your .env file or set as environment variables
7975
@@ -82,8 +78,6 @@ def to_dict(self) -> Dict[str, Any]:
8278
GOOGLE_PASSWORD=your-app-password
8379
8480
# Optional settings
85-
YT_DOWNLOAD_DIR=./downloads
86-
YT_DEFAULT_QUALITY=720p
8781
YT_HEADLESS=true
8882
YT_PAGE_TIMEOUT=30000
8983
YT_DOWNLOAD_TIMEOUT=300
