Use youtube-dl as a library #54
    raise ScraperError('youtube-dl said "{0}".'.format(cpe.output))
except OSError:
    raise ScraperError('youtube-dl not installed or not on PATH.')
import youtube_dl
Is there a reason not to put this with the other imports?
Not really, no. Fixing.
@@ -27,7 +25,7 @@ class YoutubeScraper(object):
     def transform_item(self, item):
         """Converts youtube-dl output to richard fields"""
         return {
-            'title': item['fulltitle'],
+            'title': item['title'],
Are there docs where they list the elements they return? I don't know why some of these fields are changing.
I see https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/youtube.py#L1452
return {
    'id': video_id,
    'uploader': video_uploader,
    'uploader_id': video_uploader_id,
    'upload_date': upload_date,
    'title': video_title,
    'thumbnail': video_thumbnail,
    'description': video_description,
    'categories': video_categories,
    'tags': video_tags,
    'subtitles': video_subtitles,
    'automatic_captions': automatic_captions,
    'duration': video_duration,
    'age_limit': 18 if age_gate else 0,
    'annotations': video_annotations,
    'webpage_url': proto + '://www.youtube.com/watch?v=%s' % video_id,
    'view_count': view_count,
    'like_count': like_count,
    'dislike_count': dislike_count,
    'average_rating': float_or_none(video_info.get('avg_rating', [None])[0]),
    'formats': formats,
    'is_live': is_live,
    'start_time': start_time,
    'end_time': end_time,
}
Looks like "fulltitle" and "title" differ when the title is truncated: https://github.com/rg3/youtube-dl/blob/b3613d36da14ab527166326707c0f911d192144d/youtube_dl/YoutubeDL.py#L1400
but I haven't found "fulltitle" when using a ydl
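If "fulltitle" is only present some of the time, one option is to prefer it when it exists and fall back to "title" otherwise. A minimal sketch of that idea follows; `pick_title` is a hypothetical helper name, not something in this PR or in youtube-dl:

```python
def pick_title(item):
    """Return the untruncated title if present, else the regular title.

    Assumption: 'fulltitle' may be missing from the info dict (as observed
    above), while 'title' is expected to be there.
    """
    return item.get('fulltitle') or item['title']
```

With this, `transform_item` could use `pick_title(item)` for the 'title' field and behave the same whether or not "fulltitle" shows up.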
@redapple channels are a good thing to test. I scrape those too.
I got frustrated by the lack of progress information when running steve-cmd fetch,
so I wanted to use youtube-dl as a library to get metadata from playlists.
Sample output for PyCon India 2015
@willkg, what can I test other than YouTube playlists?
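For reference, here is roughly what "youtube-dl as a library" looks like for metadata-only fetching. This is a sketch, not the code in this PR: `fetch_metadata` and `flatten_entries` are hypothetical names, and the nested "entries" handling assumes channels can come back as playlists of playlists.

```python
try:
    import youtube_dl
except ImportError:  # keeps the pure helper below usable without youtube-dl
    youtube_dl = None


def flatten_entries(info):
    """Flatten an extract_info() result into a list of per-video dicts."""
    if info is None:  # with 'ignoreerrors', failed videos may come back as None
        return []
    if 'entries' in info:  # playlist-like result: recurse into each entry
        videos = []
        for entry in info['entries']:
            videos.extend(flatten_entries(entry))
        return videos
    return [info]


def fetch_metadata(url):
    """Return metadata dicts for a video or playlist URL without downloading."""
    opts = {
        'quiet': True,         # suppress youtube-dl's own console output
        'ignoreerrors': True,  # skip unavailable videos instead of aborting
    }
    with youtube_dl.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=False)
    return flatten_entries(info)
```

Since the caller gets plain dicts back one URL at a time, printing progress per playlist or per video becomes straightforward, which was the original motivation.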