This repository has been archived by the owner on Nov 11, 2019. It is now read-only.

Use youtube-dl as a library #54

Open
wants to merge 4 commits into
base: master

redapple

I got frustrated by the lack of progress information when running steve-cmd fetch,
so I wanted to use youtube-dl as a library to get metadata from playlists.

Sample output for PyCon India 2015

@willkg,
what can I test other than YouTube playlists?
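
For readers following along, here is a minimal sketch of what "using youtube-dl as a library" for playlist metadata can look like. It is not the code from this PR. The helper names `iter_playlist_entries` and `fetch_playlist_metadata` are made up for illustration, and the option values are assumptions; only `youtube_dl.YoutubeDL` and `extract_info(url, download=False)` are youtube-dl's own API.

```python
import sys

try:
    import youtube_dl
except ImportError:  # lets the pure helper below run without youtube-dl
    youtube_dl = None


def iter_playlist_entries(info):
    """Yield per-video dicts from an extract_info() result.

    Hypothetical helper: a playlist result is assumed to carry its videos
    under 'entries'; a single-video result is yielded as-is.
    """
    entries = info.get('entries')
    if entries is None:
        yield info
        return
    for entry in entries:
        # Entries may be None when a video failed and errors were ignored.
        if entry is not None:
            yield entry


def fetch_playlist_metadata(url):
    """Fetch metadata only (no download) and report progress per video."""
    if youtube_dl is None:
        raise RuntimeError('youtube-dl is not installed.')
    opts = {'quiet': True, 'ignoreerrors': True}  # assumed option choices
    with youtube_dl.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=False)
    if info is None:
        return []
    items = []
    for count, entry in enumerate(iter_playlist_entries(info), start=1):
        sys.stderr.write('fetched {0}: {1}\n'.format(count, entry.get('title')))
        items.append(entry)
    return items
```

Calling `fetch_playlist_metadata()` needs network access; `iter_playlist_entries()` can be exercised with plain dicts.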

            raise ScraperError('youtube-dl said "{0}".'.format(cpe.output))
        except OSError:
            raise ScraperError('youtube-dl not installed or not on PATH.')
        import youtube_dl
Contributor

Is there a reason not to put this with the other imports?

Author

Not really, no. Fixing.

@@ -27,7 +25,7 @@ class YoutubeScraper(object):
     def transform_item(self, item):
         """Converts youtube-dl output to richard fields"""
         return {
-            'title': item['fulltitle'],
+            'title': item['title'],
Contributor

Are there docs where they list the elements they return? I don't know why some of these fields are changing.

Author

I see https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/youtube.py#L1452

        return {
            'id': video_id,
            'uploader': video_uploader,
            'uploader_id': video_uploader_id,
            'upload_date': upload_date,
            'title': video_title,
            'thumbnail': video_thumbnail,
            'description': video_description,
            'categories': video_categories,
            'tags': video_tags,
            'subtitles': video_subtitles,
            'automatic_captions': automatic_captions,
            'duration': video_duration,
            'age_limit': 18 if age_gate else 0,
            'annotations': video_annotations,
            'webpage_url': proto + '://www.youtube.com/watch?v=%s' % video_id,
            'view_count': view_count,
            'like_count': like_count,
            'dislike_count': dislike_count,
            'average_rating': float_or_none(video_info.get('avg_rating', [None])[0]),
            'formats': formats,
            'is_live': is_live,
            'start_time': start_time,
            'end_time': end_time,
        }
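
To make the field question concrete, here is a rough sketch of mapping a few of the keys listed above onto a flat record. The output key names (`summary`, `thumbnail_url`, `source_url`, `recorded`) are placeholders picked for illustration and are not richard's actual schema. It also assumes `upload_date` arrives as a `YYYYMMDD` string, as youtube-dl documents it.

```python
from datetime import datetime


def transform_item(item):
    """Map a youtube-dl info dict to a flat metadata record.

    The output key names are hypothetical, not richard's real fields.
    """
    upload_date = item.get('upload_date')  # 'YYYYMMDD' string or None
    recorded = None
    if upload_date:
        recorded = datetime.strptime(upload_date, '%Y%m%d').date().isoformat()
    return {
        'title': item['title'],
        # Optional keys can be missing or None, so fall back to defaults.
        'summary': item.get('description') or '',
        'tags': list(item.get('tags') or []),
        'duration': item.get('duration'),
        'thumbnail_url': item.get('thumbnail'),
        'source_url': item.get('webpage_url'),
        'recorded': recorded,
    }
```

Using `.get()` for everything except `title` keeps the transform from raising `KeyError` when an extractor omits an optional key.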

Author

Looks like "fulltitle" and "title" differ when the title is truncated: https://github.com/rg3/youtube-dl/blob/b3613d36da14ab527166326707c0f911d192144d/youtube_dl/YoutubeDL.py#L1400

but I haven't found "fulltitle" in the results when using a ydl (YoutubeDL) instance.
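
One way to stay safe whichever key shows up is a small fallback. This is only a sketch, and `pick_title` is a hypothetical helper, not something in steve or youtube-dl:

```python
def pick_title(item):
    """Prefer the untruncated 'fulltitle' when present, else use 'title'."""
    # 'fulltitle' may be absent (or empty) depending on how the info dict
    # was produced, so fall back to the always-expected 'title' key.
    return item.get('fulltitle') or item['title']
```

With this, `transform_item` could call `pick_title(item)` instead of indexing either key directly.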

@codersquid
Contributor

@redapple channels are a good thing to test. I scrape those too.

@redapple
Author
