[franceculture] Fix extractor #27903
@@ -36,12 +36,12 @@ def _real_extract(self, url):
</h1>|
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
).*?
(<button[^>]+data-asset-source="[^"]+"[^>]+>)
(<button[^>]+data-(asset-source|url)="[^"]+"[^>]+>)
Don't capture groups you don't use.
I will replace data-(asset-source|url) with data-(?:asset-source|url) in the next commit. Is that what you expect?
Yes.
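For readers following the thread, here is a minimal sketch of the difference between the two patterns. The sample button tag and its URL are made up for illustration and are not taken from the FranceCulture site; only the two regex fragments come from the diff above.

```python
import re

# Hypothetical sample markup, invented for this illustration.
html = '<button class="replay-button" data-url="https://example.com/audio.mp3" type="button">'

# Pattern as first submitted: the inner alternation is a capturing group.
capturing = re.compile(r'(<button[^>]+data-(asset-source|url)="[^"]+"[^>]+>)')

# Pattern after review: (?:...) groups the alternation without capturing it.
non_capturing = re.compile(r'(<button[^>]+data-(?:asset-source|url)="[^"]+"[^>]+>)')

capturing_groups = capturing.search(html).groups()
non_capturing_groups = non_capturing.search(html).groups()

# The capturing version yields an extra group ('url') that nothing reads;
# the non-capturing version yields only the whole button tag.
print(len(capturing_groups), len(non_capturing_groups))
```

Both patterns match the same text; the non-capturing form simply avoids producing a group that the extractor never uses.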
''',
webpage, 'video data'))

video_url = video_data['data-asset-source']
title = video_data.get('data-asset-title') or self._og_search_title(webpage)
video_url = video_data.get('data-asset-source') or video_data.get('data-url')
Video URL is mandatory. Read coding conventions.
Thanks for your help. This is my first contribution to this project, and I'm still a bit lost after reading the coding conventions.
As far as I know, the HTML contains either data-asset-source (on the previous version of the FranceCulture website) or data-url (on the current version). That change is the root cause of the current extractor being broken, and the very reason for this PR. If both are missing, though, video_url will be None. I understand that this is inappropriate and that a default value is needed, yet I can't figure out what to use: none seems to make sense, since the extractor will fail anyway.
What would be an acceptable solution?
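One possible way to reconcile the fallback with the "mandatory" requirement, sketched here on a plain dict rather than inside the extractor: use .get() for the first attribute and plain indexing for the last one, so that a page with neither attribute raises an error instead of silently yielding None. The helper name pick_video_url and the lookup order are assumptions made for this illustration, not necessarily what was merged.

```python
def pick_video_url(video_data):
    """Return the media URL from a dict of button attributes.

    Hypothetical helper for illustration only. The first lookup is
    tolerant (.get), the last one is strict (indexing), so a missing
    URL raises KeyError rather than returning None.
    """
    return video_data.get('data-url') or video_data['data-asset-source']


# Attribute dicts below are invented sample data.
new_site = pick_video_url({'data-url': 'https://example.com/new.mp3'})
old_site = pick_video_url({'data-asset-source': 'https://example.com/old.mp3'})

try:
    pick_video_url({'data-asset-title': 'no url here'})
    missing_raises = False
except KeyError:
    missing_raises = True
```

With this shape no default value is needed: the extractor fails loudly at the point where the mandatory field cannot be found.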
Co-authored-by: Sergey M. <dstftw@gmail.com>
Thanks for this wonderful work!