Unable to extract urplayer data #30504

Closed
Tygry1 opened this issue Jan 13, 2022 · 3 comments · Fixed by #30506

Comments

@Tygry1

Tygry1 commented Jan 13, 2022

I am using the latest version, and when I try to download a video from this page I get "unable to extract urplayer data".

https://urplay.se/program/214542-historien-om-kalla-kriget-skrack-styr-varlden

Anyone know a fix?

@dirkf
Contributor

dirkf commented Jan 13, 2022

The site has moved its React-ish hydration JSON from an element tagged with data-react-class to a typical Next.js structure in a <script> with id __NEXT_DATA__.
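As a rough illustration of that structure, here is a minimal standalone sketch that pulls the JSON out of a `__NEXT_DATA__` script and walks to `props.pageProps.program`. The regular expression and key path are the ones used in the patch in this comment; the function name `extract_program` and the sample HTML are made up for the example and are not part of youtube-dl.

```python
import json
import re

# Same pattern as in the patch: capture the JSON object inside the
# <script> element whose id is __NEXT_DATA__.
NEXT_DATA_RE = re.compile(
    r'(?s)\bid\s*=\s*"__NEXT_DATA__"[^>]*>\s*({.+?})\s*</script')


def extract_program(webpage):
    """Return the props.pageProps.program dict from a page, or None."""
    match = NEXT_DATA_RE.search(webpage)
    if not match:
        return None
    try:
        data = json.loads(match.group(1))
    except ValueError:
        return None
    # Walk the nested keys defensively; any missing level yields None.
    for key in ('props', 'pageProps', 'program'):
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data if isinstance(data, dict) else None


# Hypothetical page content, only for demonstration.
sample = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"program": {"id": 214542, "title": "Example"}}}}'
    '</script></body></html>'
)
program = extract_program(sample)
print(program['title'])
```

Returning `None` at every failure point mirrors the non-fatal lookups in the patch, which leave the caller free to fall back to the older `data-react-props` markup.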

Fortunately the JSON is much the same, as far as the code shows, though we get a valid but different category in the test. We can also extract an age_limit:

--- old/youtube-dl/youtube_dl/extractor/urplay.py
+++ new/youtube-dl/youtube_dl/extractor/urplay.py
@@ -5,6 +5,9 @@
 from ..utils import (
     dict_get,
+    ExtractorError,
     int_or_none,
+    parse_age_limit,
+    try_get,
     unified_timestamp,
 )
 
@@ -23,9 +26,10 @@
             'upload_date': '20171214',
             'series': 'UR Samtiden - Livet, universum och rymdens märkliga musik',
             'duration': 2269,
-            'categories': ['Kultur & historia'],
+            'categories': ['Vetenskap & teknik'],
             'tags': ['Kritiskt tänkande', 'Vetenskap', 'Vetenskaplig verksamhet'],
             'episode': 'Om vetenskap, kritiskt tänkande och motstånd',
+            'age_limit': 15,
         },
     }, {
         'url': 'https://urskola.se/Produkter/190031-Tripp-Trapp-Trad-Sovkudde',
@@ -51,10 +55,19 @@
         url = url.replace('skola.se/Produkter', 'play.se/program')
         webpage = self._download_webpage(url, video_id)
         vid = int(video_id)
+        urplayer_data = self._search_regex(
+            r'(?s)\bid\s*=\s*"__NEXT_DATA__"[^>]*>\s*({.+?})\s*</script',
+            webpage, 'urplayer next data', fatal=False) or {}
+        if urplayer_data:
+            urplayer_data = self._parse_json(urplayer_data, video_id, fatal=False)
+            urplayer_data = try_get(urplayer_data, lambda x: x['props']['pageProps']['program'], dict)
+            if not urplayer_data:
+                raise ExtractorError('Unable to parse __NEXT_DATA__')
+        else:
-        accessible_episodes = self._parse_json(self._html_search_regex(
-            r'data-react-class="routes/Product/components/ProgramContainer/ProgramContainer"[^>]+data-react-props="({.+?})"',
-            webpage, 'urplayer data'), video_id)['accessibleEpisodes']
-        urplayer_data = next(e for e in accessible_episodes if e.get('id') == vid)
+            accessible_episodes = self._parse_json(self._html_search_regex(
+                r'data-react-class="routes/Product/components/ProgramContainer/ProgramContainer"[^>]+data-react-props="({.+?})"',
+                webpage, 'urplayer data'), video_id)['accessibleEpisodes']
+            urplayer_data = next(e for e in accessible_episodes if e.get('id') == vid)
         episode = urplayer_data['title']
         raw_streaming_info = urplayer_data['streamingInfo']['raw']
         host = self._download_json(
@@ -104,4 +117,6 @@
             'season': series.get('label'),
             'episode': episode,
             'episode_number': int_or_none(urplayer_data.get('episodeNumber')),
+            'age_limit': parse_age_limit(min([try_get(a, lambda x: x['from'], int) or 0
+                                              for a in urplayer_data.get('ageRanges', [])] or [0])),
         }
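The age-limit expression at the end of the patch can be tried in isolation. The sketch below assumes, as the patch does, that `ageRanges` is a list of dicts that may carry an integer `from` key; the helper name `min_age_from` and the sample data are hypothetical. Unlike the patch, this version returns `None` for an empty list, which is one possible way to signal "unknown".

```python
def min_age_from(age_ranges):
    """Smallest integer 'from' value in a list of age ranges, or None if empty."""
    lows = []
    for entry in age_ranges or []:
        value = entry.get('from') if isinstance(entry, dict) else None
        # As in the patch, a missing or non-integer 'from' counts as 0.
        lows.append(value if isinstance(value, int) else 0)
    # min() raises ValueError on an empty sequence, so guard that case.
    return min(lows) if lows else None


# Hypothetical data shaped like urplayer_data['ageRanges'].
print(min_age_from([{'from': 16}, {'from': 15}]))
print(min_age_from([]))
```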

To do: subtitles may be available in urplayer_data['streamingInfo']['sweComplete'].

Then:

$ python -m youtube_dl -F -v 'https://urplay.se/program/214542-historien-om-kalla-kriget-skrack-styr-varlden'
[debug] System config: [u'--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-F', u'-v', u'https://urplay.se/program/214542-historien-om-kalla-kriget-skrack-styr-varlden']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 5014bd67c
[debug] Python version 2.7.17 (CPython) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[URPlay] 214542: Downloading webpage
[URPlay] 214542: Downloading JSON metadata
[URPlay] 214542: Downloading m3u8 information
[URPlay] 214542: Downloading MPD manifest
[URPlay] 214542: Downloading m3u8 information
[URPlay] 214542: Downloading MPD manifest
[info] Available formats for 214542:
format code           extension  resolution note
dash-p0aa0br125561-0  m4a        audio only [eng] DASH audio  125k , m4a_dash container, mp4a.40.2 (44100Hz)
dash-p0aa0br125561-1  m4a        audio only [eng] DASH audio  125k , m4a_dash container, mp4a.40.2 (44100Hz)
dash-p0va0br705357-0  mp4        640x360    DASH video  705k , mp4_dash container, avc1.42001e, 25fps, video only
dash-p0va0br705357-1  mp4        640x360    DASH video  705k , mp4_dash container, avc1.42001e, 25fps, video only
hls-923-0             mp4        640x360     923k , avc1.42001e, mp4a.40.2
hls-923-1             mp4        640x360     923k , avc1.42001e, mp4a.40.2 (best)
$

@Tygry1
Author

Tygry1 commented Jan 13, 2022

As a complete beginner, should I copy and paste that into CMD to download the video?

@dirkf
Contributor

dirkf commented Jan 13, 2022

As a complete beginner. ...

... wait for an update in the next release!

Patching your yt-dl has been discussed elsewhere owing to the lack of releases, but it's particularly tricky for the single self-extracting executable archive build typically used for Windows. In principle, you find the extractor/urplay.py file, either extracting the program source from your build or downloading it, patch it with the patch program (or manually), then rebuild your yt-dl.
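For a source or pip-style install, the first step (finding the extractor file) can be sketched as below. This is only an illustration: the helper name `locate_module_source` is made up, and it assumes the package is importable as `youtube_dl` from the Python that runs the snippet. For the single-file Windows executable the reported location, if any, would not be a plain editable file.

```python
import importlib.util


def locate_module_source(module_name):
    """Return the file path Python would load module_name from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError, an ImportError.
        return None
    return spec.origin if spec is not None else None


# Prints None when youtube_dl is not installed for this interpreter.
print(locate_module_source('youtube_dl.extractor.urplay'))
```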

Hence the advice above.

However, running the patched extractor with -g/--get-url gives this URL, which you should be able to pass to the release yt-dl:

http://streaming10.ur.se/urplay/_definst_/mp4:se/214000-214999/214542-4.mp4/manifest.mpd
