Vimeo - Failed to parse JSON caused by JSONDecodeError #32271

Open
1 task done
xescar opened this issue Jun 4, 2023 · 5 comments
Labels
broken-IE problem with existing site extraction patch-available

Comments

@xescar

xescar commented Jun 4, 2023

Checklist

  • [x] I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2021.12.17
  • [x] I've checked that all provided URLs are alive and playable in a browser
  • [x] I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • [x] I've searched the bugtracker for similar bug reports including closed ones
  • [x] I've read bugs section in FAQ

Verbose log


python2 -m youtube_dl -v -F 'https://player.vimeo.com/video/762862842'

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://player.vimeo.com/video/762862842']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: b8a86dcf1
[debug] Python 2.7.18 (CPython x86_64 64bit) - Linux-5.15.0-73-generic-x86_64-with-LinuxMint-20.3-una - OpenSSL 1.1.1f  31 Mar 2020 - glibc 2.29
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7, rtmpdump 2.4
[debug] Proxy map: {}
[vimeo] 762862842: Downloading webpage
ERROR: 762862842: Failed to parse JSON  (caused by ValueError('Extra data: line 1 column 12467 - line 2 column 1291 (char 12466 - 13766)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "youtube_dl/extractor/common.py", line 907, in _parse_json
    return json.loads(json_string)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 367, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 12467 - line 2 column 1291 (char 12466 - 13766)
Traceback (most recent call last):
  File "youtube_dl/YoutubeDL.py", line 825, in wrapper
    return func(self, *args, **kwargs)
  File "youtube_dl/YoutubeDL.py", line 846, in __extract_info
    ie_result = ie.extract(url)
  File "youtube_dl/extractor/common.py", line 535, in extract
    ie_result = self._real_extract(url)
  File "youtube_dl/extractor/vimeo.py", line 677, in _real_extract
    r'(?s)\b(?:playerC|c)onfig\s*=\s*({.+?})\s*[;\n]', webpage, 'info section'), video_id)
  File "youtube_dl/extractor/common.py", line 911, in _parse_json
    raise ExtractorError(errmsg, cause=ve)
ExtractorError: 762862842: Failed to parse JSON  (caused by ValueError('Extra data: line 1 column 12467 - line 2 column 1291 (char 12466 - 13766)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I have this issue not only with Python 2.7 but also with Python 3.8.

I have cloned the git repository to be 100% sure that I have the most recent version, and the error still persists.

It fails for public Vimeo videos. I was able to download this video in the past.

Maybe this issue has something in common with issue #32258 ([VK] video not downloading (Failed to parse JSON)).

@dirkf
Contributor

dirkf commented Jun 4, 2023

Only incidentally. Vimeo has changed some part of the page (again) and we're not matching it properly.

yt-dlp succeeds with this page because it has a special "lenient" JSON parser that ignores extra data.
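
As a rough illustration of what "ignoring extra data" means, here is a minimal sketch that uses only the standard library's `json.JSONDecoder.raw_decode`. It is not yt-dlp's actual code; the helper name `parse_json_lenient` and the sample string are made up for this example.

```python
import json


def parse_json_lenient(text):
    """Decode the first JSON value in text and ignore whatever follows it."""
    # raw_decode() does not skip leading whitespace, so strip it first.
    obj, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return obj


# Hypothetical over-matched capture: valid JSON followed by page markup.
captured = '{"view": 1}</script><script>var x = {a: 1}'

try:
    json.loads(captured)
except ValueError as err:
    # The strict parser rejects the trailing markup as "Extra data".
    print('strict parse failed:', err)

print(parse_json_lenient(captured))
```

The strict `json.loads` call fails in the same way as the log above, while the lenient helper returns only the leading object.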

Fix options:

  1. change the search pattern so that it doesn't match beyond the JSON that we want
  2. change the JSON parser to detect this case and retry
  3. back-port the yt-dlp JSON parser.
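
Option 2 could look roughly like the sketch below. It assumes Python 3.5 or later, where `json.JSONDecodeError` exposes `msg` and `pos`; Python 2 raises a plain `ValueError` without those attributes, so this would not work there unchanged. The function name `parse_json_with_retry` is hypothetical and not part of youtube-dl.

```python
import json


def parse_json_with_retry(text):
    """Parse strictly; on an 'Extra data' error, retry on the valid prefix."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        if err.msg != 'Extra data':
            # Any other decoding problem is a genuine error.
            raise
        # err.pos is the index at which the unexpected trailing data starts.
        return json.loads(text[:err.pos])
```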

Fix #3 is best in the long term, but fix #1 is easy:

--- old/youtube_dl/extractor/vimeo.py
+++ new/youtube_dl/extractor/vimeo.py
@@ -674,7 +674,7 @@
 
         if '//player.vimeo.com/video/' in url:
             config = self._parse_json(self._search_regex(
-                r'(?s)\b(?:playerC|c)onfig\s*=\s*({.+?})\s*[;\n]', webpage, 'info section'), video_id)
+                r'(?s)\b(?:playerC|c)onfig\s*=\s*(\{.+?})\s*(?:[;\n]|</script>)', webpage, 'info section'), video_id)
             if config.get('view') == 4:
                 config = self._verify_player_video_password(
                     redirect_url, video_id, headers)

You can apply this to your checked-out yt-dl module to get this result:

$ python -m youtube_dl -v -F 'https://player.vimeo.com/video/762862842'
[debug] System config: [u'--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://player.vimeo.com/video/762862842']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 5966b4309
[debug] Python 2.7.18 (CPython i686 32bit) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial - OpenSSL 1.1.1t  7 Feb 2023 - glibc 2.15
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[vimeo] 762862842: Downloading webpage
[vimeo] 762862842: Downloading akfire_interconnect_quic m3u8 information
[vimeo] 762862842: Downloading akfire_interconnect_quic m3u8 information
[vimeo] 762862842: Downloading fastly_skyfire m3u8 information
[vimeo] 762862842: Downloading fastly_skyfire m3u8 information
[vimeo] 762862842: Downloading akfire_interconnect_quic MPD information
[vimeo] 762862842: Downloading akfire_interconnect_quic MPD information
[vimeo] 762862842: Downloading fastly_skyfire MPD information
[vimeo] 762862842: Downloading fastly_skyfire MPD information
[info] Available formats for 762862842:
format code                                          extension  resolution note
hls-akfire_interconnect_quic_sep-audio-medium-audio  mp4        audio only 
hls-fastly_skyfire_sep-audio-medium-audio            mp4        audio only 
dash-akfire_interconnect_quic_sep-audio-f73b48d3     m4a        audio only DASH audio   64k , m4a_dash container, mp4a.40.2 (24000Hz)
dash-fastly_skyfire_sep-audio-f73b48d3               m4a        audio only DASH audio   64k , m4a_dash container, mp4a.40.2 (24000Hz)
dash-akfire_interconnect_quic_sep-audio-c4b195b3     m4a        audio only DASH audio   67k , m4a_dash container, opus  (48000Hz)
dash-fastly_skyfire_sep-audio-c4b195b3               m4a        audio only DASH audio   67k , m4a_dash container, opus  (48000Hz)
dash-akfire_interconnect_quic_sep-audio-33b5709a     m4a        audio only DASH audio   99k , m4a_dash container, opus  (48000Hz)
dash-fastly_skyfire_sep-audio-33b5709a               m4a        audio only DASH audio   99k , m4a_dash container, opus  (48000Hz)
dash-akfire_interconnect_quic_sep-audio-ea18f4fb     m4a        audio only DASH audio  127k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-fastly_skyfire_sep-audio-ea18f4fb               m4a        audio only DASH audio  127k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-akfire_interconnect_quic_sep-audio-449c0d17     m4a        audio only DASH audio  191k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-fastly_skyfire_sep-audio-449c0d17               m4a        audio only DASH audio  191k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-akfire_interconnect_quic_sep-audio-bf242b57     m4a        audio only DASH audio  255k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-fastly_skyfire_sep-audio-bf242b57               m4a        audio only DASH audio  255k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-akfire_interconnect_quic_sep-video-f73b48d3     mp4        426x240    DASH video  341k , mp4_dash container, avc1.640015, 25fps, video only
dash-fastly_skyfire_sep-video-f73b48d3               mp4        426x240    DASH video  341k , mp4_dash container, avc1.640015, 25fps, video only
hls-akfire_interconnect_quic_sep-435                 mp4        426x240     435k , avc1.640015, 25.0fps, video only
hls-fastly_skyfire_sep-435                           mp4        426x240     435k , avc1.640015, 25.0fps, video only
dash-akfire_interconnect_quic_sep-video-ea18f4fb     mp4        640x360    DASH video  777k , mp4_dash container, avc1.64001E, 25fps, video only
dash-fastly_skyfire_sep-video-ea18f4fb               mp4        640x360    DASH video  777k , mp4_dash container, avc1.64001E, 25fps, video only
hls-akfire_interconnect_quic_sep-809                 mp4        640x360     809k , avc1.64001E, 25.0fps, video only
hls-fastly_skyfire_sep-809                           mp4        640x360     809k , avc1.64001E, 25.0fps, video only
hls-akfire_interconnect_quic_sep-1357                mp4        960x540    1357k , avc1.64001F, 25.0fps, video only
hls-fastly_skyfire_sep-1357                          mp4        960x540    1357k , avc1.64001F, 25.0fps, video only
dash-akfire_interconnect_quic_sep-video-bf242b57     mp4        960x540    DASH video 1614k , mp4_dash container, avc1.64001F, 25fps, video only
dash-fastly_skyfire_sep-video-bf242b57               mp4        960x540    DASH video 1614k , mp4_dash container, avc1.64001F, 25fps, video only
http-240p                                            mp4        426x240    25fps
hls-akfire_interconnect_quic-372                     mp4        426x240     372k , avc1.640015, 25.0fps, mp4a.40.2
hls-fastly_skyfire-372                               mp4        426x240     372k , avc1.640015, 25.0fps, mp4a.40.2
dash-akfire_interconnect_quic-video-f73b48d3         mp4        426x240    DASH video  405k , mp4_dash container, avc1.640015, 25fps, mp4a.40.2 (24000Hz)
dash-fastly_skyfire-video-f73b48d3                   mp4        426x240    DASH video  405k , mp4_dash container, avc1.640015, 25fps, mp4a.40.2 (24000Hz)
http-360p                                            mp4        640x360    25fps
hls-akfire_interconnect_quic-809                     mp4        640x360     809k , avc1.64001E, 25.0fps, mp4a.40.2
hls-fastly_skyfire-809                               mp4        640x360     809k , avc1.64001E, 25.0fps, mp4a.40.2
dash-akfire_interconnect_quic-video-ea18f4fb         mp4        640x360    DASH video  904k , mp4_dash container, avc1.64001E, 25fps, mp4a.40.2 (48000Hz)
dash-fastly_skyfire-video-ea18f4fb                   mp4        640x360    DASH video  904k , mp4_dash container, avc1.64001E, 25fps, mp4a.40.2 (48000Hz)
http-540p                                            mp4        960x540    25fps
hls-akfire_interconnect_quic-1485                    mp4        960x540    1485k , avc1.64001F, 25.0fps, mp4a.40.2
hls-fastly_skyfire-1485                              mp4        960x540    1485k , avc1.64001F, 25.0fps, mp4a.40.2
dash-akfire_interconnect_quic-video-bf242b57         mp4        960x540    DASH video 1869k , mp4_dash container, avc1.64001F, 25fps, mp4a.40.2 (48000Hz)
dash-fastly_skyfire-video-bf242b57                   mp4        960x540    DASH video 1869k , mp4_dash container, avc1.64001F, 25fps, mp4a.40.2 (48000Hz) (best)
$
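
To see why the original pattern ran past the JSON while the patched one stops in time, the two regexes from the patch can be compared on a small page fragment. The fragment below is invented for illustration; that Vimeo's config object is now followed directly by `</script>` is an assumption inferred from the patch, not something confirmed in this thread.

```python
import json
import re

# Patterns copied from the patch above (old line and new line).
OLD = r'(?s)\b(?:playerC|c)onfig\s*=\s*({.+?})\s*[;\n]'
NEW = r'(?s)\b(?:playerC|c)onfig\s*=\s*(\{.+?})\s*(?:[;\n]|</script>)'

# Made-up fragment: the config JSON is closed by </script>, not by ';' or a
# newline, and a later script contains a '}' that is followed by ';'.
page = (
    '<script>window.playerConfig = '
    '{"view": 1, "video": {"id": 762862842}}</script>'
    '<script>var other = {a: 1};\n</script>'
)

old_capture = re.search(OLD, page).group(1)
new_capture = re.search(NEW, page).group(1)

# The old pattern keeps going until it finds '}' followed by ';' or '\n',
# so it swallows the start of the next script as well.
print(old_capture)

# The new pattern also accepts '</script>' as a terminator and stops there.
print(json.loads(new_capture))
```

Passing `old_capture` to `json.loads` raises the same kind of "Extra data" error as in the original report.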

@limes007

Same problem here, the patch above is working. Thanks @dirkf

@dirkf
Contributor

dirkf commented Jun 23, 2023

The patch should have been committed by now ...

@limes007

Would be good, but is not.

@dirkf
Contributor

dirkf commented Jun 24, 2023

I mean, I should have done it. Will happen, but some prior QA is needed.


3 participants