Description
Test command: youtube-dlc ytsearch:lol --flat-playlist -J --verbose
For searches, youtube-dl/c tries to download a JSON-encoded representation of the search page that contains HTML strings, as seen around youtube.py:3289:
data = self._download_json( ...
html_content = data[1]['body']['content']
However, when this code is executed, the _download_json line fails because it tries to parse HTML as JSON. This is because the query parameter that youtube-dl/c was using, spf=navigate, is now ignored by YouTube, so YouTube just returns an ordinary HTML page of results.
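The failure mode can be reproduced in isolation with the standard json module. This is only a sketch, not youtube-dl's own code path: html_page is a made-up stand-in for the real response body, and parse_search_response is a hypothetical helper name.

```python
import json

# Made-up stand-in for the ordinary HTML results page YouTube now returns.
html_page = '<!DOCTYPE html><html><body>results</body></html>'


def parse_search_response(text):
    """Return the decoded JSON, or None if the body is not JSON (e.g. HTML)."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# The HTML page cannot be decoded, so the helper reports None.
print(parse_search_response(html_page))
```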
There may now be a different query parameter that gets the results in the same format, but if there is, I don't know what it is.
Otherwise we'll have to request the data from YouTube in a different format. Here's how far I've got with that:
post_data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20201022.01.01',
        }
    },
    'query': query,  # the search query goes here
}
result_url = 'https://www.youtube.com/youtubei/v1/search?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8'  # this key is the same globally
and add these parameters to _download_json:
    data=json.dumps(post_data).encode('utf-8'),
    headers={
        'content-type': 'application/json'
    }
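For anyone who wants to try the request outside of youtube-dl/c, the payload, URL and headers above can be put together with only the standard library. This is a rough sketch, not a patch: build_search_request and search are hypothetical names, and the key and client version are simply copied from above, so they may stop working at any time.

```python
import json
import urllib.request

# Values copied from the snippet above; YouTube may change or reject them.
SEARCH_URL = ('https://www.youtube.com/youtubei/v1/search'
              '?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8')
CLIENT_VERSION = '2.20201022.01.01'


def build_search_request(query):
    """Build (but do not send) the POST request for a search query."""
    post_data = {
        'context': {
            'client': {
                'clientName': 'WEB',
                'clientVersion': CLIENT_VERSION,
            }
        },
        'query': query,
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(post_data).encode('utf-8'),
        headers={'content-type': 'application/json'},
    )


def search(query):
    """Send the request and return the decoded JSON (needs network access)."""
    with urllib.request.urlopen(build_search_request(query)) as response:
        return json.loads(response.read().decode('utf-8'))
```

Calling search('lol') should then return the decoded response as a dict, assuming YouTube still accepts this key and client version.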
Now you have a pure JSON representation of the results, which you can step into with:
data['contents']['twoColumnSearchResultsRenderer']['primaryContents']['sectionListRenderer']['contents'][1]['itemSectionRenderer']['contents']
Depending on the search terms, the index 1 is sometimes 0 instead.
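One way to cope with the shifting index is to try 1 first and fall back to 0. The helper below is a sketch under that assumption: extract_search_items is a hypothetical name, and I have only the behaviour described above to go on, so YouTube may well use other layouts that it does not handle.

```python
def extract_search_items(data):
    """Return the list of result items from a decoded search response.

    Tries section index 1 first, then 0, and uses the first one that
    actually holds an itemSectionRenderer. Returns [] if neither does.
    """
    sections = (data['contents']['twoColumnSearchResultsRenderer']
                ['primaryContents']['sectionListRenderer']['contents'])
    for index in (1, 0):
        if index < len(sections):
            renderer = sections[index].get('itemSectionRenderer')
            if renderer is not None:
                return renderer.get('contents', [])
    return []
```

The returned list would still need to be turned into whatever the rest of the extractor expects.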
I don't have the energy to continue arranging the data into a format that the rest of the code likes. Hopefully someone can pick up from my work.
Peace.