Skip to content

Commit

Permalink
[cleanup, docs] Misc cleanup
Browse files Browse the repository at this point in the history
Closes yt-dlp#2828, closes yt-dlp#2734, closes yt-dlp#2802, closes yt-dlp#2937
  • Loading branch information
pukkandan committed Mar 8, 2022
1 parent c89bec2 commit 08d3015
Show file tree
Hide file tree
Showing 20 changed files with 114 additions and 87 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ cookies

*.3gp
*.ape
*.ass
*.avi
*.desktop
*.flac
Expand Down Expand Up @@ -106,6 +107,7 @@ yt-dlp.zip
*.iml
.vscode
*.sublime-*
*.code-workspace

# Lazy extractors
*/extractor/lazy_extractors.py
Expand Down
12 changes: 11 additions & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
- [Is anyone going to need the feature?](#is-anyone-going-to-need-the-feature)
- [Is your question about yt-dlp?](#is-your-question-about-yt-dlp)
- [Are you willing to share account details if needed?](#are-you-willing-to-share-account-details-if-needed)
- [Is the website primarily used for piracy](#is-the-website-primarily-used-for-piracy)
- [DEVELOPER INSTRUCTIONS](#developer-instructions)
- [Adding new feature or making overarching changes](#adding-new-feature-or-making-overarching-changes)
- [Adding support for a new site](#adding-support-for-a-new-site)
Expand All @@ -24,6 +25,7 @@
- [Collapse fallbacks](#collapse-fallbacks)
- [Trailing parentheses](#trailing-parentheses)
- [Use convenience conversion and parsing functions](#use-convenience-conversion-and-parsing-functions)
- [My pull request is labeled pending-fixes](#my-pull-request-is-labeled-pending-fixes)
- [EMBEDDING YT-DLP](README.md#embedding-yt-dlp)


Expand Down Expand Up @@ -123,6 +125,10 @@ While these steps won't necessarily ensure that no misuse of the account takes p
- Change the password before sharing the account to something random (use [this](https://passwordsgenerator.net/) if you don't have a random password generator).
- Change the password after receiving the account back.

### Is the website primarily used for piracy?

We follow [youtube-dl's policy](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free) to not support services that is primarily used for infringing copyright. Additionally, it has been decided to not to support porn sites that specialize in deep fake. We also cannot support any service that serves only [DRM protected content](https://en.wikipedia.org/wiki/Digital_rights_management).




Expand Down Expand Up @@ -210,7 +216,7 @@ After you have ensured this site is distributing its content legally, you can fo
}
```
1. Add an import in [`yt_dlp/extractor/extractors.py`](yt_dlp/extractor/extractors.py).
1. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Run `python test/test_download.py TestDownload.test_YourExtractor` (note that `YourExtractor` doesn't end with `IE`). This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, the tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. You can also run all the tests in one go with `TestDownload.test_YourExtractor_all`
1. Make sure you have atleast one test for your extractor. Even if all videos covered by the extractor are expected to be inaccessible for automated testing, tests should still be added with a `skip` parameter indicating why the particular test is disabled from running.
1. Have a look at [`yt_dlp/extractor/common.py`](yt_dlp/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](yt_dlp/extractor/common.py#L91-L426). Add tests and code for as many as you want.
1. Make sure your code follows [yt-dlp coding conventions](#yt-dlp-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart):
Expand Down Expand Up @@ -658,6 +664,10 @@ duration = float_or_none(video.get('durationMs'), scale=1000)
view_count = int_or_none(video.get('views'))
```

# My pull request is labeled pending-fixes

The `pending-fixes` label is added when there are changes requested to a PR. When the necessary changes are made, the label should be removed. However, despite our best efforts, it may sometimes happen that the maintainer did not see the changes or forgot to remove the label. If your PR is still marked as `pending-fixes` a few days after all requested changes have been made, feel free to ping the maintainer who labeled your issue and ask them to re-review and remove the label.




Expand Down
4 changes: 2 additions & 2 deletions CONTRIBUTORS
Original file line number Diff line number Diff line change
Expand Up @@ -146,7 +146,7 @@ chio0hai
cntrl-s
Deer-Spangle
DEvmIb
Grabien
Grabien/MaximVol
j54vc1bk
mpeter50
mrpapersonic
Expand All @@ -160,7 +160,7 @@ PilzAdam
zmousm
iw0nderhow
unit193
TwoThousandHedgehogs
TwoThousandHedgehogs/KathrynElrod
Jertzukka
cypheron
Hyeeji
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites com
clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.3gp *.ape *.ass *.avi *.desktop *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.mp4 *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \
Expand Down
53 changes: 28 additions & 25 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t

* **Other new options**: Many new options have been added such as `--concat-playlist`, `--print`, `--wait-for-video`, `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc

* **Improvements**: Regex and other operators in `--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc
* **Improvements**: Regex and other operators in `--format`/`--match-filter`, multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection), merge multi-video/audio, multiple `--config-locations`, `--exec` at different stages, etc

* **Plugins**: Extractors and PostProcessors can be loaded from an external file. See [plugins](#plugins) for details

Expand All @@ -130,7 +130,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
* The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
* Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
* `--ignore-errors` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
* `--no-abort-on-error` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
* When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files
* `--add-metadata` attaches the `infojson` to `mkv` files in addition to writing the metadata when used with `--write-info-json`. Use `--no-embed-info-json` or `--compat-options no-attach-info-json` to revert this
* Some metadata are embedded into different fields when using `--add-metadata` as compared to youtube-dl. Most notably, `comment` field contains the `webpage_url` and `synopsis` contains the `description`. You can [use `--parse-metadata`](#modifying-metadata) to modify this to your liking or use `--compat-options embed-metadata` to revert this
Expand Down Expand Up @@ -267,7 +267,7 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
* [**pycryptodomex**](https://github.com/Legrandin/pycryptodome) - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD2](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
* [**websockets**](https://github.com/aaugustin/websockets) - For downloading over websocket. Licensed under [BSD3](https://github.com/aaugustin/websockets/blob/main/LICENSE)
* [**secretstorage**](https://github.com/mitya57/secretstorage) - For accessing the Gnome keyring while decrypting cookies of Chromium-based browsers on Linux. Licensed under [BSD](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen is not present. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For embedding thumbnail in mp4/m4a if mutagen/ffmpeg cannot. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**brotli**](https://github.com/google/brotli) or [**brotlicffi**](https://github.com/python-hyper/brotlicffi) - [Brotli](https://en.wikipedia.org/wiki/Brotli) content encoding support. Both licensed under MIT <sup>[1](https://github.com/google/brotli/blob/master/LICENSE) [2](https://github.com/python-hyper/brotlicffi/blob/master/LICENSE) </sup>
* [**rtmpdump**](http://rtmpdump.mplayerhq.hu) - For downloading `rtmp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](http://rtmpdump.mplayerhq.hu)
* [**mplayer**](http://mplayerhq.hu/design7/info.html) or [**mpv**](https://mpv.io) - For downloading `rstp` streams. ffmpeg will be used as a fallback. Licensed under [GPLv2+](https://github.com/mpv-player/mpv/blob/master/Copyright)
Expand All @@ -279,6 +279,7 @@ To use or redistribute the dependencies, you must agree to their respective lice

The Windows and MacOS standalone release binaries are already built with the python interpreter, mutagen, pycryptodomex and websockets included.

<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
**Note**: There are some regressions in newer ffmpeg versions that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds


Expand Down Expand Up @@ -606,11 +607,11 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--write-description etc. (default)
--no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc.
--clean-infojson Remove some private fields such as
--clean-info-json Remove some private fields such as
filenames from the infojson. Note that it
could still contain some personal
information (default)
--no-clean-infojson Write all fields to the infojson
--no-clean-info-json Write all fields to the infojson
--write-comments Retrieve video comments to be placed in the
infojson. The comments are fetched even
without this option if the extraction is
Expand Down Expand Up @@ -1599,25 +1600,28 @@ This option also has a few special uses:
* You can download an additional URL based on the metadata of the currently downloaded video. To do this, set the field `additional_urls` to the URL that you want to download. Eg: `--parse-metadata "description:(?P<additional_urls>https?://www\.vimeo\.com/\d+)` will download the first vimeo video found in the description
* You can use this to change the metadata that is embedded in the media file. To do this, set the value of the corresponding field with a `meta_` prefix. For example, any value you set to `meta_description` field will be added to the `description` field in the file. For example, you can use this to set a different "description" and "synopsis". To modify the metadata of individual streams, use the `meta<n>_` prefix (Eg: `meta1_language`). Any value set to the `meta_` field will overwrite all default values.

**Note**: Metadata modification happens before format selection, post-extraction and other post-processing operations. Some fields may be added or changed during these steps, overriding your changes.

For reference, these are the fields yt-dlp adds by default to the file metadata:

Metadata fields|From
:---|:---
`title`|`track` or `title`
`date`|`upload_date`
`description`, `synopsis`|`description`
`purl`, `comment`|`webpage_url`
`track`|`track_number`
`artist`|`artist`, `creator`, `uploader` or `uploader_id`
`genre`|`genre`
`album`|`album`
`album_artist`|`album_artist`
`disc`|`disc_number`
`show`|`series`
`season_number`|`season_number`
`episode_id`|`episode` or `episode_id`
`episode_sort`|`episode_number`
`language` of each stream|From the format's `language`
Metadata fields | From
:--------------------------|:------------------------------------------------
`title` | `track` or `title`
`date` | `upload_date`
`description`, `synopsis` | `description`
`purl`, `comment` | `webpage_url`
`track` | `track_number`
`artist` | `artist`, `creator`, `uploader` or `uploader_id`
`genre` | `genre`
`album` | `album`
`album_artist` | `album_artist`
`disc` | `disc_number`
`show` | `series`
`season_number` | `season_number`
`episode_id` | `episode` or `episode_id`
`episode_sort` | `episode_number`
`language` of each stream | the format's `language`

**Note**: The file format may not support some of these fields


Expand Down Expand Up @@ -1816,12 +1820,11 @@ ydl_opts = {
}],
'logger': MyLogger(),
'progress_hooks': [my_hook],
# Add custom headers
'http_headers': {'Referer': 'https://www.google.com'}
}


# Add custom headers
yt_dlp.utils.std_headers.update({'Referer': 'https://www.google.com'})

# ℹ️ See the public functions in yt_dlp.YoutubeDL for for other available functions.
# Eg: "ydl.download", "ydl.download_with_info_file"
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
Expand Down
6 changes: 5 additions & 1 deletion devscripts/prepare_manpage.py
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,11 @@ def filter_options(readme):
section = re.search(r'(?sm)^# USAGE AND OPTIONS\n.+?(?=^# )', readme).group(0)
options = '# OPTIONS\n'
for line in section.split('\n')[1:]:
mobj = re.fullmatch(r'\s{4}(?P<opt>-(?:,\s|[^\s])+)(?:\s(?P<meta>([^\s]|\s(?!\s))+))?(\s{2,}(?P<desc>.+))?', line)
mobj = re.fullmatch(r'''(?x)
\s{4}(?P<opt>-(?:,\s|[^\s])+)
(?:\s(?P<meta>(?:[^\s]|\s(?!\s))+))?
(\s{2,}(?P<desc>.+))?
''', line)
if not mobj:
options += f'{line.lstrip()}\n'
continue
Expand Down
2 changes: 1 addition & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@
LONG_DESCRIPTION = '\n\n'.join((
'Official repository: <https://github.com/yt-dlp/yt-dlp>',
'**PS**: Some links in this document will not work since this is a copy of the README.md from Github',
open('README.md', 'r', encoding='utf-8').read()))
open('README.md').read()))

REQUIREMENTS = open('requirements.txt').read().splitlines()

Expand Down
2 changes: 2 additions & 0 deletions yt_dlp/YoutubeDL.py
Original file line number Diff line number Diff line change
Expand Up @@ -235,6 +235,8 @@ class YoutubeDL(object):
See "Sorting Formats" for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats"
for more details.
prefer_free_formats: Whether to prefer video formats with free containers
over non-free ones of same quality.
allow_multiple_video_streams: Allow multiple video streams to be merged
into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged
Expand Down
3 changes: 3 additions & 0 deletions yt_dlp/downloader/youtube_live_chat.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@ class YoutubeLiveChatFD(FragmentFD):
def real_download(self, filename, info_dict):
video_id = info_dict['video_id']
self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
if not self.params.get('skip_download'):
self.report_warning('Live chat download runs until the livestream ends. '
'If you wish to download the video simultaneously, run a separate yt-dlp instance')

fragment_retries = self.params.get('fragment_retries', 0)
test = self.params.get('test', False)
Expand Down
16 changes: 6 additions & 10 deletions yt_dlp/extractor/abematv.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,10 +8,6 @@
from base64 import urlsafe_b64encode
from binascii import unhexlify

import typing
if typing.TYPE_CHECKING:
from ..YoutubeDL import YoutubeDL

from .common import InfoExtractor
from ..aes import aes_ecb_decrypt
from ..compat import (
Expand All @@ -36,24 +32,24 @@

# NOTE: network handler related code is temporary thing until network stack overhaul PRs are merged (#2861/#2862)

def add_opener(self: 'YoutubeDL', handler):
def add_opener(ydl, handler):
''' Add a handler for opening URLs, like _download_webpage '''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
assert isinstance(self._opener, compat_urllib_request.OpenerDirector)
self._opener.add_handler(handler)
assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
ydl._opener.add_handler(handler)


def remove_opener(self: 'YoutubeDL', handler):
def remove_opener(ydl, handler):
'''
Remove handler(s) for opening URLs
@param handler Either handler object itself or handler type.
Specifying handler type will remove all handler which isinstance returns True.
'''
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L426
# https://github.com/python/cpython/blob/main/Lib/urllib/request.py#L605
opener = self._opener
assert isinstance(self._opener, compat_urllib_request.OpenerDirector)
opener = ydl._opener
assert isinstance(ydl._opener, compat_urllib_request.OpenerDirector)
if isinstance(handler, (type, tuple)):
find_cp = lambda x: isinstance(x, handler)
else:
Expand Down
4 changes: 2 additions & 2 deletions yt_dlp/extractor/ant1newsgr.py
Original file line number Diff line number Diff line change
Expand Up @@ -97,8 +97,8 @@ def _real_extract(self, url):
embed_urls = list(Ant1NewsGrEmbedIE._extract_urls(webpage))
if not embed_urls:
raise ExtractorError('no videos found for %s' % video_id, expected=True)
return self.url_result_or_playlist_from_matches(
embed_urls, video_id, info['title'], ie=Ant1NewsGrEmbedIE.ie_key(),
return self.playlist_from_matches(
embed_urls, video_id, info.get('title'), ie=Ant1NewsGrEmbedIE.ie_key(),
video_kwargs={'url_transparent': True, 'timestamp': info.get('timestamp')})


Expand Down
Loading

0 comments on commit 08d3015

Please sign in to comment.