Co-authored by: glenn-slayden
Modified from c9a9ccf8a3
These improvements apply to reading the list of URLs from the file supplied via the `--batch-file` (`-a`) command line option.
1. Skip blank and empty lines in the file. Currently, lines with leading whitespace are only skipped when that whitespace is followed by a comment character (`#`, `;`, or `]`). This means that empty lines and lines consisting only of whitespace are returned as (trimmed) empty strings in the list of URLs to process.
2. [bug fix] Detect and remove the Unicode BOM when the file descriptor is already decoding Unicode.
With Python 3, the `batch_fd` enumerator returns the lines of the file as Unicode. For UTF-8, this means that the raw BOM bytes from the file `\xef \xbb \xbf` show up converted into a single `\ufeff` character prefixed to the first enumerated text line.
This fix solves several buggy interactions between the presence of BOM, the skipping of comments and/or blank lines, and ensuring the list of URLs is consistently trimmed. For example, if the first line of the file is blank, the BOM is incorrectly returned as a URL standing alone. If the first line contains a URL, it will be prefixed with this unwanted single character--but note that its being there will have inhibited the proper trimming of any leading whitespace. Currently, the `UnicodeBOMIE` helper attempts to recover from some of these error cases, but this fix prevents the error from happening in the first place (at least on Python3). In any case, the `UnicodeBOMIE` approach is flawed, because it is clearly illogical for a BOM to appear in the (non-batch) URL(s) specified directly on the command line (and for that matter, on URLs *after the first line* of a batch list, also)
3. Adds proper trimming of the " #" into the read_batch_urls processing so that the URLs it enumerates are cleaned and trimmed more consistently.
Sometimes, video files will arrive with a timecode data stream
that causes `-map 0` to error out due to the stream not being
supported in the output container. These data streams generally do
not matter, so tell ffmpeg to ignore them rather than choking on
them.
* Enabled --ignore by default
* Disabled --video-multistreams and --audio-multistreams by default
* Changed default format selection to 'bv*+ba/b' when --audio-multistreams is disabled
* Changed default format sort order to 'res,fps,codec,size,br,asr,proto,ext,has_audio,source,format_id'
* Changed default output template to '%(title)s [%(id)s].%(ext)s'
* Enabled `--list-formats-as-table` by default
* The idea is that it should be possible to negate any boolean option by adding a `no-` to the switch
New: `--no-ignore-dynamic-mpd`, `--no-allow-dynamic-mpd`, `--allow-dynamic-mpd`, `--youtube-include-hls-manifest`, `--no-youtube-include-hls-manifest`, `--no-youtube-skip-hls-manifest`, `--no-download`, `--no-download-archive`, `--resize-buffer`, `--part`, `--mtime`, `--no-keep-fragments`, `--no-cookies`, `--no-write-annotations`, `--no-write-info-json`, `--no-write-description`, `--no-write-thumbnail`, `--youtube-include-dash-manifest`, `--post-overwrites`, `--no-keep-video`, `--no-embed-subs`, `--no-embed-thumbnail`, `--no-add-metadata`, `--no-include-ads`, `--no-write-sub`, `--no-write-auto-sub`, `--no-playlist-reverse`, `--no-restrict-filenames`, `--youtube-include-dash-manifest`, `--no-format-sort-force`, `--flat-videos`, `--no-list-formats-as-table`, `--no-sponskrub`, `--no-sponskrub-cut`, `--no-sponskrub-force`
Renamed: `--write-subs`, --no-write-subs`, `--no-write-auto-subs, `--write-auto-subs`. Note that these can still be used without the ending "s"
* Added options: --video-multistreams, --no-video-multistreams, --audio-multistreams, --no-audio-multistreams
* New format selectors: best*, worst*, bestvideo*, bestaudio*, worstvideo*, worstaudio*
* Added b,w,v,a as alias for best, worst, video and audio respectively in format selection
* Changed video format sorting to show video only files and video+audio files together.
Old Extractors left behind:
VLivePlaylistIE
YoutubeSearchURLIE
YoutubeShowIE
YoutubeFavouritesIE
If removing old extractors, make corresponding changes in
docs/supportedsites.md
youtube_dlc/extractor/extractors.py
Not merged:
.github/ISSUE_TEMPLATE/1_broken_site.md
.github/ISSUE_TEMPLATE/2_site_support_request.md
.github/ISSUE_TEMPLATE/3_site_feature_request.md
.github/ISSUE_TEMPLATE/4_bug_report.md
.github/ISSUE_TEMPLATE/5_feature_request.md
test/test_all_urls.py
youtube_dlc/version.py
Changelog
deobfuscates the video URL using a reverse engineered version of KVS
player's algorithm. This was tested against version 4.0.4, 5.0.1,
5.1.1.4 and 5.2.0.4 of the player and a warning will be issued if the
major version changes.
We already had a few copies of Polymer-style pagination handling logic
for certain circumstances, but now we're forced into using it for all
playlists since we can no longer disable Polymer. Refactor the logic to
move it to the parent class for all entry lists (including e.g. search
results, feeds, and list of playlists), and generify a bit to cover the
child classes' use cases.
Not my changes, but from @franhp that didn't get merged on yt-dl on time
It supports BTCC new pages' schema from 2019 an on (/articles/ instead of /races/)
live_chat_continuation['continuations'][0]['liveChatReplayContinuationData']['continuation'] can not exist.
So catch the KeyError.
Traceback:
$ tubeup 'https://youtube.com/watch?v=JyE9OF03cao'
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dlc version 2020.10.25
[debug] Python version 3.7.3 (CPython) - Linux-5.8.0-0.bpo.2-amd64-x86_64-with-debian-10.6
[debug] exe versions: ffmpeg 3.3.9, ffprobe 3.3.9
[debug] Proxy map: {}
There are no annotations to write.
[download] 452.59KiB at 615.35KiB/s (00:01)ERROR: 'liveChatReplayContinuationData'
Traceback (most recent call last):
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 846, in extract_info
return self.process_ie_result(ie_result, download, extra_info)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 901, in process_ie_result
return self.process_video_result(ie_result, download=download)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1696, in process_video_result
self.process_info(new_info)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1894, in process_info
dl(sub_filename, sub_info, subtitle=True)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1866, in dl
return fd.download(name, info, subtitle)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/downloader/common.py", line 375, in download
return self.real_download(filename, info_dict)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/downloader/youtube_live_chat.py", line 85, in real_download
continuation_id = live_chat_continuation['continuations'][0]['liveChatReplayContinuationData']['continuation']
KeyError: 'liveChatReplayContinuationData'
This can happen if another software is using yt-dlc'API (ie: tubeup).
The stack trace would be:
$ tubeup 'https://youtube.com/watch?v=JyE9OF03cao'
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dlc version 2020.10.25
[debug] Python version 3.7.3 (CPython) - Linux-5.8.0-0.bpo.2-amd64-x86_64-with-debian-10.6
[debug] exe versions: ffmpeg 3.3.9, ffprobe 3.3.9
[debug] Proxy map: {}
There are no annotations to write.
ERROR: '>' not supported between instances of 'NoneType' and 'int'
Traceback (most recent call last):
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 846, in extract_info
return self.process_ie_result(ie_result, download, extra_info)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 901, in process_ie_result
return self.process_video_result(ie_result, download=download)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1696, in process_video_result
self.process_info(new_info)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1894, in process_info
dl(sub_filename, sub_info, subtitle=True)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/YoutubeDL.py", line 1866, in dl
return fd.download(name, info, subtitle)
File "/mnt/data2/Backup/Wiki/.local/lib/python3.7/site-packages/youtube_dlc/downloader/common.py", line 367, in download
if self.params.get('sleep_interval_subtitles') > 0:
TypeError: '>' not supported between instances of 'NoneType' and 'int'