Stop using .tar.bz, maybe?? #3503
Comments
What is the size difference between xz and bz2 when creating archives using Python? |
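One way to answer this locally is a minimal sketch along these lines, using only the standard-library `tarfile` module. The payload and the member name `sample.bin` are hypothetical stand-ins, not actual CEF release contents, so the printed sizes say nothing definitive about real binaries.

```python
import io
import tarfile


def archive_size(payload: bytes, mode: str) -> int:
    """Return the size in bytes of an in-memory tar archive holding one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo(name="sample.bin")  # hypothetical member name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return len(buf.getvalue())


# Hypothetical, highly repetitive stand-in data; real release files will
# compress very differently.
payload = b"placeholder line of text\n" * 50_000

sizes = {mode: archive_size(payload, mode) for mode in ("w:bz2", "w:xz")}
for mode, size in sizes.items():
    print(f"{mode}: {size} bytes")
```

Pointing `archive_size` at bytes read from an actual release file would give numbers that are relevant to this issue.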
The compression method of the
Huh, much smaller. Decompression timing:
And much faster. |
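For anyone who wants to reproduce a decompression timing like this, here is a rough sketch under stated assumptions: it uses a synthetic payload and a hypothetical member name, reads everything in memory, and takes a single measurement, so treat the result as indicative only.

```python
import io
import tarfile
import time


def make_archive(payload: bytes, mode: str) -> bytes:
    """Build an in-memory tar archive containing a single member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        info = tarfile.TarInfo(name="sample.bin")  # hypothetical member name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def time_extract(archive: bytes):
    """Read every member back; return (elapsed seconds, bytes read)."""
    start = time.perf_counter()
    total = 0
    # "r:*" lets tarfile detect the compression (bz2, xz, gz) by itself.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:*") as tar:
        for member in tar:
            extracted = tar.extractfile(member)
            if extracted is not None:
                total += len(extracted.read())
    return time.perf_counter() - start, total


# Hypothetical stand-in data, not a real release.
payload = b"placeholder line of text\n" * 200_000

results = {
    mode: time_extract(make_archive(payload, mode))
    for mode in ("w:bz2", "w:xz")
}
for mode, (elapsed, total) in results.items():
    print(f"{mode}: read {total} bytes in {elapsed:.4f} s")
```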
While we are at it, the state of |
Then 7zip is simply better; it was used initially but then rejected for some reason. .tar.bz or .tar.xz will always be slower, since it is not a true "solid" archive: depending on the tools, you have to uncompress the .bz first and then extract the .tar. I expect acceptable interoperability (as a consumer), and tar variations on Windows do not provide that (I have no issues personally, but in general it is not ideal). Plain .zip is still the winner in this sense, with 7zip right after it. 7zip is also used by the Chromium build, so it should already be on board, at least on Windows (it is used for the installer). |
We also need to consider what comes default-installed on most OSes, and what is supported by common tools like CMake and TeamCity. Also related to issue #2446 (symlink support). |
xz comes installed by default on most OSes. With tar the archive is always purely solid, and it has a good encoding story (almost always UTF-8). The two-level decompression is a result of how archive programs are designed on Windows: they are built around showing file contents rather than doing a full streaming extraction. But since tar has no central directory, it takes a full decompression to show the contents anyway. The point is, it's not tar's fault. See also M2Team/NanaZip#138. 7z has a stronger encoding story (mandatory UTF-16) and the option to be selectively solid, but two issues: pre-installation (partially solved by zip is a fragmented mess. No solid support, okay pre-installation. Symlink support is possible via the Info-ZIP extension but does not seem to be present in Python zipfile. |
To clarify, I'm not against .xz; it is virtually the same thing, and it also provides a good compression ratio, which I welcome. Also, Windows 10 has tar (bsdtar) on board, but again it is virtually useless, as it only has gzip support. Because of this, 7zip is the winner, since it is a third-party tool anyway. |
First time hearing this! Interestingly, Ah you know what, let me throw something in the Feedback Hub. No idea if they read it. |
@Artoria2e5 my tar requires bzip2.exe, and it doesn't work because bzip2 is absent. Windows 10 also includes curl. Nice, but it is compiled without zlib/gzip support, so it can't download a compressed deflate stream. And I'm using standalone curl anyway. Agreed that it is kind of strange. :) |
Huh, Microsoft is now making the built-in bsdtar the basis of a new feature, it seems. https://www.bleepingcomputer.com/news/microsoft/windows-11-getting-native-support-for-7-zip-rar-and-gz-archives/ I got "working on it" tagged in the Feedback Hub, so they are putting some work in it. |
I support this. |
Describe the bug
The current releases use CEF_ARCHIVE_FORMAT set to tarbz. This is extremely slow to decompress. Bzip2 unpacks slower than xz and does not even compress better.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
We could really use xz to get at least double the decompression speed. Or even zstd, at the cost of worse compression. These two are extremely widespread.
Screenshots
Versions (please complete the following information):
Additional context
Python tarfile has xz support since 3.3. You don't even need to get an external program!
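As a small illustration of that point, the sketch below writes a .tar.xz to disk and lists it again with nothing but the standard library. The directory and file names (`demo_release`, `README.txt`) are hypothetical, and this is not the project's actual packaging script.

```python
import pathlib
import tarfile
import tempfile


def pack_and_list(src_dir: pathlib.Path, archive: pathlib.Path) -> list:
    """Create an xz-compressed tarball of src_dir and return its member names."""
    with tarfile.open(archive, "w:xz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    with tarfile.open(archive, "r:xz") as tar:
        return tar.getnames()


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    src = root / "demo_release"  # hypothetical directory name
    src.mkdir()
    (src / "README.txt").write_text("placeholder\n")
    names = pack_and_list(src, root / "demo_release.tar.xz")
    print(names)
```

No `xz` or `7z` binary is invoked here; compression is handled by Python's bundled `lzma` module, which `tarfile` uses for the `xz` modes.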