Kala Compress

This project is based on Apache Commons Compress. Kala Compress improves on it in several ways, including modularization (JPMS support) and NIO2 Path API support.

Another important improvement is that Kala Compress does not depend on libraries such as commons-io and commons-lang3. Its core jar has no dependencies and is less than 90KiB in size, which makes it suitable for size-sensitive programs.

Its API is mostly consistent with Apache Commons Compress, with a few incompatibilities. For that reason I renamed the package (and the module name) from org.apache.commons.compress to kala.compress, so it can coexist with Apache Commons Compress without conflict.

We assume that you already know about Commons Compress. If not, please refer to the User Guide first.

To add Kala Compress as a dependency, see section Modules.

Different from Apache Commons Compress

Modularization (JPMS Support)

Kala Compress has been fully modularized and fully supports the JPMS (Java Platform Module System).

Each compressor and archiver is split into a separate artifact with its own module name, so you can depend on only the ones you need without importing all of Kala Compress. (The Kala Compress core jar is less than 90KiB in size.)

ArchiveStreamFactory and CompressorStreamFactory have been refactored internally so that they no longer have hard dependencies on all compressors and archivers, but instead look them up dynamically at runtime.
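As a rough illustration of what a runtime lookup without hard dependencies can look like, here is a JDK-only sketch that probes for an optional class by name. It is not Kala Compress's actual implementation. The class OptionalCodecLookup and the probed class names are hypothetical and chosen only for the example.

```java
import java.util.Optional;

// Illustration only: probing for an optional class at runtime instead of
// referencing it directly at compile time. Not Kala Compress's own code.
final class OptionalCodecLookup {
    private OptionalCodecLookup() {}

    // Returns the class if it can be loaded, or an empty Optional if it is
    // missing from the classpath/module path or cannot be linked.
    static Optional<Class<?>> find(String className) {
        try {
            ClassLoader loader = OptionalCodecLookup.class.getClassLoader();
            // initialize=false: only check availability, do not run static initializers
            return Optional.of(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A JDK class that is always present, and a made-up name that is not.
        System.out.println(find("java.util.zip.GZIPInputStream").isPresent());
        System.out.println(find("example.missing.Codec").isPresent());
    }
}
```

The point of the pattern is that a missing optional component shows up as an absent result at lookup time rather than as a link error when the factory class itself is loaded.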

Each module provides its module-info.class, so it can work well with jlink.

For more information about the Kala Compress modules, see Modules.

Charset

Kala Compress has been completely refactored internally to use java.nio.charset.Charset to represent encodings. All methods that previously accepted an encoding as a String now accept a Charset. If you represent the encoding as a String, use kala.compress.utils.Charsets.toCharset(String) to convert it to a Charset.

ZipEncoding has been removed; please switch to Charset.

CharsetNames has been removed; please switch to StandardCharsets.

Kala Compress no longer uses Charset.defaultCharset(); it uses UTF-8 instead. Note that file.encoding defaults to UTF-8 since Java 18. If you want the platform's native encoding, request it explicitly with kala.compress.utils.Charsets.nativeCharset().

In addition, APIs that accept an encoding represented by a String no longer fall back to the default character set when the encoding is unsupported or invalid. They now throw exceptions, just like Charset.forName. (The behavior when null is passed in is not affected: it still falls back to UTF-8.)
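The rules above can be sketched with the JDK alone. The helper below is a hypothetical stand-in written for illustration, not the source of kala.compress.utils.Charsets.toCharset(String); it only mirrors the behavior described here (null falls back to UTF-8, anything else is resolved strictly).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// JDK-only sketch of the described charset rules; not Kala Compress's own code.
final class CharsetRules {
    private CharsetRules() {}

    // Hypothetical stand-in for the described conversion:
    // null -> UTF-8; otherwise defer to Charset.forName, which throws
    // IllegalCharsetNameException or UnsupportedCharsetException
    // (both are IllegalArgumentException subclasses) on bad input.
    static Charset toCharset(String name) {
        return name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
    }

    public static void main(String[] args) {
        System.out.println(toCharset(null));
        System.out.println(toCharset("ISO-8859-1"));
        try {
            toCharset("no-such-charset");
        } catch (IllegalArgumentException e) {
            // No silent fallback to a default charset.
            System.out.println("rejected: " + e);
        }
    }
}
```

Code that relied on the old silent fallback should be prepared to handle the exception, or validate the name up front with Charset.isSupported.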

NIO2 Support

Most of the java.io.File-based APIs in commons-compress have been removed; please use the java.nio.file.Path-based APIs instead.
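If existing code still holds java.io.File values, the JDK already provides the conversion needed at the call site. A minimal sketch, where the class name FileToPath and the sample path are made up for the example:

```java
import java.io.File;
import java.nio.file.Path;

// Converting legacy File values to Path before calling a Path-based API.
final class FileToPath {
    private FileToPath() {}

    // File.toPath() is the standard bridge from java.io to java.nio.file.
    static Path toPath(File legacy) {
        return legacy.toPath();
    }

    public static void main(String[] args) {
        File legacy = new File("archives/sample.zip"); // hypothetical location
        Path path = toPath(legacy);
        System.out.println(path);
        // Path.toFile() goes the other way when an older API still needs a File.
        System.out.println(path.toFile().equals(legacy));
    }
}
```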

Rename

ZipFile has been renamed to ZipArchiveReader.

TarFile has been renamed to TarArchiveReader.

SevenZFile and SevenZOutputFile have been renamed to SevenZArchiveReader and SevenZArchiveWriter.

The reason for this is that I want to reserve names like [Archive]File for a more full-featured support class in the future. It should be able to support both reading and writing archives, adding or deleting entries, etc.

Deprecation and removal

Most deprecated APIs in Apache Commons Compress have been removed.

Unlike commons-compress, the constructors of ZipArchiveReader/SevenZArchiveReader are not deprecated, so there is no need to use lengthy builder syntax for simple requirements.

Additional support for OSGi is no longer provided, but this shouldn't make a big difference.

ZipEncoding and CharsetNames have been removed; please switch to Charset and StandardCharsets.

All methods that accept an encoding represented by a String have been removed; please use Charset instead.

All methods that accept java.util.Date have been removed; please use java.nio.file.attribute.FileTime instead.
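For code that still produces java.util.Date values, the conversion to FileTime is a one-liner with the JDK. A minimal sketch, with the class name DateToFileTime invented for the example:

```java
import java.nio.file.attribute.FileTime;
import java.util.Date;

// Bridging java.util.Date and java.nio.file.attribute.FileTime.
final class DateToFileTime {
    private DateToFileTime() {}

    // Date has millisecond precision, so fromMillis loses nothing.
    static FileTime toFileTime(Date date) {
        return FileTime.fromMillis(date.getTime());
    }

    // The reverse direction truncates a FileTime to milliseconds.
    static Date toDate(FileTime time) {
        return new Date(time.toMillis());
    }

    public static void main(String[] args) {
        Date legacy = new Date(1_700_000_000_000L);
        FileTime converted = toFileTime(legacy);
        System.out.println(converted);
        System.out.println(toDate(converted).equals(legacy));
    }
}
```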

Since the Security Manager will be removed from the JDK in the future, Kala Compress no longer uses it. For more details, see JEP 411: Deprecate the Security Manager for Removal.

Since the finalize method will be removed from the JDK in the future, Kala Compress no longer uses it to clean up resources. For more details, see JEP 421: Deprecate Finalization for Removal. The archiveName parameter of the ZipFile constructor was only used for error reporting in finalize, so it has been removed as well.

Most methods that accept a File have been removed; please use Path instead.

Modules

Note: Kala Compress is in beta phase. Although it is developed based on mature Apache Commons Compress and has passed all tests, it may still be unstable. I may need to make some adjustments to the API before releasing to production.

The latest Kala Compress version is 1.27.1-1.

You can add dependencies on Kala Compress modules as follows:

Maven:

<dependency>
  <groupId>org.glavo.kala</groupId>
  <artifactId>${kala-compress-module-name}</artifactId>
  <version>1.27.1-1</version>
</dependency>

Gradle:

dependencies {
  implementation("org.glavo.kala:${kala-compress-module-name}:1.27.1-1")
}

All Kala Compress modules are listed below.

This is an empty module that declares transitive dependencies on all modules of Kala Compress. Adding a dependency on it alone gives you everything in Kala Compress.

It is the basic module of Kala Compress, and all other modules depend on it.

It contains the following packages:

  • (package) kala.compress
  • (package) kala.compress.archivers
  • (package) kala.compress.compressors
  • (package) kala.compress.compressors.lz77support
  • (package) kala.compress.compressors.lzw
  • (package) kala.compress.compressors.parallel
  • (package) kala.compress.compressors.utils

It is an empty module that contains transitive dependencies on all compressor modules. You can include all compressors by adding a dependency on it.

In addition, each compressor in Kala Compress has a separate module, and you can add dependencies on one or all of them separately. Here is a list of compressors:

Here are some notes:

  • Unlike Apache Commons Compress, the brotli compressor has no external dependencies. It copies the Google Brotli code into the package kala.compress.compressors.brotli.dec, because Google Brotli does not support JPMS.
  • The lzma compressor and the xz compressor need XZ for Java to work.
  • The zstandard compressor needs Zstd JNI to work.

It is an empty module that contains transitive dependencies on all archiver modules. You can include all archivers by adding a dependency on it.

In addition, each archiver in Kala Compress has a separate module, and you can add dependencies on one or all of them separately. Here is a list of archivers:

Here are some notes:

  • The sevenz archiver needs XZ for Java to work.
  • The sevenz archiver and the zip archiver have optional dependencies on the bzip2 compressor and the deflate64 compressor. They work without these compressors, but an error occurs when an archive actually requires one of them.
  • Support for jar (in package kala.compress.archivers.jar) is in the module kala.compress.archivers.zip.

It contains the package kala.compress.changes.

It contains the package kala.compress.archivers.examples.

Bug Report

If you encounter problems using it, please open an issue.

If it's an issue upstream of Apache Commons Compress, it's best to give feedback here and I'll port the upstream fix here.