
A fast, lightweight C library for reading ZIM archive files. ZIM is the format used by Kiwix to store offline web content such as Wikipedia, Stack Exchange, and more.
czim provides a C11 API for reading ZIM archives with support for:
- Streaming decompression — Incremental Zstandard and LZMA decompression for efficient memory usage
- Path and title lookup — Binary search with NarrowDown sparse index for fast O(log n) entry retrieval
- Prefix search — Range-based prefix matching for both paths and titles
- Redirect resolution — Automatic redirect chain following
- Blob access — On-demand blob decompression with LRU cluster caching
- Metadata and illustrations — Access to archive metadata, favicon, and illustration images
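
The sparse-index lookup named in the list above can be illustrated with a small, generic sketch. It is not czim's internal code: the `sparse_index` type and `sparse_index_find` function are hypothetical names, and the full key list is held in memory here, whereas a real reader would fetch those keys from the archive. The idea is to binary search a small sample of every `stride`-th key first, then binary search only the block that sample points to.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: sorted keys plus a sparse sample of every stride-th key. */
typedef struct {
    const char **keys;    /* full sorted key list (on disk in a real reader) */
    size_t count;
    const char **samples; /* samples[i] == keys[i * stride] */
    size_t sample_count;
    size_t stride;
} sparse_index;

/* Returns the index of `key` in idx->keys, or -1 if it is absent. */
long sparse_index_find(const sparse_index *idx, const char *key) {
    if (idx->count == 0 || idx->sample_count == 0) return -1;

    /* Step 1: find the last sample <= key (upper bound minus one). */
    size_t lo = 0, hi = idx->sample_count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(idx->samples[mid], key) <= 0) lo = mid + 1;
        else hi = mid;
    }
    if (lo == 0) return -1; /* key sorts before the first entry */

    /* Step 2: binary search only inside the block that sample covers. */
    size_t begin = (lo - 1) * idx->stride;
    size_t end = begin + idx->stride;
    if (end > idx->count) end = idx->count;
    while (begin < end) {
        size_t mid = begin + (end - begin) / 2;
        int cmp = strcmp(idx->keys[mid], key);
        if (cmp == 0) return (long)mid;
        if (cmp < 0) begin = mid + 1;
        else end = mid;
    }
    return -1;
}
```

Both steps are logarithmic, and only the second one touches the full key list, which is what keeps the number of on-disk reads small in a design like this.
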
# Clone with submodules
git clone --recursive https://github.com/unidict/czim.git
cd czim
# Configure
cmake -B build -DCZIM_BUILD_TESTS=ON
# Build
cmake --build build --config Release
# Test
ctest --test-dir build --verbose
# Install
cmake --install build --prefix /usr/local

| Option | Default | Description |
| --- | --- | --- |
| CZIM_BUILD_TESTS | ON | Build test suite |
| BUILD_SHARED_LIBS | OFF | Build shared library |

#include "czim_archive.h"
#include "czim_archive_ext.h"
#include <stdint.h>
#include <stdio.h>

int main(void) {
    // Open archive
    czim_archive *archive = czim_archive_open("wikipedia.zim");
    if (!archive) return 1;

    // Query properties
    printf("Entries: %u\n", czim_archive_entry_count(archive));
    printf("Articles: %u\n", czim_archive_article_count(archive));

    // Find entry by path
    uint32_t index;
    czim_entry *entry = czim_archive_find_entry_by_path(archive, 'C', "African_Americans", &index);
    if (entry) {
        printf("Found: %s\n", czim_entry_get_path(entry));

        // Read blob data
        if (!czim_entry_is_redirect(entry)) {
            czim_blob blob = czim_archive_get_blob(archive, entry);
            printf("Size: %zu bytes\n", blob.size);
            czim_blob_free(&blob);
        }
        czim_entry_free(entry);
    }

    czim_archive_close(archive);
    return 0;
}

| Function | Description |
| --- | --- |
| czim_archive_open() | Open a ZIM archive |
| czim_archive_close() | Close an archive |


| Function | Description |
| --- | --- |
| czim_archive_uuid() | Get archive UUID |
| czim_archive_entry_count() | Total number of entries |
| czim_archive_article_count() | Number of front articles |
| czim_archive_cluster_count() | Number of clusters |
| czim_archive_has_main_entry() | Check if main page is set |
| czim_archive_has_title_index() | Check if title index exists |


| Function | Description |
| --- | --- |
| czim_archive_get_entry_by_index() | Get entry by path index |
| czim_archive_find_entry_by_path() | Find entry by path + namespace |
| czim_archive_find_entry_by_title() | Find entry by title + namespace |
| czim_archive_resolve_redirect() | Follow redirect chain |
| czim_archive_find_metadata() | Find metadata entry by name |

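
Following a redirect chain needs a guard against cycles and dangling targets. The sketch below shows one common way to do that over an in-memory table of redirect targets. The `resolve_redirect` function, the `NO_REDIRECT` marker, and the `max_hops` parameter are all hypothetical; this section does not say how czim_archive_resolve_redirect() bounds its chain internally.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_REDIRECT UINT32_MAX /* marks a content entry in this sketch */

/* Follows targets[] from `start` until a content entry is reached.
 * Returns 0 and stores the final index in *out on success, or -1 if
 * the chain leaves the table or needs more than max_hops hops. */
int resolve_redirect(const uint32_t *targets, size_t count,
                     uint32_t start, unsigned max_hops, uint32_t *out) {
    uint32_t cur = start;
    for (unsigned hops = 0; hops <= max_hops; hops++) {
        if (cur >= count) return -1;       /* dangling redirect */
        if (targets[cur] == NO_REDIRECT) { /* reached content */
            *out = cur;
            return 0;
        }
        cur = targets[cur];
    }
    return -1; /* too many hops: treat as a redirect loop */
}
```

A fixed hop limit is simpler than tracking visited entries and is enough to stop a two-entry loop such as A → B → A.
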

| Function | Description |
| --- | --- |
| czim_archive_find_entry_by_path_prefix() | Find entry range by path prefix |
| czim_archive_find_entry_by_title_prefix() | Find entry range by title prefix |

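
Because entries are sorted, a prefix query reduces to two binary searches that bracket the matching range. The following is a minimal sketch of that technique over a plain array of strings. `prefix_bound` and `prefix_range` are hypothetical helpers and do not mirror the signatures of the czim functions listed above.

```c
#include <stddef.h>
#include <string.h>

/* First index in sorted keys[] whose key is >= prefix; when `past` is
 * nonzero, keys that start with prefix are skipped too, giving the end. */
static size_t prefix_bound(const char **keys, size_t count,
                           const char *prefix, int past) {
    size_t plen = strlen(prefix), lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strncmp(keys[mid], prefix, plen);
        if (cmp < 0 || (past && cmp == 0)) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

/* Stores the half-open range [*begin, *end) of keys starting with prefix
 * and returns the number of matches. */
size_t prefix_range(const char **keys, size_t count, const char *prefix,
                    size_t *begin, size_t *end) {
    *begin = prefix_bound(keys, count, prefix, 0);
    *end = prefix_bound(keys, count, prefix, 1);
    return *end - *begin;
}
```

Returning a half-open index range rather than a list of matches lets the caller iterate lazily and load only the entries it actually needs.
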

| Function | Description |
| --- | --- |
| czim_archive_get_blob() | Get blob data for a content entry |

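
Blob reads benefit from caching because many blobs share one compressed cluster. As an illustration of the LRU idea only, here is a small fixed-size cache keyed by cluster number. The `cluster_cache` type, the `CACHE_SLOTS` capacity, and both functions are assumptions made for this sketch and say nothing about how czim sizes or organizes its own cache.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 4 /* hypothetical capacity */

typedef struct {
    uint32_t cluster;   /* cluster number held in this slot */
    void *data;         /* decompressed bytes, owned by the cache */
    uint64_t last_used; /* 0 means the slot is empty */
} cache_slot;

typedef struct {
    cache_slot slots[CACHE_SLOTS];
    uint64_t clock;     /* incremented on every hit or insert */
} cluster_cache;

/* Returns the cached data for `cluster`, or NULL on a miss. */
void *cache_get(cluster_cache *c, uint32_t cluster) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        cache_slot *s = &c->slots[i];
        if (s->last_used != 0 && s->cluster == cluster) {
            s->last_used = ++c->clock; /* mark as most recently used */
            return s->data;
        }
    }
    return NULL;
}

/* Stores malloc'd `data` (ownership passes to the cache), evicting the
 * least recently used slot when full. Call cache_get() first. */
void cache_put(cluster_cache *c, uint32_t cluster, void *data) {
    cache_slot *victim = &c->slots[0];
    for (int i = 1; i < CACHE_SLOTS; i++) {
        if (c->slots[i].last_used < victim->last_used) victim = &c->slots[i];
    }
    free(victim->data); /* free(NULL) is a no-op for empty slots */
    victim->cluster = cluster;
    victim->data = data;
    victim->last_used = ++c->clock;
}
```

A zero-initialized `cluster_cache` is an empty cache. The linear scan is adequate for a handful of slots; a larger cache would typically pair a hash table with a linked list.
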
See czim_archive.h and czim_archive_ext.h for the complete API documentation.

| Platform | Status |
| --- | --- |
| macOS | Tested |
| Linux | Tested (CI) |
| Windows | Tested (CI) |

MIT License. See LICENSE for details.