Commit af68570

Update README.md
1 parent 424787a commit af68570

1 file changed: +8 -2 lines changed

README.md

Lines changed: 8 additions & 2 deletions
@@ -1,7 +1,7 @@
 [![PyPI version](https://img.shields.io/pypi/v/pyfastpfor.svg)](https://pypi.python.org/pypi/pyfastpfor/)
 [![Build Status](https://travis-ci.org/searchivarius/PyFastPFor.svg?branch=master)](https://travis-ci.org/searchivarius/PyFastPFor)
 # PyFastPFor
-Python bindings for the fast integer compression library [FastPFor](https://github.com/lemire/FastPFor): A research library with integer compression schemes. FastPFor is broadly applicable to the compression of arrays of 32-bit integers where most integers are small. The library seeks to exploit SIMD instructions (SSE) whenever possible. This library can decode at least 4 billions of compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.
+Python bindings for the fast **light-weight** integer compression library [FastPFor](https://github.com/lemire/FastPFor): A research library with integer compression schemes. FastPFor is broadly applicable to the compression of arrays of 32-bit integers where most integers are small. The library seeks to exploit SIMD instructions (SSE) whenever possible. This library can decode at least 4 billions of compressed integers per second on most desktop or laptop processors. That is, it can decompress data at a rate of 15 GB/s. This is significantly faster than generic codecs like gzip, LZO, Snappy or LZ4.
 
 # Authors
 
@@ -22,4 +22,10 @@ Due to some compilation quirks this currently seem to work with GCC only. I will
 
 # Documentation
 
-The library supports all the codecs implemented in the original [FastPFor](https://github.com/lemire/FastPFor) library by Feb 2018) as well as two types of data differencing approaches. The library compresses well only small integers. A common trick to deal with large numbers is to sort them and subsequently to encode the differences. Examples of three common use scenarios (no differencing, coarse and fine deltas) are outlined in [this Python notebook](python_bindings/examples.ipynb). To get a list of codecs, use the function ``getCodecList``.
+The library supports all the codecs implemented in the original [FastPFor](https://github.com/lemire/FastPFor) library by Feb 2018). To get a list of codecs, use the function ``getCodecList``. Typical light-weight compression works well only for small integers. When integers are large, data differencing is a common trick to make them small. In particular, we often deal with sorted lists of integers, which can be represented by differences by neighboring numbers.
+
+The smallest differences (**fine** deltas) are between adjacent numbers. Respective differencing and difference inverting functions are ``delta1'' and ``prefixSum1''.
+
+However, we can do reasonably well, we compute differences between numbers that are four positions apart (**coarse** deltas). Such differences can be computed and inverted more efficiently. Respective differencing and difference inverting functions are ``delta4'' and ``prefixSum4''.
+
+Examples of three common use scenarios (no differencing, coarse and fine deltas) are outlined in [this Python notebook](python_bindings/examples.ipynb).
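
As a rough illustration of the fine and coarse deltas described in the added README text, here is a minimal pure-Python sketch. It does not call pyfastpfor. The helpers `delta` and `prefix_sum` and their `stride` parameter are hypothetical stand-ins for the idea behind `delta1`/`prefixSum1` (stride 1) and `delta4`/`prefixSum4` (stride 4). Leaving the first `stride` elements unchanged is an assumption of this sketch, not something the README states.

```python
def delta(values, stride=1):
    """Hypothetical helper: difference each element with the one `stride` positions back.

    Assumption: the first `stride` elements are kept as-is.
    """
    return [v - values[i - stride] if i >= stride else v
            for i, v in enumerate(values)]


def prefix_sum(deltas, stride=1):
    """Hypothetical helper: invert `delta` by accumulating with the same stride."""
    out = list(deltas)
    for i in range(stride, len(out)):
        out[i] += out[i - stride]
    return out


# A sorted list of fairly large integers (made-up sample data).
sorted_ids = [100, 103, 104, 110, 125, 126, 130, 141]

fine = delta(sorted_ids, stride=1)    # differences between adjacent numbers
coarse = delta(sorted_ids, stride=4)  # differences between numbers four positions apart

# Both transforms are lossless: the prefix sum restores the original list.
assert prefix_sum(fine, stride=1) == sorted_ids
assert prefix_sum(coarse, stride=4) == sorted_ids
```

In this sample the fine deltas after the first element are the smallest values, while the coarse deltas are somewhat larger but depend only on elements four positions back, which is what lets four lanes be processed independently.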
