Tags: lsleonard/tiny-data-compression
Improve bit handling input for encoding
1. In tdString.c, moved the inline functions for bit output to td64_internal.h, where they can also be used by functions in td64.c.
2. In td64.c, implemented bit output improvements for encodeAdaptiveTextMode and encodeStringMode.
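The shared bit-output helpers described above might look roughly like the following sketch. The names BitWriter, bwInit, bwWrite and bwFinish are hypothetical and are not taken from td64_internal.h, and the LSB-first bit order is an assumption. Declaring the functions static inline is what lets a single header serve both tdString.c and td64.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit-output state; not the actual td64_internal.h layout. */
typedef struct {
    uint8_t *out;      /* destination buffer */
    uint32_t outPos;   /* index of the next byte to write */
    uint32_t bitBuf;   /* pending bits, oldest bit in the lowest position */
    uint32_t bitCount; /* number of pending bits (below 8 between calls) */
} BitWriter;

static inline void bwInit(BitWriter *bw, uint8_t *out)
{
    bw->out = out;
    bw->outPos = 0;
    bw->bitBuf = 0;
    bw->bitCount = 0;
}

/* Append the low nBits of value (1..16 bits), LSB first (assumed order). */
static inline void bwWrite(BitWriter *bw, uint32_t value, uint32_t nBits)
{
    assert(nBits >= 1 && nBits <= 16);
    bw->bitBuf |= (value & ((1u << nBits) - 1u)) << bw->bitCount;
    bw->bitCount += nBits;
    while (bw->bitCount >= 8) {          /* flush every completed byte */
        bw->out[bw->outPos++] = (uint8_t)bw->bitBuf;
        bw->bitBuf >>= 8;
        bw->bitCount -= 8;
    }
}

/* Flush a trailing partial byte and return the total bytes written. */
static inline uint32_t bwFinish(BitWriter *bw)
{
    if (bw->bitCount > 0) {
        bw->out[bw->outPos++] = (uint8_t)bw->bitBuf;
        bw->bitBuf = 0;
        bw->bitCount = 0;
    }
    return bw->outPos;
}
```

Accumulating bits in a word and flushing whole bytes avoids a read-modify-write of the output buffer on every call, which is one common way to speed up bit output.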
Encode and decode speed improvements
1. In td512.c, modified checkTextMode to return a code that selects extended string mode, which improves compression speed. Also added ' (quote) to the text chars array.
2. In td512.c, modified checktd64 (renamed from checkSingleValueMode) to return a code that causes data expected to be random to be processed as a failing block of 64 bytes, in the same way that td64 would handle this block. This improves compression speed.
3. In td64.c, modified decodeAdaptiveTextMode to read one byte ahead, which speeds up processing of dtbmPeekBits. This change can read one byte beyond the end of the input array. This improves decode speed.
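The read-ahead idea in item 3 can be illustrated with a small bit reader that always keeps at least 8 valid bits buffered, so a peek is only a mask with no bounds check. This is a sketch under assumptions: BitReader, brInit, brPeek and brSkip are hypothetical names, not the dtbmPeekBits code itself, and the LSB-first order is assumed. It also shows why the input buffer needs one readable byte past the end of the data.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reader state; not the layout used in td64.c. */
typedef struct {
    const uint8_t *in; /* input buffer, readable one byte past the data */
    uint32_t nextPos;  /* index of the next byte to load */
    uint32_t window;   /* buffered bits, next bit in the lowest position */
    uint32_t bitCount; /* valid bits in window, kept at 8 or more */
} BitReader;

static inline void brInit(BitReader *br, const uint8_t *in)
{
    /* Load the current byte plus one byte of read-ahead. */
    br->in = in;
    br->window = (uint32_t)in[0] | ((uint32_t)in[1] << 8);
    br->nextPos = 2;
    br->bitCount = 16;
}

/* Return the next nBits (1..8) without consuming them. */
static inline uint32_t brPeek(const BitReader *br, uint32_t nBits)
{
    assert(nBits >= 1 && nBits <= 8);
    return br->window & ((1u << nBits) - 1u);
}

/* Consume nBits (1..8), refilling so that 8 or more bits stay buffered. */
static inline void brSkip(BitReader *br, uint32_t nBits)
{
    assert(nBits >= 1 && nBits <= 8);
    br->window >>= nBits;
    br->bitCount -= nBits;
    if (br->bitCount < 8) {
        /* This load can touch the byte just past the last data byte. */
        br->window |= (uint32_t)br->in[br->nextPos++] << br->bitCount;
        br->bitCount += 8;
    }
}
```

With this scheme a caller would allocate the input with one extra byte of padding, which matches the caveat in item 3 about reading beyond the length of the input array.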
Vary string length in extended string mode
In tdString.c, the definition of string length and the associated number of bits now depend on the number of input values. For <= 64 values, length is 9 and bits are 3. For > 64 values, length is 17 and bits are 4. Longer strings are likely to be found in larger data sets.
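The selection rule above can be sketched as a small helper. StringLimits and stringLimitsFor are hypothetical names used only for illustration; the thresholds and values come from the description above.

```c
#include <stdint.h>

/* Hypothetical container for the two parameters named in the description. */
typedef struct {
    uint32_t stringLength; /* string length: 9 or 17 */
    uint32_t lengthBits;   /* number of bits associated with the length: 3 or 4 */
} StringLimits;

/* Choose string length and bit count from the number of input values. */
static inline StringLimits stringLimitsFor(uint32_t nValues)
{
    StringLimits lim;
    if (nValues <= 64) {
        lim.stringLength = 9;
        lim.lengthBits = 3;
    } else {
        lim.stringLength = 17;
        lim.lengthBits = 4;
    }
    return lim;
}
```

In both cases the length equals 2^bits + 1. That would be consistent with a length field whose smallest encoded value stands for a length of 2, although the description does not state this.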
Version 2 extends compression up to 512 bytes.
1. In td512.c, for 128 or more values, text and extended string modes are called for checked data. For other data, and for any values remaining after calls for 128 or more values, td64 is called. A minimum of 16 characters is compressed.
2. In tdString.c, extended string mode was modified to stop on the 65th unique value. This value is the last value output, and the number of values at that point is returned.
3. In main.c, after decompression, the input file is verified against the decompressed output file.
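The stopping rule in item 2 can be sketched as a scan that tracks which byte values have been seen. This only illustrates the counting behavior, not the encoder in tdString.c: the function name, the MAX_UNIQUES constant and the seen-table approach are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: 64 uniques are allowed, so the 65th unique value ends the scan. */
#define MAX_UNIQUES 64u

/* Return the number of values covered: all nValues, or the count up to and
   including the 65th unique value if one is encountered. */
static uint32_t countValuesThroughUniqueLimit(const uint8_t *in, uint32_t nValues)
{
    uint8_t seen[256];
    uint32_t nUniques = 0;

    memset(seen, 0, sizeof seen);
    for (uint32_t i = 0; i < nValues; i++) {
        if (!seen[in[i]]) {
            seen[in[i]] = 1;
            if (++nUniques > MAX_UNIQUES)
                return i + 1; /* the 65th unique is the last value included */
        }
    }
    return nValues;
}
```

Per item 1, the values remaining after the returned count would presumably be handed to td64.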