Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recompression #824

Open
wants to merge 15 commits into
base: recompression
Choose a base branch
from
Open

Recompression #824

wants to merge 15 commits into from

Conversation

riknel
Copy link

@riknel riknel commented Aug 7, 2020

Faster and more efficient recompression using previous compression artifacts such as backward references and block splits.

Copy link

@yoavweiss yoavweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few random comments

c/dec/decode.c Outdated Show resolved Hide resolved
c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/dec/state.c Outdated Show resolved Hide resolved
c/enc/block_splitter.c Show resolved Hide resolved
c/enc/cluster_inc.h Outdated Show resolved Hide resolved
c/enc/cluster_inc.h Outdated Show resolved Hide resolved
c/enc/encode.c Outdated Show resolved Hide resolved
Copy link

@yoavweiss yoavweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments that seem to coalesce to a few categories:

  • Style nits
  • Out-of-bound checks - you need to make sure that all your array reads and writes are done within bounds, even if e.g. the input to the program is invalid (e.g. an attacker sends a corrupted brotli file), or, to a lesser extent, if the input to a function is invalid (e.g. a future programmer calls it with the wrong values). The former case needs to be handled gracefully, without crashing. The latter case, needs to be assert() against, at the very least. If this requires work that goes beyond the prototype's scope, leave a TODO to make sure this happens before the code's productization.
  • Magic numbers - they make it hard to read the code and understand what it means. Constants with descriptive names are better
  • Prototype-level shortcuts - No need to fix those IMO, but you do need to call them out (e.g. overly large allocations)

Also, I see a bunch of allocations but no free() to go along with them...

c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/compress_similar_files/compress_similar_files.c Outdated Show resolved Hide resolved
c/enc/hash.h Outdated Show resolved Hide resolved
c/enc/hash.h Show resolved Hide resolved
c/enc/hash_forgetful_chain_inc.h Show resolved Hide resolved
c/enc/hash_longest_match64_inc.h Show resolved Hide resolved
c/enc/hash_longest_match64_inc.h Show resolved Hide resolved
c/common/platform.h Outdated Show resolved Hide resolved
const uint8_t* word = &words->data[offset];
const BrotliTransforms* transforms = BrotliGetTransforms();
/* String can be sometimes longer then copy_len */
uint8_t* string = (uint8_t*)malloc(sizeof(uint8_t) * copy_len * EXCEED_COPY_LEN_RATE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid warnings, you should add
#include <stdlib.h>
at the top of the file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants