-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Question: Verifying we can safely reuse the context? #33
Comments
You can reuse a context (i.e. Compressor/Decomressor instance) with a dictionary as well as context without a dictionary, while respecting the condition that only one thread accesses one context at a time. Actually, we use small amount of [ThreadStatic]Compressor instances with dict under fairly heavy loads in production for mass bulk processing, and it works well. The only thing I think is important to note, even though the Zstd documentation says "re-using context is just a speed / resource optimization, it doesn't change the compression ratio, which remains identical", strictly speaking, the documentation doesn't explicitly say whether there might be side channels of information leakage between compressions/decompressions with single context. So if you are using compression of some sort of user data, I would recommend clarifying this point with the authors of Zstd. Or reuse the context only within signle user as a measure of defense-in-depth technique |
Thanks, we deployed this pooling approach and it seems to hold up well! 😄 Unfortunately I don't have time at the moment to extract out a minimal repro, but as a heads up we did run into an aggressive unmanaged memory leak when compressing with a dictionary following this approach every request (e.g. no pooling), and it happened at very low rps (1-5). We aren't able to rule out fragmentation, and we tried to dig into the native code but weren't able to identify what could be causing a leak even with the suboptimal approach. We weren't able to repro it on our dev machines even at much higher rps (native on macOS Apple Silicon), but we observed it running containerized on our Intel production machines. In the container we're building libzstd 1.5.6 from source and ensuring ZstdNet uses it. Not sure if others have run into this / perhaps you have some theories on what could cause it. For now though the pooling approach works well to bound our memory usage and handle this more efficiently overall, but we were pretty stumped when trying to narrow down what could have caused it. |
@neiljohari CompressionOptions class is thread-safe. Do you reuse instance of this class? |
Hello, thanks for the great library!
We wanted to verify we aren't running into any UB here.
We are using an ObjectPool of compressors like in this comment #31 (comment) with a small tweak that we're using a custom PoolObjectPolicy and passing in a dictionary:
We then call
Wrap
multiple times serially on the same compressor.The question is whether there's any sort of reset on the context necessary? Following all the
Wrap
calls, this eventually calls:The docs seem to say you can reuse a
ZSTD_CDict*
with no issue in the bulk processing dictionary API + in the beginning it says we can reuse a context multiple times:Just making sure this is an accurate understanding of how the lib is meant to be used?
The text was updated successfully, but these errors were encountered: