Writing compressed output using JSON writer #17323
Approving trivial CMake changes
Plots showing throughput performance of SNAPPY and GZIP compression and decompression in libcudf for JSON inputs. The missing SNAPPY bars are due to failures in nvcomp SNAPPY compression for larger data sizes.
Benchmark used to generate these plots: #17334
This feature needs more tests, but it could be in another PR.
Looks good to me.
Thank you for running these! I now feel more comfortable with moving forward with this PR; I expected even worse performance from device compression 😁
Could we parametrize some of the existing tests to cover different compression formats?
Almost there, just a few details in the `compress_snappy` function to sort out.
Yes, most of the tests in
great stuff; huge increase in test coverage!
/merge
Depends on #17161 for implementations of compression and decompression functions (`io/comp/comp.cu`, `io/comp/comp.hpp`, `io/comp/io_uncomp.hpp` and `io/comp/uncomp.cpp`).
Depends on #17323 for compressed JSON writer implementation.
Adds benchmark to measure performance of the JSON reader for compressed inputs.

Authors:
- Shruti Shivakumar (https://github.com/shrshi)
- Muhammad Haseeb (https://github.com/mhaseeb123)

Approvers:
- MithunR (https://github.com/mythrocks)
- Vukasin Milovanovic (https://github.com/vuule)
- Karthikeyan (https://github.com/karthikeyann)
- Muhammad Haseeb (https://github.com/mhaseeb123)

URL: #17219
Description
Depends on #17161 for implementations of compression and decompression functions (`io/comp/comp.cu`, `io/comp/comp.hpp`, `io/comp/io_uncomp.hpp` and `io/comp/uncomp.cpp`).

Adds support for writing GZIP- and SNAPPY-compressed JSON to the JSON writer.
Verifies correctness using a parameterized test in `tests/io/json/json_writer.cpp`.
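
For readers who want the shape of such a round-trip check without building libcudf, here is a minimal sketch in plain Python. It is not the test from this PR and does not call any libcudf API: `ROWS`, `write_json_lines`, `read_json_lines`, and `roundtrip_ok` are hypothetical names introduced for illustration. Only GZIP and the uncompressed case are covered, because the Python standard library has no SNAPPY codec.

```python
import gzip
import json

# Hypothetical stand-in for a small table: one dict per row.
ROWS = [
    {"id": 1, "name": "a", "score": 0.5},
    {"id": 2, "name": "b", "score": None},
    {"id": 3, "name": "c", "score": 2.25},
]


def write_json_lines(rows, compression="none"):
    """Serialize rows as JSON Lines, then optionally compress the bytes."""
    payload = "".join(json.dumps(row) + "\n" for row in rows).encode("utf-8")
    if compression == "none":
        return payload
    if compression == "gzip":
        return gzip.compress(payload)
    raise ValueError(f"unsupported compression: {compression}")


def read_json_lines(data, compression="none"):
    """Undo write_json_lines: decompress if needed, then parse each line."""
    if compression == "gzip":
        data = gzip.decompress(data)
    elif compression != "none":
        raise ValueError(f"unsupported compression: {compression}")
    return [json.loads(line) for line in data.decode("utf-8").splitlines() if line]


def roundtrip_ok(rows, compression):
    """True when writing and then reading back reproduces the input rows."""
    return read_json_lines(write_json_lines(rows, compression), compression) == rows


# Parameterize the same check over every codec in this sketch.
for codec in ("none", "gzip"):
    assert roundtrip_ok(ROWS, codec), codec
```

The point of parameterizing is that one test body runs for each compression format, so adding a codec to the list extends coverage without duplicating the assertions.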
Checklist