anthony-zy
Contributor

This PR adds RVV support to optimize the Memcopy64 function, improving decompression speed by ~46%.

Optimize Snappy 1.2.2 performance

Added RVV support for RISC-V in Snappy, optimizing Memcopy64 by leveraging RVV vector load/store instructions (e.g., vle8_v_u8m2, vse8_v_u8m2) to reduce memory copy overhead and improve decompression performance. lzbench 2.1 tests on silesia.tar (GCC 13.2.1, 64-bit Linux) show:
| Compressor | Compress | Decompress | Size | Ratio |
| --- | --- | --- | --- | --- |
| Snappy 1.2.2 (Before) | 64.3 MB/s | 186 MB/s | 101403263 | 47.84% |
| Snappy 1.2.2 (After) | 64.4 MB/s | 272 MB/s | 101403263 | 47.84% |
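
For reviewers unfamiliar with RVV, the vector load/store approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Copy64Sketch` is a hypothetical name, its signature is an assumption, and it assumes non-overlapping buffers and a toolchain that provides the `__riscv_`-prefixed intrinsics in `<riscv_vector.h>`. On targets without the vector extension it falls back to `std::memcpy`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// Hypothetical helper (not the PR's actual function): copies exactly 64 bytes
// from src to dst. The two buffers must not overlap.
inline void Copy64Sketch(const void* src, void* dst) {
#if defined(__riscv_vector)
  const uint8_t* s = static_cast<const uint8_t*>(src);
  uint8_t* d = static_cast<uint8_t*>(dst);
  size_t remaining = 64;
  // With LMUL=2 and VLEN=128 this loop runs twice (32 bytes per pass);
  // with VLEN>=256 a single pass covers all 64 bytes.
  while (remaining > 0) {
    size_t vl = __riscv_vsetvl_e8m2(remaining);  // bytes handled this pass
    vuint8m2_t v = __riscv_vle8_v_u8m2(s, vl);   // vector load
    __riscv_vse8_v_u8m2(d, v, vl);               // vector store
    s += vl;
    d += vl;
    remaining -= vl;
  }
#else
  // Portable fallback when the vector extension is not enabled.
  std::memcpy(dst, src, 64);
#endif
}
```

Keeping the whole RVV path behind a single `__riscv_vector` guard means other architectures compile exactly the fallback branch and nothing else.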

Snappy 1.2.2 unittest ([ PASSED ] 21 tests.)

Running main() from gmock_main.cc
[==========] Running 21 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CorruptedTest
[ RUN ] CorruptedTest.VerifyCorrupted
Crazy decompression lengths not checked on 64-bit build
[ OK ] CorruptedTest.VerifyCorrupted (5 ms)
[----------] 1 test from CorruptedTest (5 ms total)

[----------] 17 tests from Snappy
[ RUN ] Snappy.SimpleTests
[ OK ] Snappy.SimpleTests (22 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCases
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCases (3 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive (4830 ms)
[ RUN ] Snappy.MaxBlowup
[ OK ] Snappy.MaxBlowup (11 ms)
[ DISABLED ] Snappy.DISABLED_MoreThan4GB
[ RUN ] Snappy.RandomData
[ OK ] Snappy.RandomData (20311 ms)
[ RUN ] Snappy.FourByteOffset
[ OK ] Snappy.FourByteOffset (1 ms)
[ RUN ] Snappy.IOVecSourceEdgeCases
[ OK ] Snappy.IOVecSourceEdgeCases (0 ms)
[ RUN ] Snappy.IOVecSinkEdgeCases
[ OK ] Snappy.IOVecSinkEdgeCases (0 ms)
[ RUN ] Snappy.IOVecLiteralOverflow
[ OK ] Snappy.IOVecLiteralOverflow (0 ms)
[ RUN ] Snappy.IOVecCopyOverflow
[ OK ] Snappy.IOVecCopyOverflow (0 ms)
[ RUN ] Snappy.ReadPastEndOfBuffer
[ OK ] Snappy.ReadPastEndOfBuffer (0 ms)
[ RUN ] Snappy.ZeroOffsetCopy
[ OK ] Snappy.ZeroOffsetCopy (0 ms)
[ RUN ] Snappy.ZeroOffsetCopyValidation
[ OK ] Snappy.ZeroOffsetCopyValidation (0 ms)
[ RUN ] Snappy.FindMatchLength
[ OK ] Snappy.FindMatchLength (0 ms)
[ RUN ] Snappy.FindMatchLengthRandom
[ OK ] Snappy.FindMatchLengthRandom (1145 ms)
[ RUN ] Snappy.VerifyCharTable
[ OK ] Snappy.VerifyCharTable (0 ms)
[ RUN ] Snappy.TestBenchmarkFiles
[ OK ] Snappy.TestBenchmarkFiles (406 ms)
[----------] 17 tests from Snappy (26734 ms total)

[----------] 3 tests from SnappyCorruption
[ RUN ] SnappyCorruption.TruncatedVarint
[ OK ] SnappyCorruption.TruncatedVarint (0 ms)
[ RUN ] SnappyCorruption.UnterminatedVarint
[ OK ] SnappyCorruption.UnterminatedVarint (0 ms)
[ RUN ] SnappyCorruption.OverflowingVarint
[ OK ] SnappyCorruption.OverflowingVarint (0 ms)
[----------] 3 tests from SnappyCorruption (0 ms total)

[----------] Global test environment tear-down
[==========] 21 tests from 3 test suites ran. (26740 ms total)
[ PASSED ] 21 tests.

YOU HAVE 1 DISABLED TEST
[ PASSED ] 21 tests.

@anthony-zy
Contributor Author

@danilak-G
I've revised the code, reducing branches and optimizing preprocessor directives for easier review. Could you check the current version when possible?
Also, in Snappy's maintenance mode, what's the review timeline and what changes are typically accepted? Thanks!
