feat(RISC-V): Add RVV-optimized implementation for memcopy64 #213
+95
−12
This PR adds an RVV implementation of the MemCopy64 function, improving compression speed by ~46%.
Optimize Snappy 1.2.2 performance
Added RVV support for RISC-V in Snappy, optimizing MemCopy64 with RVV vector load/store intrinsics (e.g., vle8_v_u8m2, vse8_v_u8m2) to reduce memory copy overhead and improve decompression performance. lzbench 2.1 tests on silesia.tar (GCC 13.2.1, 64-bit Linux) show:
Snappy 1.2.2 unittest ([ PASSED ] 21 tests.)
Running main() from gmock_main.cc
[==========] Running 21 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CorruptedTest
[ RUN ] CorruptedTest.VerifyCorrupted
Crazy decompression lengths not checked on 64-bit build
[ OK ] CorruptedTest.VerifyCorrupted (5 ms)
[----------] 1 test from CorruptedTest (5 ms total)
[----------] 17 tests from Snappy
[ RUN ] Snappy.SimpleTests
[ OK ] Snappy.SimpleTests (22 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCases
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCases (3 ms)
[ RUN ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive
[ OK ] Snappy.AppendSelfPatternExtensionEdgeCasesExhaustive (4830 ms)
[ RUN ] Snappy.MaxBlowup
[ OK ] Snappy.MaxBlowup (11 ms)
[ DISABLED ] Snappy.DISABLED_MoreThan4GB
[ RUN ] Snappy.RandomData
[ OK ] Snappy.RandomData (20311 ms)
[ RUN ] Snappy.FourByteOffset
[ OK ] Snappy.FourByteOffset (1 ms)
[ RUN ] Snappy.IOVecSourceEdgeCases
[ OK ] Snappy.IOVecSourceEdgeCases (0 ms)
[ RUN ] Snappy.IOVecSinkEdgeCases
[ OK ] Snappy.IOVecSinkEdgeCases (0 ms)
[ RUN ] Snappy.IOVecLiteralOverflow
[ OK ] Snappy.IOVecLiteralOverflow (0 ms)
[ RUN ] Snappy.IOVecCopyOverflow
[ OK ] Snappy.IOVecCopyOverflow (0 ms)
[ RUN ] Snappy.ReadPastEndOfBuffer
[ OK ] Snappy.ReadPastEndOfBuffer (0 ms)
[ RUN ] Snappy.ZeroOffsetCopy
[ OK ] Snappy.ZeroOffsetCopy (0 ms)
[ RUN ] Snappy.ZeroOffsetCopyValidation
[ OK ] Snappy.ZeroOffsetCopyValidation (0 ms)
[ RUN ] Snappy.FindMatchLength
[ OK ] Snappy.FindMatchLength (0 ms)
[ RUN ] Snappy.FindMatchLengthRandom
[ OK ] Snappy.FindMatchLengthRandom (1145 ms)
[ RUN ] Snappy.VerifyCharTable
[ OK ] Snappy.VerifyCharTable (0 ms)
[ RUN ] Snappy.TestBenchmarkFiles
[ OK ] Snappy.TestBenchmarkFiles (406 ms)
[----------] 17 tests from Snappy (26734 ms total)
[----------] 3 tests from SnappyCorruption
[ RUN ] SnappyCorruption.TruncatedVarint
[ OK ] SnappyCorruption.TruncatedVarint (0 ms)
[ RUN ] SnappyCorruption.UnterminatedVarint
[ OK ] SnappyCorruption.UnterminatedVarint (0 ms)
[ RUN ] SnappyCorruption.OverflowingVarint
[ OK ] SnappyCorruption.OverflowingVarint (0 ms)
[----------] 3 tests from SnappyCorruption (0 ms total)
[----------] Global test environment tear-down
[==========] 21 tests from 3 test suites ran. (26740 ms total)
[ PASSED ] 21 tests.
YOU HAVE 1 DISABLED TEST
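For readers unfamiliar with RVV, the vector load/store approach mentioned in the description can be sketched roughly as follows. This is not the code from this PR: the function name CopyBytesSketch is hypothetical, the loop copies exactly `size` bytes (the real MemCopy64 may follow a different contract), and the intrinsic names assume the `__riscv_`-prefixed spelling from the RVV intrinsics specification. On targets without the V extension the sketch simply falls back to std::memcpy.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__riscv_v)
#include <riscv_vector.h>
#endif

// Hypothetical helper for illustration only; not the PR's MemCopy64.
// Copies `size` bytes from `src` to `dst` (regions must not overlap).
inline void CopyBytesSketch(char* dst, const void* src, std::size_t size) {
#if defined(__riscv_v)
  const std::uint8_t* s = static_cast<const std::uint8_t*>(src);
  std::uint8_t* d = reinterpret_cast<std::uint8_t*>(dst);
  while (size > 0) {
    // Ask the hardware how many 8-bit elements it will process this
    // iteration with LMUL=2; vl is at most `size`.
    std::size_t vl = __riscv_vsetvl_e8m2(size);
    // Vector load of vl bytes, then vector store of the same vl bytes.
    vuint8m2_t v = __riscv_vle8_v_u8m2(s, vl);
    __riscv_vse8_v_u8m2(d, v, vl);
    s += vl;
    d += vl;
    size -= vl;
  }
#else
  // Assumption: a plain memcpy stands in on non-RVV targets.
  std::memcpy(dst, src, size);
#endif
}
```

Because the loop asks for a new vector length on every iteration, it needs no separate scalar tail loop and is independent of the hardware's vector register width.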