Skip to content

Comments

Fix murmur32 on large strings#1748

Merged
ranshid merged 1 commit intovalkey-io:unstablefrom
secwall:fix-murmur32
Feb 23, 2025
Merged

Fix murmur32 on large strings#1748
ranshid merged 1 commit intovalkey-io:unstablefrom
secwall:fix-murmur32

Conversation

@secwall
Copy link
Contributor

@secwall secwall commented Feb 17, 2025

Current murmur32 implementation fails on large strings.
Way to reproduce crash:

eval 'local s = string.rep("a", 1024 * 1024 * 1024) return #cjson.encode(s..s..s)' 0

(Basically this is a copy-paste from large test in tests/unit/scripting.tcl).

Shouldn't we run tests with --large-memory on daily CI so we could find this earlier?

I think we need to backport this to 8.1 branch.

Signed-off-by: secwall <secwall@yandex-team.ru>
@codecov
Copy link

codecov bot commented Feb 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.02%. Comparing base (e6e65c6) to head (07aa76c).
Report is 9 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1748      +/-   ##
============================================
- Coverage     71.09%   71.02%   -0.08%     
============================================
  Files           123      123              
  Lines         65531    65531              
============================================
- Hits          46590    46542      -48     
- Misses        18941    18989      +48     

see 17 files with indirect coverage changes

@ranshid ranshid added the bug Something isn't working label Feb 17, 2025
@ranshid
Copy link
Member

ranshid commented Feb 17, 2025

I think we need to backport this to 8.1 branch.

This will probably be taken to RC2

@ranshid
Copy link
Member

ranshid commented Feb 17, 2025

Shouldn't we run tests with --large-memory on daily CI so we could find this earlier?

good point. let me take it with the team and we will consider adding this

ranshid
ranshid previously approved these changes Feb 17, 2025
Copy link
Member

@ranshid ranshid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ranshid
Copy link
Member

ranshid commented Feb 17, 2025

BTW @secwall - I run:

./runtest --large-memory --single unit/scripting

but it did not fail. does it fail for you?

@ranshid ranshid self-requested a review February 17, 2025 14:56
@ranshid ranshid dismissed their stale review February 17, 2025 14:58

pending test verification

@secwall
Copy link
Contributor Author

secwall commented Feb 17, 2025

BTW @secwall - I run:

./runtest --large-memory --single unit/scripting

but it did not fail. does it fail for you?

Yes, on 64 bit Intel Ice Lake ubuntu 24.04 with current unstable and ASAN I get this:

secwall@secwall-dev:~/valkey$ git log -1 | head -n3
commit f85c93301ca63f070164b307c6a7371a9ab5ac7b
Author: Viktor Söderqvist <viktor.soderqvist@est.tech>
Date:   Mon Feb 17 13:28:45 2025 +0100
secwall@secwall-dev:~/valkey$ make SANITIZER=address -j32
...
secwall@secwall-dev:~/valkey$ ./runtest --large-memory --single unit/scripting
Cleanup: may take some time... OK
Starting test server at port 21079
[ready]: 112593
Testing unit/scripting
[ready]: 112594
[ready]: 112596
[ready]: 112597
[ready]: 112595
[ready]: 112599
[ready]: 112600
[ready]: 112602
[ready]: 112603
[ready]: 112598
[ready]: 112604
[ready]: 112601
[ready]: 112606
[ready]: 112608
[ready]: 112605
[ready]: 112607
[ok]: EVAL - Does Lua interpreter replies to our requests? (1 ms)
[ok]: EVAL - Return _G (0 ms)
[ok]: EVAL - Return table with a metatable that raise error (0 ms)
[ok]: EVAL - Return table with a metatable that call server (1 ms)
[ok]: EVAL - Lua integer -> Redis protocol type conversion (0 ms)
[ok]: EVAL - Lua string -> Redis protocol type conversion (1 ms)
[ok]: EVAL - Lua true boolean -> Redis protocol type conversion (0 ms)
[ok]: EVAL - Lua false boolean -> Redis protocol type conversion (0 ms)
[ok]: EVAL - Lua status code reply -> Redis protocol type conversion (1 ms)
[ok]: EVAL - Lua error reply -> Redis protocol type conversion (0 ms)
[ok]: EVAL - Lua table -> Redis protocol type conversion (1 ms)
[ok]: EVAL - Are the KEYS and ARGV arrays populated correctly? (0 ms)
[ok]: EVAL - is Lua able to call Redis API? (1 ms)
[ok]: EVAL - Redis integer -> Lua type conversion (1 ms)
[ok]: EVAL - Lua number -> Redis integer conversion (1 ms)
[ok]: EVAL - Redis bulk -> Lua type conversion (0 ms)
[ok]: EVAL - Redis multi bulk -> Lua type conversion (1 ms)
[ok]: EVAL - Redis status reply -> Lua type conversion (0 ms)
[ok]: EVAL - Redis error reply -> Lua type conversion (1 ms)
[ok]: EVAL - Redis nil bulk reply -> Lua type conversion (0 ms)
[ok]: EVAL - Is the Lua client using the currently selected DB? (1 ms)
[ok]: EVAL - SELECT inside Lua should not affect the caller (1 ms)
[ok]: EVAL - Scripts do not block on blpop command (1 ms)
[ok]: EVAL - Scripts do not block on brpop command (1 ms)
[ok]: EVAL - Scripts do not block on brpoplpush command (0 ms)
[ok]: EVAL - Scripts do not block on blmove command (1 ms)
[ok]: EVAL - Scripts do not block on bzpopmin command (1 ms)
[ok]: EVAL - Scripts do not block on bzpopmax command (1 ms)
[ok]: EVAL - Scripts do not block on wait (0 ms)
[ok]: EVAL - Scripts do not block on waitaof (0 ms)
[ok]: EVAL - Scripts do not block on XREAD with BLOCK option (1 ms)
[ok]: EVAL - Scripts do not block on XREADGROUP with BLOCK option (1 ms)
[ok]: EVAL - Scripts do not block on XREAD with BLOCK option -- non empty stream (2 ms)
[ok]: EVAL - Scripts do not block on XREADGROUP with BLOCK option -- non empty stream (1 ms)
[ok]: EVAL - Scripts can run non-deterministic commands (0 ms)
[ok]: EVAL - No arguments to redis.call/pcall is considered an error (1 ms)
[ok]: EVAL - redis.call variant raises a Lua error on Redis cmd error (1) (0 ms)
[ok]: EVAL - redis.call variant raises a Lua error on Redis cmd error (1) (1 ms)
[ok]: EVAL - redis.call variant raises a Lua error on Redis cmd error (1) (0 ms)
[err]: Sanitizer error: =================================================================
==112613==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x70b89bf1d8a0 at pc 0x64fc011ca6a2 bp 0x7ffd96e2dbc0 sp 0x7ffd96e2dbb0
WRITE of size 8 at 0x70b89bf1d8a0 thread T0
    #0 0x64fc011ca6a1 in memtest_test_linux_anonymous_maps /home/secwall/valkey/src/debug.c:1964
    #1 0x64fc011d792c in doFastMemoryTest /home/secwall/valkey/src/debug.c:2020
    #2 0x64fc011d792c in printCrashReport /home/secwall/valkey/src/debug.c:2224
    #3 0x64fc011e724c in sigsegvHandler /home/secwall/valkey/src/debug.c:2144
    #4 0x70b89da4532f  (/lib/x86_64-linux-gnu/libc.so.6+0x4532f) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
    #5 0x64fc01460796 in murmur32 (/home/secwall/valkey/src/valkey-server+0x607796) (BuildId: 3d0bd9226b34c65e68ca941e9533bee5e90e337b)
...
(and so on)

@secwall
Copy link
Contributor Author

secwall commented Feb 17, 2025

On the other side: test passes without error on unstable on my laptop (MacOS with M1).
This seems to be related to int size on different platforms. We get overflow on Intel, and don't get it on ARM.

Copy link
Member

@ranshid ranshid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ranshid ranshid merged commit be9d948 into valkey-io:unstable Feb 23, 2025
50 checks passed
@secwall secwall deleted the fix-murmur32 branch February 23, 2025 11:20
zuiderkwast pushed a commit that referenced this pull request Mar 18, 2025
Current murmur32 implementation fails on large strings.
Way to reproduce crash:
```
eval 'local s = string.rep("a", 1024 * 1024 * 1024) return #cjson.encode(s..s..s)' 0
```
(Basically this is a copy-paste from large test in
`tests/unit/scripting.tcl`).

Shouldn't we run tests with `--large-memory` on daily CI so we could
find this earlier?

I think we need to backport this to 8.1 branch.

Signed-off-by: secwall <secwall@yandex-team.ru>
xbasel pushed a commit to xbasel/valkey that referenced this pull request Mar 27, 2025
Current murmur32 implementation fails on large strings.
Way to reproduce crash:
```
eval 'local s = string.rep("a", 1024 * 1024 * 1024) return #cjson.encode(s..s..s)' 0
```
(Basically this is a copy-paste from large test in
`tests/unit/scripting.tcl`).

Shouldn't we run tests with `--large-memory` on daily CI so we could
find this earlier?

I think we need to backport this to 8.1 branch.

Signed-off-by: secwall <secwall@yandex-team.ru>
xbasel pushed a commit to xbasel/valkey that referenced this pull request Mar 27, 2025
Current murmur32 implementation fails on large strings.
Way to reproduce crash:
```
eval 'local s = string.rep("a", 1024 * 1024 * 1024) return #cjson.encode(s..s..s)' 0
```
(Basically this is a copy-paste from large test in
`tests/unit/scripting.tcl`).

Shouldn't we run tests with `--large-memory` on daily CI so we could
find this earlier?

I think we need to backport this to 8.1 branch.

Signed-off-by: secwall <secwall@yandex-team.ru>
@zuiderkwast zuiderkwast moved this from To be backported to Unknown if backported in Valkey 8.1 Apr 7, 2025
murphyjacob4 pushed a commit to enjoy-binbin/valkey that referenced this pull request Apr 13, 2025
Current murmur32 implementation fails on large strings.
Way to reproduce crash:
```
eval 'local s = string.rep("a", 1024 * 1024 * 1024) return #cjson.encode(s..s..s)' 0
```
(Basically this is a copy-paste from large test in
`tests/unit/scripting.tcl`).

Shouldn't we run tests with `--large-memory` on daily CI so we could
find this earlier?

I think we need to backport this to 8.1 branch.

Signed-off-by: secwall <secwall@yandex-team.ru>
@zuiderkwast zuiderkwast moved this from Unknown if backported to Done in Valkey 8.1 Apr 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants