Skip to content

Comments

Refactor scanLaterList to fix latency issue#1

Closed
asagege wants to merge 1 commit intounstablefrom
latencyTest
Closed

Refactor scanLaterList to fix latency issue#1
asagege wants to merge 1 commit intounstablefrom
latencyTest

Conversation

@asagege
Copy link
Owner

@asagege asagege commented Aug 19, 2025

This fix only addresses the big list test.

Refactor scanLaterList to fix latency issue:

  1. Remove the iterations stuff and check the time every loop - The iteration optimization should be removed because the time check (19ns) is negligible compared to the actual defragmentation work (even a single 8-byte reallocation takes 43ns and the allocatorShouldDefrag function call takes 49ns per block), so the time check cost at most 30% and likely much less in practice.
  2. Remove bookmark_failed - Simplify control flow by handling bookmark failure inline.
  3. Move (*cursor)++ inside time check - We only increment cursor when we actually need to resume.
  4. Refactor bookmark failure handling: when bookmark creation failed, break out of the loop, and return with 0 - Skip this one and continue with other quicklists.

Introduced in valkey-io#2444

@asagege asagege closed this Aug 20, 2025
@asagege asagege deleted the latencyTest branch August 26, 2025 18:35
asagege pushed a commit that referenced this pull request Nov 5, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    #1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    #2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    #3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    #4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    #5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    valkey-io#6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    valkey-io#7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
asagege pushed a commit that referenced this pull request Feb 10, 2026
Address valkey-io#2683 

For the test, I found two problems.

### 1. Test Assumes Winner Has Rank #0
In my test run, in the failing case:
- Replica -3 (rank 0) won epoch 10 first
- Replica -6 (rank 1) won epoch 11 second
- Replica -3 saw higher epoch and stepped down
- Final result: rank 1 became master

Rank #0 doesn't guarantee being the final master.

### 2. Pattern Matching Bug
The pattern `*Start of election*rank #0*` incorrectly matches "primary
rank #0":

Log: `Start of election delayed for 350 milliseconds (rank #1, primary
rank #0, offset 2172)`

This line has rank `#1`, but the pattern matches because of "primary
rank `#0`" at the end.

## Solution
- Fixed the format problem by checking if `(rank #0` or `(rank #1` so
that we won't accidentally match the primary rank
- Only check to make sure replica 3 and replica 6 have different ranks
without assuming the replica with rank 0 will become the master.

Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
asagege pushed a commit that referenced this pull request Feb 10, 2026
…lkey-io#3174)

I was working on ASAN large memory tests when I countered this issue.

The issue was that the hardcoded `999` key could land in an early
bucket. Then shrink rehash could finish early, and later inserts could
trigger a new expansion rehash, resetting rehash_idx low. The test now
picks the survivor key dynamically as the key mapped to the highest
bucket index.

```
[test_hashtable.c] Memory leak detected of 336 bytes
=================================================================
==3901==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x563bfdf4c47d in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:156
    #2 0x563bfdf4c47d in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:185
    #3 0x563bfdd42eaf in hashtableCreate /home/runner/work/valkey/valkey/src/hashtable.c:1217
    #4 0x563bfdaa1cbf in test_empty_buckets_rehashing unit/test_hashtable.c:232
    #5 0x563bfdae772b in runTestSuite unit/test_main.c:36
    valkey-io#6 0x563bfda86b20 in main unit/test_main.c:108
    valkey-io#7 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#8 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#9 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd40967 in resize /home/runner/work/valkey/valkey/src/hashtable.c:741
    #4 0x563bfdd45eb1 in hashtableExpandIfNeeded /home/runner/work/valkey/valkey/src/hashtable.c:1446
    #5 0x563bfdd45eb1 in hashtableExpandIfNeeded /home/runner/work/valkey/valkey/src/hashtable.c:1433
    valkey-io#6 0x563bfdd45eb1 in insert /home/runner/work/valkey/valkey/src/hashtable.c:1041
    valkey-io#7 0x563bfdd45eb1 in hashtableAddOrFind /home/runner/work/valkey/valkey/src/hashtable.c:1554
    valkey-io#8 0x563bfdd45eb1 in hashtableAdd /home/runner/work/valkey/valkey/src/hashtable.c:1539
    valkey-io#9 0x563bfdaa1e3b in test_empty_buckets_rehashing unit/test_hashtable.c:254
    valkey-io#10 0x563bfdae772b in runTestSuite unit/test_main.c:36
    valkey-io#11 0x563bfda86b20 in main unit/test_main.c:108
    valkey-io#12 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#13 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#14 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd3f553 in bucketConvertToChained /home/runner/work/valkey/valkey/src/hashtable.c:908
    #4 0x563bfdd3f553 in findBucketForInsert /home/runner/work/valkey/valkey/src/hashtable.c:1021
    #5 0x563bfdd45d9e in insert /home/runner/work/valkey/valkey/src/hashtable.c:1045
    valkey-io#6 0x563bfdd45d9e in hashtableAddOrFind /home/runner/work/valkey/valkey/src/hashtable.c:1554
    valkey-io#7 0x563bfdd45d9e in hashtableAdd /home/runner/work/valkey/valkey/src/hashtable.c:1539
    valkey-io#8 0x563bfdaa1e3b in test_empty_buckets_rehashing unit/test_hashtable.c:254
    valkey-io#9 0x563bfdae772b in runTestSuite unit/test_main.c:36
    valkey-io#10 0x563bfda86b20 in main unit/test_main.c:108
    valkey-io#11 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#12 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#13 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd40967 in resize /home/runner/work/valkey/valkey/src/hashtable.c:741
    #4 0x563bfdaa1df8 in test_empty_buckets_rehashing unit/test_hashtable.c:248
    #5 0x563bfdae772b in runTestSuite unit/test_main.c:36
    valkey-io#6 0x563bfda86b20 in main unit/test_main.c:108
    valkey-io#7 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#8 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    valkey-io#9 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

SUMMARY: AddressSanitizer: 336 byte(s) leaked in 4 allocation(s).
```

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant