Skip to content

Comments

Rax size tracking#688

Merged
zuiderkwast merged 29 commits intovalkey-io:unstablefrom
knggk:rax-size-tracking
Oct 2, 2024
Merged

Rax size tracking#688
zuiderkwast merged 29 commits intovalkey-io:unstablefrom
knggk:rax-size-tracking

Conversation

@knggk
Copy link
Contributor

@knggk knggk commented Jun 24, 2024

Introduce a size_t field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when raxAllocSize is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by rax. Users now also get precise memory allocation for them (through ValkeyModule_MallocSizeDict).

Fixes #677.

For the release notes:

  • MEMORY USAGE and MEMORY STATS are now exact for streams, rather than based on sampling.

@kyle-yh-kim
Copy link
Contributor

To resolve DCO complaints, you can sign-off the previous commits.

// Sign-off the last 3 commits.
git rebase --signoff HEAD~3

// Force push to update your PR.
git push -f

yzhaon and others added 3 commits June 25, 2024 18:31
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@kyle-yh-kim
Copy link
Contributor

Discussed with @knggk offline. A few parallelizable work for this PR;

  • Import rax-test.c and stabilize it [source].
  • Devise new test cases under rax-test.c, for size tracking.
  • MEMORY STATS integration.

@knggk knggk force-pushed the rax-size-tracking branch from 3881192 to e09bef3 Compare June 25, 2024 21:13
yzhaon and others added 6 commits June 25, 2024 21:16
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@codecov
Copy link

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 7 lines in your changes missing coverage. Please review.

Project coverage is 70.60%. Comparing base (9827eef) to head (fee849d).
Report is 1 commits behind head on unstable.

Files with missing lines Patch % Lines
src/object.c 0.00% 3 Missing ⚠️
src/module.c 0.00% 2 Missing ⚠️
src/rax.c 91.30% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #688      +/-   ##
============================================
+ Coverage     70.47%   70.60%   +0.13%     
============================================
  Files           114      114              
  Lines         61695    61710      +15     
============================================
+ Hits          43479    43572      +93     
+ Misses        18216    18138      -78     
Files with missing lines Coverage Δ
src/module.c 9.64% <0.00%> (+<0.01%) ⬆️
src/rax.c 82.92% <91.30%> (+0.22%) ⬆️
src/object.c 79.18% <0.00%> (+0.53%) ⬆️

... and 13 files with indirect coverage changes

Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM. The test needs to be ported to the new kind of unit tests.

Is is possible to test the calculation of the alloc size by comparing it to the stats we can get from the allocator? If we run a fuzz test and it does no other allocations, then it could work I think.

knggk added 3 commits June 27, 2024 17:34
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@knggk
Copy link
Contributor Author

knggk commented Jun 27, 2024

In general LGTM. The test needs to be ported to the new kind of unit tests.

Is is possible to test the calculation of the alloc size by comparing it to the stats we can get from the allocator? If we run a fuzz test and it does no other allocations, then it could work I think.

Great Idea. I wanted also to check that after every element from the rax was removed, memory goes back to 0. Turns out there is an issue right now that needs further investigation, ie src/valkey-unit-tests --large-memory says:

...
[ok] - test_rax.c:test_hugeKey
...
[END] - test_rax.c: 12 tests, 12 passed, 0 failed
...
[START] - test_zmalloc.c
[test_zmallocInitialUsedMemory - unit/test_zmalloc.c:9] Failed assertion: zmalloc_used_memory() == 0

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Jun 27, 2024

Just an idea: If it's a problem with zmalloc, you can create another rax_test_malloc.h to instead of rax_malloc.h just for the rax test. If the test defines RAX_MALLOC_INCLUDE before it includes rax.c, then rax.c will use it. It's already like this in rax.c:

#ifndef RAX_MALLOC_INCLUDE
#define RAX_MALLOC_INCLUDE "rax_malloc.h"
#endif

#include RAX_MALLOC_INCLUDE

knggk added 2 commits June 28, 2024 16:14
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@knggk
Copy link
Contributor Author

knggk commented Jun 28, 2024

RAX_MALLOC_INCLUDE

Thanks for the tip @zuiderkwast . I think it wasn't a problem with zmalloc, see the latest two commits. There was a zfree missing which was the cause of valkey-unit-tests --large-memory failing.

However you gave me another idea: what if instead of manually tracking rax->alloc_size in the main code body, we made rax_alloc_size and friends do it, as in rax_alloc_size(rax, size) { void *x = zmalloc(size); rax->alloc_size += zmalloc_usable_memory(x); x } ? It sounds like it'd be harder to make mistakes like the one fixed in "Tentative fix for address sanitizer", which tried to use an address that's already been freed. What do you think?

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@knggk knggk force-pushed the rax-size-tracking branch from 237e3f9 to 581fdbe Compare July 3, 2024 21:56
knggk added 2 commits July 5, 2024 18:34
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@zuiderkwast
Copy link
Contributor

Please check clang-format and spellcheck. You can change "ba" to "by" (etc.) in the rax_test, or if this actually affects the test, it's OK to add this file to .config/typos.toml to exclude the file from spell check.

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
@knggk knggk force-pushed the rax-size-tracking branch from 28951d3 to fee849d Compare October 2, 2024 16:14
@knggk
Copy link
Contributor Author

knggk commented Oct 2, 2024

Seems to be, because the build works in the CI. Did you try make distclean?

I tried with distclean but still getting the same issue on link. I am on Mac Sonoma if that's any useful info. Could is be because of missing pkg-config? I get:

% make valkey-unit-tests
make valkey-unit-tests
cd src && /Library/Developer/CommandLineTools/usr/bin/make valkey-unit-tests
/bin/sh: pkg-config: command not found
/bin/sh: pkg-config: command not found
/bin/sh: pkg-config: command not found
...

PS pushed commit to fix formatting

@zuiderkwast
Copy link
Contributor

Great, thanks.

Could is be because of missing pkg-config? I get:

Possibly. I'm running Fedora on my macbook. It seemed better for programmers. :D

pkg-config is used in the Makefile and maybe it just exits if it's not available. Why don't you have it? And why don't you have make in your path? Maybe you just need some basic "build util" package from brew or something...?

@zuiderkwast zuiderkwast added the release-notes This issue should get a line item in the release notes label Oct 2, 2024
@zuiderkwast zuiderkwast merged commit f85d8bf into valkey-io:unstable Oct 2, 2024
@zuiderkwast
Copy link
Contributor

zuiderkwast commented Oct 3, 2024

@knggk There was a build failure with undefined-santitizer in Daily:

...
Cluster Fuzz test [keys:99963652 keylen:0]: ok with 71243660 final keys
Cluster Fuzz test [keys:37437816 keylen:0]: ok with 34387942 final keys
Cluster Fuzz test [keys:22357214 keylen:0]: ok with 5758528 final keys
Fuzz test in mode 0 [9067]: 8396 elements inserted
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in 
Fuzz test in mode 1 [7504]: 

Do you want to look into it? You need to build with SANITIZER=undefined and run the tests with --accurate.

I have a fix for it in #1122.

@knggk
Copy link
Contributor Author

knggk commented Oct 3, 2024

Thanks Viktor for the turn around, I hadn't seen the issue.

zuiderkwast added a commit that referenced this pull request Oct 3, 2024
Fix the warning introduced in #688:

```
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in 
Fuzz test in mode 1 [7504]: 
```

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
naglera pushed a commit to naglera/placeholderkv that referenced this pull request Oct 10, 2024
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: naglera <anagler123@gmail.com>
naglera pushed a commit to naglera/placeholderkv that referenced this pull request Oct 10, 2024
Fix the warning introduced in valkey-io#688:

```
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in
Fuzz test in mode 1 [7504]:
```

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: naglera <anagler123@gmail.com>
rainsupreme pushed a commit to rainsupreme/valkey that referenced this pull request Oct 11, 2024
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
rainsupreme pushed a commit to rainsupreme/valkey that referenced this pull request Oct 11, 2024
Fix the warning introduced in valkey-io#688:

```
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in 
Fuzz test in mode 1 [7504]: 
```

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
eifrah-aws pushed a commit to eifrah-aws/valkey that referenced this pull request Oct 20, 2024
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
eifrah-aws pushed a commit to eifrah-aws/valkey that referenced this pull request Oct 20, 2024
Fix the warning introduced in valkey-io#688:

```
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in 
Fuzz test in mode 1 [7504]: 
```

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
vitarb pushed a commit to vitarb/valkey that referenced this pull request Jun 24, 2025
Fix the warning introduced in valkey-io#688:

```
unit/test_rax.c:168:15: runtime error: left shift of 36625 by 16 places cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior unit/test_rax.c:168:15 in
Fuzz test in mode 1 [7504]:
```

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
@sarthakaggarwal97 sarthakaggarwal97 moved this to To be backported in Valkey 8.0 Nov 6, 2025
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 30, 2026
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@roshkhatri roshkhatri moved this from To be backported to 8.0.7 in Valkey 8.0 Jan 30, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 4, 2026
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 18, 2026
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax insert and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous method based on sampling.

The module API allows to create sorted dictionaries, which are backed by
rax. Users now also get precise memory allocation for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@zuiderkwast zuiderkwast removed this from Valkey 8.0 Feb 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Introduce new rax function raxAllocSize to return rax tree allocation size in constant time

6 participants