Skip to content

Comments

Fix used_memory_dataset underflow due to miscalculated used_memory_overhead#3005

Merged
enjoy-binbin merged 6 commits intovalkey-io:unstablefrom
bpint:unstable
Jan 15, 2026
Merged

Fix used_memory_dataset underflow due to miscalculated used_memory_overhead#3005
enjoy-binbin merged 6 commits intovalkey-io:unstablefrom
bpint:unstable

Conversation

@bpint
Copy link
Contributor

@bpint bpint commented Jan 5, 2026

Describe the bug

The metric used_memory_dataset turned into an insanely large number close to 2^64 (actually overflowed negative value),
as reported in #2994 (comment).

Versions affected include 8.x and 9.x. (Should i submit multiple PRs?)

Analysis

We found that miscalculated used_memory_overhead is the core reason.

Double-Counted database memory

When server starts, the global variable server.initial_memory_usage is used to record a memory baseline in InitServerLast(). This server.initial_memory_usage has clearly included initial database memory, since databases are created in initServer().

In function getMemoryOverheadData(), the mem_total is firstly assigned the baseline, which includes initial database memory. And then all extra memory usage of databases are added to mem_total. The initial database memory are therefore counted TWICE.

This eventually caused wrongly larger used_memory_overhead. For a database with only a couple of keys, the used_memory_overhead is easily larger than used_memory and causes an overflowed used_memory_dataset.

Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation,

if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(), including hashtable_size_index, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller used_memory_overhead for an empty database. When we insert only ONE key to the database, the database is suddenly taken into account, and used_memory_overhead will increase (for used_memory_dataset decrease) by more than 128 KiB due to the single key insertion.

Proposed Fix

In this PR,

a) We record the zmalloc_used_memory() change before and after database creation during server initialization. And later when we decide the baseline server.initial_memory_usage, the initial database memory usage is excluded.

b) We remove the condition kvstoreNumAllocatedHashtables() in getMemoryOverheadData(), to make the used_memory_overhead reasonable for both empty and non-empty kvstores.

@codecov
Copy link

codecov bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.26%. Comparing base (e4a3e9f) to head (bc97c82).
⚠️ Report is 21 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3005      +/-   ##
============================================
- Coverage     74.34%   74.26%   -0.09%     
============================================
  Files           129      129              
  Lines         70908    70914       +6     
============================================
- Hits          52714    52661      -53     
- Misses        18194    18253      +59     
Files with missing lines Coverage Δ
src/object.c 91.52% <100.00%> (ø)
src/server.c 89.44% <100.00%> (+0.01%) ⬆️

... and 23 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@madolson
Copy link
Member

madolson commented Jan 5, 2026

Versions affected include 8.x and 9.x. (Should i submit multiple PRs?)

We'll backport it as needed after the main PR is submitted.

bpint added 3 commits January 5, 2026 15:20
…overhead

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Corrected spelling of `necessary` to `necessary` in comments.

Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good catch.

bpint and others added 2 commits January 5, 2026 18:12
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: bpint <chemistudio@gmail.com>
@bpint bpint requested a review from enjoy-binbin January 5, 2026 10:17
@enjoy-binbin enjoy-binbin moved this to To be backported in Valkey 8.0 Jan 5, 2026
@enjoy-binbin enjoy-binbin moved this to To be backported in Valkey 8.1 Jan 5, 2026
@enjoy-binbin enjoy-binbin moved this to To be backported in Valkey 9.0 Jan 5, 2026
@enjoy-binbin enjoy-binbin added the release-notes This issue should get a line item in the release notes label Jan 5, 2026
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you able to write a TCL test case to cover this? If not, i can take a look later.

@bpint
Copy link
Contributor Author

bpint commented Jan 6, 2026

Would you able to write a TCL test case to cover this? If not, i can take a look later.

really sorry that i am familiar with neither tcl language nor the principle of test cases.

@enjoy-binbin enjoy-binbin merged commit 2635377 into valkey-io:unstable Jan 15, 2026
24 checks passed
@enjoy-binbin
Copy link
Member

@bpint Merged, thank you.

@bpint
Copy link
Contributor Author

bpint commented Jan 15, 2026

respect to @madolson @enjoy-binbin and all maintainers for your efficient and awesome work

arshidkv12 pushed a commit to arshidkv12/valkey that referenced this pull request Jan 23, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

## Double-Counted database memory

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

## Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: arshidkv12 <arshidkv12@gmail.com>
zuiderkwast pushed a commit to zuiderkwast/placeholderkv that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

## Double-Counted database memory

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

## Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@roshkhatri roshkhatri moved this from To be backported to 8.1.6 in Valkey 8.1 Jan 29, 2026
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 29, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Jan 30, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@roshkhatri roshkhatri moved this from To be backported to 8.0.7 in Valkey 8.0 Jan 30, 2026
zuiderkwast pushed a commit to zuiderkwast/placeholderkv that referenced this pull request Jan 30, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

## Double-Counted database memory

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

## Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
@zuiderkwast zuiderkwast moved this from To be backported to 9.0.2 WIP in Valkey 9.0 Jan 30, 2026
zuiderkwast pushed a commit that referenced this pull request Feb 3, 2026
…erhead (#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in #2994.

## Double-Counted database memory

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

## Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 4, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 18, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

## Double-Counted database memory

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

## Missed Empty Databases

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
roshkhatri pushed a commit to roshkhatri/valkey that referenced this pull request Feb 20, 2026
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in valkey-io#2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
madolson added a commit that referenced this pull request Feb 24, 2026
…erhead (#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in #2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
madolson added a commit that referenced this pull request Feb 24, 2026
…erhead (#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
overflowed negative value), as reported in #2994.

When server starts, the global variable `server.initial_memory_usage` is used to record a memory
baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database
memory, since databases are created in initServer.

In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes
initial database memory. And then all extra memory usage of databases are added to mem_total. The initial
database memory are therefore counted TWICE.

This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys,
the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`.

In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only
ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase
(for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion.

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes

Projects

Status: 8.0.7 (WIP)
Status: 8.1.6 WIP
Status: 9.0.2

Development

Successfully merging this pull request may close these issues.

[BUG] Serious and Curious Memory Statistic Problem of used_memory and used_memory_dataset

3 participants