Skip to content

Comments

Fix how hash is handling overriding of expired fields overwrite#3060

Merged
ranshid merged 13 commits intovalkey-io:unstablefrom
ranshid:fix-propagate-hdel-when-override-expired-element
Jan 26, 2026
Merged

Fix how hash is handling overriding of expired fields overwrite#3060
ranshid merged 13 commits intovalkey-io:unstablefrom
ranshid:fix-propagate-hdel-when-override-expired-element

Conversation

@ranshid
Copy link
Member

@ranshid ranshid commented Jan 13, 2026

There are currently several issues with the existing hash field expiration mechanism:

  1. HINCRBY is propagated to the replica "as-is". It mean it relies on the fact that the state of the hash is the same on the primary and the replica. HFE did change this assumption as the field might be expired only when the replica will handle the propagated hincrby. the problem is that the replica does not "expire" fields by it's own. it needs to respect the request from the primary and always try to use the existing field. This can lead to either miss-alignment with the value on the primary and the replica AND even a disconnection since the replica might hold and "expired" field which is not in "integer" format...
  2. HINCRBYFLOAT is currently ALWAYS propagating hset - this means that the expiration time of an entry will always be removed on the replica side (it needs to propagate HSETEX when expiration time needs to be maintained)
  3. Currently all hash write commands which are mutating values might overwrite an expired field. In such cases the existing implementation will "silently" do so. The problem is that the user will not get any key-space-notificaiton explaining the reason for the behavior. For example, when hincrby is issued overwriting an expired field which was not yet "cleaned" by active-expiration it will reset the counter to '0' before incrementing it. this means that the user might ask: why is the value '1' and not bigger, "I did not see any notification that the old value expired"...
  4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as #(1). the replica will receive the propagated command, but will not know if the primary "replaced" the entry which is expired now but might not have been expired when the primary applied it.

There are 2 options for a solution:

  1. we could propagate hdel for every entry we are "overwritting" (batch them if we can)
  2. propagate the commands "by effect". For example - have hincrby always propagate either HSET or HSETEX. This will not solve the '#(4)' problem above though, for which we might HAVE to propagate hdel

I tend to go with the second option. The reason is that it is expected to have less impact on replication stream and should include less processing time on the replicas and network traffic. Specifically for HSETEX with KEEPTTL we will have to propagate the hdel in case we overwritten an expired field, but that would help limit the impact of this propagation.

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 96.90722% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.38%. Comparing base (f0d9810) to head (0ea52ac).
⚠️ Report is 30 commits behind head on unstable.

Files with missing lines Patch % Lines
src/t_hash.c 97.82% 2 Missing ⚠️
src/module.c 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3060      +/-   ##
============================================
+ Coverage     74.25%   74.38%   +0.13%     
============================================
  Files           129      129              
  Lines         70988    71124     +136     
============================================
+ Hits          52712    52907     +195     
+ Misses        18276    18217      -59     
Files with missing lines Coverage Δ
src/db.c 94.33% <100.00%> (+0.15%) ⬆️
src/server.c 89.57% <100.00%> (+0.11%) ⬆️
src/server.h 100.00% <ø> (ø)
src/module.c 26.50% <0.00%> (-0.01%) ⬇️
src/t_hash.c 95.38% <97.82%> (+0.46%) ⬆️

... and 28 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid changed the title propagate hdel when primary overrides expired hash field Fix how hash is handling overriding of expired fields overwrite Jan 14, 2026
@cjx-zar
Copy link
Contributor

cjx-zar commented Jan 14, 2026

Since we have import-mode, I think this solution is feasible.

…cation

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid added bug Something isn't working release-notes This issue should get a line item in the release notes labels Jan 15, 2026
@ranshid
Copy link
Member Author

ranshid commented Jan 15, 2026

@cjx-zar I updated the PR with some relevant tests. @frostzt is working to add the hdel propagation when the primary overwrites an expired field/s (which is why the relevant test should fail now). would be happy if you could TAL

frostzt and others added 2 commits January 15, 2026 12:55
…ided

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid force-pushed the fix-propagate-hdel-when-override-expired-element branch from 7faa9a2 to 6af3175 Compare January 15, 2026 11:14
@ranshid ranshid marked this pull request as ready for review January 15, 2026 11:26
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid requested a review from enjoy-binbin January 15, 2026 12:00
@cjx-zar
Copy link
Contributor

cjx-zar commented Jan 16, 2026

Sorry, I'm busy with other things today. I'll check tomorrow.

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@madolson madolson requested a review from murphyjacob4 January 19, 2026 17:27
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Copy link
Contributor

@cjx-zar cjx-zar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM:)

Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top comment LGTM, i scanned the code (but didn't do a very thorough review), and i trust your changes in HFE part.

ranshid and others added 2 commits January 25, 2026 09:12
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
@ranshid ranshid merged commit 94edae4 into valkey-io:unstable Jan 26, 2026
24 checks passed
@github-project-automation github-project-automation bot moved this to To be backported in Valkey 9.0 Jan 26, 2026
ranshid added a commit to ranshid/valkey that referenced this pull request Jan 26, 2026
…ey-io#3060)

There are currently several issues with the existing hash field
expiration mechanism:
1. `HINCRBY` is propagated to the replica "as-is". It mean it relies on
the fact that the state of the hash is the same on the primary and the
replica. HFE did change this assumption as the field might be expired
only when the replica will handle the propagated `hincrby`. the problem
is that the replica does not "expire" fields by it's own. it needs to
respect the request from the primary and always try to use the existing
field. This can lead to either miss-alignment with the value on the
primary and the replica AND even a disconnection since the replica might
hold and "expired" field which is not in "integer" format...
2. HINCRBYFLOAT is currently ALWAYS propagating `hset` - this means that
the expiration time of an entry will always be removed on the replica
side (it needs to propagate HSETEX when expiration time needs to be
maintained)
3. Currently all hash write commands which are mutating values might
overwrite an expired field. In such cases the existing implementation
will "silently" do so. The problem is that the user will not get any
key-space-notificaiton explaining the reason for the behavior. For
example, when `hincrby` is issued overwriting an expired field which was
not yet "cleaned" by active-expiration it will reset the counter to '0'
before incrementing it. this means that the user might ask: why is the
value '1' and not bigger, "I did not see any notification that the old
value expired"...
4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as
if the primary "replaced" the entry which is expired now but might not
have been expired when the primary applied it.

There are 2 options for a solution:
1. we could propagate `hdel` for every entry we are "overwritting"
(batch them if we can)
2. propagate the commands "by effect". For example - have `hincrby`
always propagate either HSET or HSETEX. This will not solve the '#(4)'
problem above though, for which we might HAVE to propagate `hdel`

I tend to go with the second option. The reason is that it is expected
to have less impact on replication stream and should include less
processing time on the replicas and network traffic. Specifically for
HSETEX with KEEPTTL we will have to propagate the `hdel` in case we
overwritten an expired field, but that would help limit the impact of
this propagation.

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Sourav Singh Rawat <aidenfrostbite@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
@zuiderkwast zuiderkwast moved this from To be backported to 9.0.2 WIP in Valkey 9.0 Jan 28, 2026
ranshid added a commit to ranshid/valkey that referenced this pull request Jan 28, 2026
…ey-io#3060)

There are currently several issues with the existing hash field
expiration mechanism:
1. `HINCRBY` is propagated to the replica "as-is". It mean it relies on
the fact that the state of the hash is the same on the primary and the
replica. HFE did change this assumption as the field might be expired
only when the replica will handle the propagated `hincrby`. the problem
is that the replica does not "expire" fields by it's own. it needs to
respect the request from the primary and always try to use the existing
field. This can lead to either miss-alignment with the value on the
primary and the replica AND even a disconnection since the replica might
hold and "expired" field which is not in "integer" format...
2. HINCRBYFLOAT is currently ALWAYS propagating `hset` - this means that
the expiration time of an entry will always be removed on the replica
side (it needs to propagate HSETEX when expiration time needs to be
maintained)
3. Currently all hash write commands which are mutating values might
overwrite an expired field. In such cases the existing implementation
will "silently" do so. The problem is that the user will not get any
key-space-notificaiton explaining the reason for the behavior. For
example, when `hincrby` is issued overwriting an expired field which was
not yet "cleaned" by active-expiration it will reset the counter to '0'
before incrementing it. this means that the user might ask: why is the
value '1' and not bigger, "I did not see any notification that the old
value expired"...
4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as
if the primary "replaced" the entry which is expired now but might not
have been expired when the primary applied it.

There are 2 options for a solution:
1. we could propagate `hdel` for every entry we are "overwritting"
(batch them if we can)
2. propagate the commands "by effect". For example - have `hincrby`
always propagate either HSET or HSETEX. This will not solve the '#(4)'
problem above though, for which we might HAVE to propagate `hdel`

I tend to go with the second option. The reason is that it is expected
to have less impact on replication stream and should include less
processing time on the replicas and network traffic. Specifically for
HSETEX with KEEPTTL we will have to propagate the `hdel` in case we
overwritten an expired field, but that would help limit the impact of
this propagation.

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Sourav Singh Rawat <aidenfrostbite@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
ranshid added a commit to ranshid/valkey that referenced this pull request Jan 28, 2026
…ey-io#3060)

There are currently several issues with the existing hash field
expiration mechanism:
1. `HINCRBY` is propagated to the replica "as-is". It mean it relies on
the fact that the state of the hash is the same on the primary and the
replica. HFE did change this assumption as the field might be expired
only when the replica will handle the propagated `hincrby`. the problem
is that the replica does not "expire" fields by it's own. it needs to
respect the request from the primary and always try to use the existing
field. This can lead to either miss-alignment with the value on the
primary and the replica AND even a disconnection since the replica might
hold and "expired" field which is not in "integer" format...
2. HINCRBYFLOAT is currently ALWAYS propagating `hset` - this means that
the expiration time of an entry will always be removed on the replica
side (it needs to propagate HSETEX when expiration time needs to be
maintained)
3. Currently all hash write commands which are mutating values might
overwrite an expired field. In such cases the existing implementation
will "silently" do so. The problem is that the user will not get any
key-space-notificaiton explaining the reason for the behavior. For
example, when `hincrby` is issued overwriting an expired field which was
not yet "cleaned" by active-expiration it will reset the counter to '0'
before incrementing it. this means that the user might ask: why is the
value '1' and not bigger, "I did not see any notification that the old
value expired"...
4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as
if the primary "replaced" the entry which is expired now but might not
have been expired when the primary applied it.

There are 2 options for a solution:
1. we could propagate `hdel` for every entry we are "overwritting"
(batch them if we can)
2. propagate the commands "by effect". For example - have `hincrby`
always propagate either HSET or HSETEX. This will not solve the '#(4)'
problem above though, for which we might HAVE to propagate `hdel`

I tend to go with the second option. The reason is that it is expected
to have less impact on replication stream and should include less
processing time on the replicas and network traffic. Specifically for
HSETEX with KEEPTTL we will have to propagate the `hdel` in case we
overwritten an expired field, but that would help limit the impact of
this propagation.

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Sourav Singh Rawat <aidenfrostbite@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
ranshid added a commit that referenced this pull request Jan 29, 2026
There are currently several issues with the existing hash field
expiration mechanism:
1. `HINCRBY` is propagated to the replica "as-is". It mean it relies on
the fact that the state of the hash is the same on the primary and the
replica. HFE did change this assumption as the field might be expired
only when the replica will handle the propagated `hincrby`. the problem
is that the replica does not "expire" fields by it's own. it needs to
respect the request from the primary and always try to use the existing
field. This can lead to either miss-alignment with the value on the
primary and the replica AND even a disconnection since the replica might
hold and "expired" field which is not in "integer" format...
2. HINCRBYFLOAT is currently ALWAYS propagating `hset` - this means that
the expiration time of an entry will always be removed on the replica
side (it needs to propagate HSETEX when expiration time needs to be
maintained)
3. Currently all hash write commands which are mutating values might
overwrite an expired field. In such cases the existing implementation
will "silently" do so. The problem is that the user will not get any
key-space-notificaiton explaining the reason for the behavior. For
example, when `hincrby` is issued overwriting an expired field which was
not yet "cleaned" by active-expiration it will reset the counter to '0'
before incrementing it. this means that the user might ask: why is the
value '1' and not bigger, "I did not see any notification that the old
value expired"...
4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as
if the primary "replaced" the entry which is expired now but might not
have been expired when the primary applied it.

There are 2 options for a solution:
1. we could propagate `hdel` for every entry we are "overwritting"
(batch them if we can)
2. propagate the commands "by effect". For example - have `hincrby`
always propagate either HSET or HSETEX. This will not solve the '#(4)'
problem above though, for which we might HAVE to propagate `hdel`

I tend to go with the second option. The reason is that it is expected
to have less impact on replication stream and should include less
processing time on the replicas and network traffic. Specifically for
HSETEX with KEEPTTL we will have to propagate the `hdel` in case we
overwritten an expired field, but that would help limit the impact of
this propagation.

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Sourav Singh Rawat <aidenfrostbite@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
…ey-io#3060)

There are currently several issues with the existing hash field
expiration mechanism:
1. `HINCRBY` is propagated to the replica "as-is". It mean it relies on
the fact that the state of the hash is the same on the primary and the
replica. HFE did change this assumption as the field might be expired
only when the replica will handle the propagated `hincrby`. the problem
is that the replica does not "expire" fields by it's own. it needs to
respect the request from the primary and always try to use the existing
field. This can lead to either miss-alignment with the value on the
primary and the replica AND even a disconnection since the replica might
hold and "expired" field which is not in "integer" format...
2. HINCRBYFLOAT is currently ALWAYS propagating `hset` - this means that
the expiration time of an entry will always be removed on the replica
side (it needs to propagate HSETEX when expiration time needs to be
maintained)
3. Currently all hash write commands which are mutating values might
overwrite an expired field. In such cases the existing implementation
will "silently" do so. The problem is that the user will not get any
key-space-notificaiton explaining the reason for the behavior. For
example, when `hincrby` is issued overwriting an expired field which was
not yet "cleaned" by active-expiration it will reset the counter to '0'
before incrementing it. this means that the user might ask: why is the
value '1' and not bigger, "I did not see any notification that the old
value expired"...
4. HSETEX with KEEPTTL suffers from a "somewhat" similar problem as
#(1). the replica will receive the propagated command, but will not know
if the primary "replaced" the entry which is expired now but might not
have been expired when the primary applied it.

There are 2 options for a solution:
1. we could propagate `hdel` for every entry we are "overwritting"
(batch them if we can)
2. propagate the commands "by effect". For example - have `hincrby`
always propagate either HSET or HSETEX. This will not solve the '#(4)'
problem above though, for which we might HAVE to propagate `hdel`

I tend to go with the second option. The reason is that it is expected
to have less impact on replication stream and should include less
processing time on the replicas and network traffic. Specifically for
HSETEX with KEEPTTL we will have to propagate the `hdel` in case we
overwritten an expired field, but that would help limit the impact of
this propagation.

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Co-authored-by: Sourav Singh Rawat <aidenfrostbite@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working release-notes This issue should get a line item in the release notes

Projects

Status: 9.0.2
Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants