Skip to content

Redis 8.2.1#28

Open
tomerqodo wants to merge 4 commits intocoderabbit_full_base_redis_821_pr6from
coderabbit_full_head_redis_821_pr6
Open

Redis 8.2.1#28
tomerqodo wants to merge 4 commits intocoderabbit_full_base_redis_821_pr6from
coderabbit_full_head_redis_821_pr6

Conversation

@tomerqodo
Copy link

@tomerqodo tomerqodo commented Jan 29, 2026

Benchmark PR from agentic-review-benchmarks#6

Summary by CodeRabbit

  • Bug Fixes

    • Fixed potential crashes related to stream operations and database flushing scenarios
    • Resolved stream entry reference handling during replication
  • Performance and resource utilization

    • Optimized active defragmentation behavior during replica database synchronization
    • Improved key size information tracking in cluster deployments
  • Version

    • Updated to version 8.2.1

✏️ Tip: You can customize this high-level summary in your review settings.

sundb and others added 4 commits January 25, 2026 12:12
…s#14274)

Fix redis#14267
This bug was introduced by redis#13495

### Summary

When a replica clears a large database, it periodically calls
processEventsWhileBlocked() in the replicationEmptyDbCallback() callback
during the key deletion process.
If defragmentation is enabled, this means that active defrag can be
triggered while the database is being deleted.
The defragmentation process may also modify the database at this time,
which could lead to crashes when the database is accessed after
defragmentation.

Code Path:
```
replicationEmptyDbCallback() -> processEventsWhileBlocked() -> whileBlockedCron() -> defragWhileBlocked()
```

### Solution

This PR temporarily disables active defrag before emptying the database,
then restores the active defrag setting after the empty is complete.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…fter reload (redis#14276)

This bug was introduced by redis#14130
found by @oranagra

### Summary

Because `s->cgroup_ref` is created at runtime the first time a consumer
group is linked with a message, but it is not released when all
references are removed.

However, after `debug reload` or restart, if the PEL is empty (meaning
no consumer group is referencing any message), `s->cgroup_ref` will not
be recreated.

As a result, when executing XADD or XTRIM with `ACKED` option and
checking whether a message that is being read but has not been ACKed can
be deleted, the cgroup_ref being NULL will cause a crash.

### Code Path
```
xaddCommand -> streamTrim -> streamEntryIsReferenced
```

### Solution

Check if `s->cgroup_ref` is NULL in streamEntryIsReferenced().
@coderabbitai
Copy link

coderabbitai bot commented Jan 29, 2026

Walkthrough

This patch release (8.2.1) addresses critical bugs: prevents active defragmentation during replication flush to avoid crashes, fixes potential crashes with XADD/ACKED after DEBUG RELOAD, and includes version bumps with comprehensive test coverage for the fixes.

Changes

Cohort / File(s) Summary
Version Update
src/version.h, 00-RELEASENOTES
Bumped version from 8.2.0 to 8.2.1 with numeric macro update. Added 8.2.1 release notes detailing bug fixes for histogram handling, defrag protection during replication, and crash prevention.
Replication Defragmentation Fix
src/replication.c, tests/unit/memefficiency.tcl
Added protective mechanism to disable active defragmentation during database flush in replication. Included new helper procedure discard_replies_every() to reduce test code duplication and new regression test for issue #14267.
Stream Entry Reference Fix
src/t_stream.c, tests/unit/type/stream.tcl
Added early return in streamEntryIsReferenced() when cgroups reference is NULL, preventing crashes. Added regression test for XADD with ACKED option after DEBUG RELOAD.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Version eight-point-two-point-one hops in with care,
Defragmentation paused when replicas flush their lair,
Streams no longer crash when ACKED entries appear,
A sturdy patch release that's perfectly clear! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'Redis 8.2.1' clearly and concisely summarizes the main change: releasing Redis version 8.2.1.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/replication.c`:
- Around line 1953-1962: The saved variable orig_active_defrag is meant to
restore the previous state but is currently unused and the code always sets
server.active_defrag_enabled = 1; instead of restoring the original value;
replace the final assignment with server.active_defrag_enabled =
orig_active_defrag so the saved value is used (symbols: orig_active_defrag,
server.active_defrag_enabled) and leave the surrounding block that disables
defrag around emptyData(..., replicationEmptyDbCallback) as-is.

In `@tests/unit/memefficiency.tcl`:
- Around line 70-76: The condition in proc discard_replies_every is inverted:
change the if test from checking "$count % $frequency != 0" to "$count %
$frequency == 0" so the discard loop runs only when count is a multiple of
frequency; keep the inner for-loop ($rd read) as-is to discard discard_num
replies when that multiple occurs, referencing the proc name
discard_replies_every and the variables count and frequency to locate the
change.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a5747eb and 3d3ed35.

📒 Files selected for processing (6)
  • 00-RELEASENOTES
  • src/replication.c
  • src/t_stream.c
  • src/version.h
  • tests/unit/memefficiency.tcl
  • tests/unit/type/stream.tcl
🧰 Additional context used
🧬 Code graph analysis (1)
src/replication.c (1)
src/db.c (1)
  • emptyData (862-900)
🪛 GitHub Actions: CI
src/replication.c

[error] 1956-1956: replication.c:1956:9: error: unused variable 'orig_active_defrag' [-Werror=unused-variable]

🪛 GitHub Actions: External Server Tests
src/replication.c

[error] 1956-1956: Unused variable 'orig_active_defrag' in rdbLoadEmptyDbFunc; -Werror treats this as an error. Remove or use the variable.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Analyze (cpp)
  • GitHub Check: code-coverage
  • GitHub Check: code-coverage
🔇 Additional comments (4)
src/t_stream.c (1)

2707-2711: Conservative NULL-guard for cgroups_ref looks correct.
Avoids a crash path when the reference index is absent.

src/version.h (1)

1-3: Version bump is consistent and clear.

tests/unit/memefficiency.tcl (2)

350-350: Usage affected by the helper bug.

This call will malfunction due to the inverted condition in discard_replies_every. Once the helper is fixed, this usage should work correctly.


945-1007: Good regression test for issue #14267.

The test appropriately validates the fix by:

  1. Creating fragmentation conditions that would trigger defrag
  2. Enabling defrag then immediately starting replication (which flushes the DB)
  3. Verifying the system doesn't crash and sync completes

Note: The discard_replies_every calls on lines 967 and 977 depend on fixing the inverted condition bug in the helper procedure.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +1953 to +1962
/* Temporarily disable active defragmentation during database flush.
* This prevents defrag from being triggered in replicationEmptyDbCallback()
* which could modify the database while it's being emptied. */
int orig_active_defrag = server.active_defrag_enabled;
server.active_defrag_enabled = 0;

emptyData(-1, empty_db_flags, replicationEmptyDbCallback);

/* Restore the original active defragmentation setting. */
server.active_defrag_enabled = 1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Restore active_defrag_enabled to the original value (and fix the build).

Line 1956 currently triggers a -Werror unused-variable failure, and Line 1962 always enables defrag even if it was disabled before the flush. Use the saved value instead.

🛠️ Proposed fix
     /* Temporarily disable active defragmentation during database flush.
      * This prevents defrag from being triggered in replicationEmptyDbCallback()
      * which could modify the database while it's being emptied. */
     int orig_active_defrag = server.active_defrag_enabled;
     server.active_defrag_enabled = 0;

     emptyData(-1, empty_db_flags, replicationEmptyDbCallback);

     /* Restore the original active defragmentation setting. */
-    server.active_defrag_enabled = 1;
+    server.active_defrag_enabled = orig_active_defrag;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* Temporarily disable active defragmentation during database flush.
* This prevents defrag from being triggered in replicationEmptyDbCallback()
* which could modify the database while it's being emptied. */
int orig_active_defrag = server.active_defrag_enabled;
server.active_defrag_enabled = 0;
emptyData(-1, empty_db_flags, replicationEmptyDbCallback);
/* Restore the original active defragmentation setting. */
server.active_defrag_enabled = 1;
/* Temporarily disable active defragmentation during database flush.
* This prevents defrag from being triggered in replicationEmptyDbCallback()
* which could modify the database while it's being emptied. */
int orig_active_defrag = server.active_defrag_enabled;
server.active_defrag_enabled = 0;
emptyData(-1, empty_db_flags, replicationEmptyDbCallback);
/* Restore the original active defragmentation setting. */
server.active_defrag_enabled = orig_active_defrag;
🧰 Tools
🪛 GitHub Actions: CI

[error] 1956-1956: replication.c:1956:9: error: unused variable 'orig_active_defrag' [-Werror=unused-variable]

🪛 GitHub Actions: External Server Tests

[error] 1956-1956: Unused variable 'orig_active_defrag' in rdbLoadEmptyDbFunc; -Werror treats this as an error. Remove or use the variable.

🤖 Prompt for AI Agents
In `@src/replication.c` around lines 1953 - 1962, The saved variable
orig_active_defrag is meant to restore the previous state but is currently
unused and the code always sets server.active_defrag_enabled = 1; instead of
restoring the original value; replace the final assignment with
server.active_defrag_enabled = orig_active_defrag so the saved value is used
(symbols: orig_active_defrag, server.active_defrag_enabled) and leave the
surrounding block that disables defrag around emptyData(...,
replicationEmptyDbCallback) as-is.

Comment on lines +70 to +76
proc discard_replies_every {rd count frequency discard_num} {
if {$count % $frequency != 0} {
for {set k 0} {$k < $discard_num} {incr k} {
$rd read ; # Discard replies
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Inverted condition: != 0 should be == 0.

The condition $count % $frequency != 0 causes replies to be discarded on every iteration except multiples of frequency. Based on the original pattern (if {$count % 10000 == 0}), the intent is to discard replies only when count reaches a multiple of frequency.

With the current logic, on the first iteration (count=1), it attempts to read 10000 replies when only 1 command has been sent, which would block or error.

Proposed fix
 proc discard_replies_every {rd count frequency discard_num} {
-    if {$count % $frequency != 0} {
+    if {$count % $frequency == 0} {
         for {set k 0} {$k < $discard_num} {incr k} {
             $rd read ; # Discard replies
         }
     }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
proc discard_replies_every {rd count frequency discard_num} {
if {$count % $frequency != 0} {
for {set k 0} {$k < $discard_num} {incr k} {
$rd read ; # Discard replies
}
}
}
proc discard_replies_every {rd count frequency discard_num} {
if {$count % $frequency == 0} {
for {set k 0} {$k < $discard_num} {incr k} {
$rd read ; # Discard replies
}
}
}
🤖 Prompt for AI Agents
In `@tests/unit/memefficiency.tcl` around lines 70 - 76, The condition in proc
discard_replies_every is inverted: change the if test from checking "$count %
$frequency != 0" to "$count % $frequency == 0" so the discard loop runs only
when count is a multiple of frequency; keep the inner for-loop ($rd read) as-is
to discard discard_num replies when that multiple occurs, referencing the proc
name discard_replies_every and the variables count and frequency to locate the
change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants