Fix: resolve doubly linked list push issue#3085
Conversation
|
@ntvviktor looks like you broke the tests |
hey nkaradzhov, I am considering, even though fixing the code logic of the DoublyLinkedList will properly shut down the bad host request for redis connection, but it will break the sentinel client tests. I believe it has been somehow coded to pass with the error of issue: TypeError: Cannot set properties of undefined (setting 'next')
File "/app/node_modules/.pnpm/@redis+client@5.8.0/node_modules/@redis/client/dist/lib/client/linked-list.js", line 76, col 32, in DoublyLinkedList.remove
node.previous.next = node.next;I will investigate further to try to triage and fix the root cause. |
|
In the case of the issue above, seems like the async #create() {
const node = this._self.#clientsInUse.push(
this._self.#clientFactory()
.on('error', (err: Error) => this.emit('error', err))
);
try {
const client = node.value;
await client.connect();
} catch (err) {
this._self.#clientsInUse.remove(node);
throw err;
}
this._self.#returnClient(node);
}I fix using the if statement to check for the previous is not enough, and found out a root cause from RedisCommandQueue also using Doubly Linked List. I ran all tests locally and it currently works as expected. @nkaradzhov Please help me enable the pipeline to see whether my fix break anything. Thanks! |
|
Hi @nkaradzhov, please help me review the code, thanks in advance |
|
@ntvviktor the PR looks to be damaged somehow, it looks like ther are some unwanted changes after a merge? |
c03c34f to
234f1b6
Compare
Hi @nkaradzhov thanks for feedback, I have made changes, my bad on the rebase master branch into this. |
|
@ntvviktor sorry for being paranoid here, but could you please provide a test case for the DoublyLinkedList somewhere here |
Thanks @nkaradzhov, I have added the according tests. |
|
Looks like this causes #3127 |
|
@ntvviktor, sorry, what was the reason for the addition of |
|
Hey @nkaradzhov, it was because during this test case, after a master is shutdown and the test trying to reconnect, the sentinelSentinels or some other utils could use the first item [0] in the list, but it already shutdown, hence the when the frame trying to get |
…ent fails to reconnect, throwing `"None of the sentinels are available"`. This is a regression introduced in v5.9.0 (commit `73413e086c` — "fix: resolve doubly linked list push issue (redis#3085)"). **Root Cause:** `#handleSentinelFailure()` (added in v5.9.0) permanently removes the sentinel node from `#sentinelRootNodes` via `splice()` when its connection drops. When `#reset()` then calls `observe()`, the sentinel list is empty (or missing that node), so reconnection is impossible. **Secondary issue:** The sentinel list update in `transform()` (line 1350) compares only array lengths (`analyzed.sentinelList.length != this.#sentinelRootNodes.length`), so even if reconnection succeeds via another sentinel, a removed node may never be restored if the list sizes happen to match. - [ ] **Task 1:** Fix `#handleSentinelFailure` — stop permanently removing sentinel nodes - **File:** `packages/client/lib/sentinel/index.ts` (lines 934–942) - Remove the `splice()` call. A transient TCP disconnect (which happens during sentinel restart) should not permanently delete the node from the root list. Just call `#reset()` to trigger reconnection, matching v5.8.3 behavior. - Before (broken): ```typescript #handleSentinelFailure(node: RedisNode) { const found = this.#sentinelRootNodes.findIndex( (rootNode) => rootNode.host === node.host && rootNode.port === node.port ); if (found !== -1) { this.#sentinelRootNodes.splice(found, 1); } this.#reset(); } ``` - After (fixed): ```typescript #handleSentinelFailure(node: RedisNode) { this.#reset(); } ``` - [ ] **Task 2:** Fix sentinel list update condition in `transform()` - **File:** `packages/client/lib/sentinel/index.ts` (line 1350) - Replace the length-only comparison with a proper content comparison so that nodes removed or added are always detected. - Before: ```typescript if (analyzed.sentinelList.length != this.#sentinelRootNodes.length) { ``` - After — compare actual content: ```typescript if (!areSentinelListsEqual(analyzed.sentinelList, this.#sentinelRootNodes)) { ``` - Add a helper (either inline or as a small private method) that compares host+port of each node, not just array length. - [ ] **Task 3:** Run existing tests to confirm no regressions - Run `npm test` in the `packages/client` directory - Look for and run any sentinel-specific test files - [ ] **Task 4:** Verify the fix scenario - Confirm `#sentinelRootNodes` retains the original sentinel entry after disconnect - Confirm `observe()` retries that sentinel and succeeds once it's back online - Confirm no `"None of the sentinels are available"` error after sentinel restart
Description
Now the client pool properly shut down after getting a bad connection.
Checklist
npm testpass with this change (including linting)?