@gabsuren (Collaborator) commented Oct 28, 2025

Description

This PR fixes critical memory leaks and crashes in the ESP WebSocket client that occur during reconnection scenarios (with CONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCK=y). Issues addressed:

  • Double-free crashes: Heap corruption during abort/reconnect scenarios
  • Data loss: First packet after reconnection not received
  • Error buffer accumulation: 2 KB memory leak on each disconnect

Changes Made:

  • Add state-based protection in esp_websocket_client_abort_connection() to prevent double-close
  • Reset frame parsing state variables in esp_websocket_client_task() before dispatching WEBSOCKET_EVENT_CONNECTED
  • Free errormsg_buffer in esp_websocket_client_abort_connection() when client disconnects
  • Destroy client->transport_list immediately in esp_websocket_client_stop()
  • Fix transport layer resource cleanup in ws_close() function
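
The state-based protection and the buffer cleanup listed above can be illustrated with a minimal sketch. It is not the code from this PR: every `demo_` name, the state enum, and the close counter are hypothetical stand-ins for the real client structure. The idea is only that an abort call checks the current state first, so a second call during the same disconnect becomes a no-op instead of closing and freeing again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real esp_websocket_client types. */
typedef enum {
    DEMO_STATE_CONNECTED,
    DEMO_STATE_CLOSING,
    DEMO_STATE_WAIT_TIMEOUT,
} demo_state_t;

typedef struct {
    demo_state_t state;
    char *errormsg_buffer;  /* lazily allocated error-message buffer */
    int transport_closes;   /* counts how often the transport was closed */
} demo_client_t;

/* Stand-in for closing the underlying transport. */
static void demo_transport_close(demo_client_t *c)
{
    c->transport_closes++;
}

/* Returns true if this call performed the abort, false if it was a no-op
 * because an abort is already in progress or has already completed. */
static bool demo_abort_connection(demo_client_t *c)
{
    assert(c != NULL);
    if (c->state == DEMO_STATE_CLOSING || c->state == DEMO_STATE_WAIT_TIMEOUT) {
        return false;  /* guard: never close or free twice */
    }
    c->state = DEMO_STATE_CLOSING;
    demo_transport_close(c);
    free(c->errormsg_buffer);   /* release the buffer on disconnect */
    c->errormsg_buffer = NULL;  /* a later free() of NULL is harmless */
    c->state = DEMO_STATE_WAIT_TIMEOUT;
    return true;
}
```

Calling `demo_abort_connection()` twice on a connected client closes the transport once and leaves `errormsg_buffer` as `NULL`. In the real client the state check would also need to run under the appropriate lock, which this single-threaded sketch leaves out.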

Related

#898

Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • [✓] All CI checks (GH Actions) pass.
  • [✓] Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • [✓] Code is well-commented, especially in complex areas.
  • [✓] Git history is clean — commits are squashed to the minimum necessary.

abort(CONFIG_ESP_WS_CLIENT_SEPARATE_TX_LOCK = y)
@gabsuren changed the title from "Fix/ws race on abort" to "fix(websocket): Fix websocket client race on abort and memory leak(IDFGH-16555)" on Oct 28, 2025
@gabsuren force-pushed the fix/ws_race_on_abort branch 3 times, most recently from 67bd7e3 to 46871bf, on October 28, 2025 13:09
#else
// When separate TX lock is not configured, we already hold client->lock
// which protects the transport, so we can send PONG directly
esp_transport_ws_send_raw(client->transport, WS_TRANSPORT_OPCODES_PONG | WS_TRANSPORT_OPCODES_FIN, data, client->payload_len,

Check warning (Code scanning / clang-tidy):

The value '138' provided to the cast expression is not in the valid range of values for 'ws_transport_opcodes' [clang-analyzer-optin.core.EnumCastOutOfRange]
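
The value 138 in this warning can be traced with a short sketch. It assumes the opcode values follow RFC 6455 (Pong opcode 0xA, FIN as the top bit of the first frame byte); the `demo_` enum below is a local illustration, not the actual `ws_transport_opcodes` definition. OR-ing the two flags gives 0x8A, which is 138 and matches no single named enumerator, which is what the checker reports when the combined value is cast back to the enum type.

```c
#include <assert.h>
#include <stdbool.h>

/* Local illustration only; values assumed from RFC 6455 framing. */
typedef enum {
    DEMO_OPCODE_PONG = 0x0a,
    DEMO_OPCODE_FIN  = 0x80,
} demo_ws_opcode_t;

/* Combined flag value passed to the send call in the snippet above. */
static int demo_pong_fin_value(void)
{
    return DEMO_OPCODE_PONG | DEMO_OPCODE_FIN;
}

/* True only if the value equals one of the named enumerators. */
static bool demo_is_named_enumerator(int value)
{
    return value == DEMO_OPCODE_PONG || value == DEMO_OPCODE_FIN;
}
```

Here `demo_pong_fin_value()` yields 138 and `demo_is_named_enumerator(138)` is false. The combination is a legitimate bit mask on the wire, so the warning concerns the enum-typed cast rather than the frame that gets sent.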
@gabsuren requested a review from david-cermak on October 29, 2025 09:09
@gabsuren force-pushed the fix/ws_race_on_abort branch from 46871bf to 5577e03 on October 29, 2025 10:54
@gabsuren force-pushed the fix/ws_race_on_abort branch from 5577e03 to 082b119 on October 29, 2025 11:58
1. Reset frame parsing state (payload_len, payload_offset, last_opcode) on new connection
2. Free errormsg_buffer in esp_websocket_client_disconnect()
3. Add sdkconfig.ci.tx_lock config
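
Item 1 in this commit message can be sketched as follows. The field names `payload_len`, `payload_offset`, and `last_opcode` come from the commit message; the struct, the `demo_` helpers, and the `DEMO_OPCODE_NONE` sentinel are hypothetical and only show why stale values from an interrupted frame would misalign parsing of the first packet on a new connection.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OPCODE_NONE (-1)  /* hypothetical "no frame in progress" sentinel */

/* Hypothetical holder for the three fields named in the commit message. */
typedef struct {
    int payload_len;     /* total payload length of the frame being read */
    int payload_offset;  /* bytes of that payload already consumed */
    int last_opcode;     /* opcode of the frame being read */
} demo_frame_state_t;

/* Record that a frame was only partially received before the link dropped. */
static void demo_on_partial_frame(demo_frame_state_t *s, int opcode,
                                  int total_len, int received)
{
    assert(s != NULL);
    s->last_opcode = opcode;
    s->payload_len = total_len;
    s->payload_offset = received;
}

/* Bytes the parser still believes belong to the current frame. */
static int demo_bytes_still_expected(const demo_frame_state_t *s)
{
    return s->payload_len - s->payload_offset;
}

/* Clear all per-frame state; intended to run once per new connection. */
static void demo_reset_frame_state(demo_frame_state_t *s)
{
    assert(s != NULL);
    s->payload_len = 0;
    s->payload_offset = 0;
    s->last_opcode = DEMO_OPCODE_NONE;
}
```

If a 100-byte frame is interrupted after 40 bytes, `demo_bytes_still_expected()` reports 60 until `demo_reset_frame_state()` runs. Without the reset, a parser built this way would treat the start of the first frame on the new connection as the tail of the old one.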
@gabsuren force-pushed the fix/ws_race_on_abort branch from 082b119 to bff5cbe on October 29, 2025 12:28