-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Description
Describe the bug
Using the modbus subsystem, a response returned after a request times out would increment the semaphore and stay until the next request is made which will immediately return when k_sem_take is called even before a response is returned. This will once again have the same problem when the actual response arrives. The result is that the modbus client will temporarily "soft lock".
To fix this, the wait semaphore just needs to be reset before transmitting.
To Reproduce
I don't have an easy way to reproduce this yet. When I have time, I'll create a test case in the modbus twister test to reproduce this behavior.
Expected behavior
Modbus client should recover for the next transmission, and ignore previously received information.
Impact
In our case, this temporarily blocks any and all communication with the target device for several if not tens of seconds. Theoretically, this particular lockup could happen indefinitely.
Logs and console output
This is an annotated log of what is happening, with a couple additional logs added to the modbus source code to make it clearer.
Start of Transmission:
[00:00:57.281,000] <dbg> modbus_serial: rtu_tx_adu: uart_buf
01 03 84 f8 00 01 2d 0b |......-.
[00:00:57.281,000] <dbg> modbus_serial: rtu_tx_adu: Start frame transmission
Reception of Response:
[00:00:57.294,000] <dbg> modbus_serial: rtu_tmr_handler: uart_rx_buf
01 03 02 00 04 b9 87 |.......
Processed in Work Queue:
[00:00:57.294,000] <err> modbus_serial: modbus_rtu_rx_adu
Start of Transmission:
[00:00:57.320,000] <dbg> modbus_serial: rtu_tx_adu: uart_buf
01 03 84 f9 00 01 7c cb |......|.
[00:00:57.320,000] <dbg> modbus_serial: rtu_tx_adu: Start frame transmission
Reception of Response:
[00:00:57.333,000] <dbg> modbus_serial: rtu_tmr_handler: uart_rx_buf
01 03 02 00 00 b8 44 |......D
No Processing in Work Queue
Start of Transmission:
[00:00:57.495,000] <dbg> modbus_serial: rtu_tx_adu: uart_buf
01 03 84 fa 00 01 8c cb |........
[00:00:57.495,000] <dbg> modbus_serial: rtu_tx_adu: Start frame transmission
Reception of Response:
[00:00:57.508,000] <dbg> modbus_serial: rtu_tmr_handler: uart_rx_buf
01 03 02 00 00 b8 44 |......D
No Processing in Work Queue
Repeat adnausium....
Start of Transmission:
[00:01:00.165,000] <dbg> modbus_serial: rtu_tx_adu: uart_buf
01 03 85 0f 00 01 9d 05 |........
[00:01:00.165,000] <dbg> modbus_serial: rtu_tx_adu: Start frame transmission
Reception of Response:
[00:01:00.179,000] <dbg> modbus_serial: rtu_tmr_handler: uart_rx_buf
01 03 02 1a 0c b3 21 |......!
Finally Processed in Work Queue. Delayed for almost 3 seconds. By this point, the last request has already timed out, but the semaphore is incremented anyway.
[00:01:00.326,000] <err> modbus_serial: modbus_rtu_rx_adu
Start of Transmission:
[00:01:00.340,000] <dbg> modbus_serial: rtu_tx_adu: uart_buf
02 03 85 0f 00 01 9d 36 |.......6
[00:01:00.340,000] <dbg> modbus_serial: rtu_tx_adu: Start frame transmission
Immediate "Response" since the semaphore is incremented, but with the response to the previous entry.
[00:01:00.340,000] <err> modbus_c: Failed to validate unit ID or function code
Reception of Response:
[00:01:00.353,000] <dbg> modbus_serial: rtu_tmr_handler: uart_rx_buf
02 03 02 1a 29 36 fa |....)6.
Processed in Work Queue, but already aborted earlier, so it causes the problem again:
[00:01:00.353,000] <err> modbus_serial: modbus_rtu_rx_adu
Repeat adnausium....
Environment:
- OS: Windows 11
- Toolchain ARM GNU Toolchain 13.2.rel1
- 3.5.0-rc3 bbf46b9