-
Notifications
You must be signed in to change notification settings - Fork 8.3k
shell: telnet: Don't close the connection on ENOBUFS error #67596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shell: telnet: Don't close the connection on ENOBUFS error #67596
Conversation
If there's not enough networking buffers at certain moment,
they might become available later. So instead of closing connection
(and failing assertation) sleep and retry. This avoid the following
assertion failure when there's much of data to send:
net_pkt: Data buffer (1500) allocation failed.
net_tcp: conn: 0x20076024 packet allocation failed, len=1460
shell_telnet: Failed to send -105, shutting down
ASSERTION FAIL [err == 0] @ .../subsys/shell/shell_ops.c:416
os: r0/a1: 0x00000004 r1/a2: 0x000001a0 r2/a3: 0x00000004
os: r3/a4: 0x20044380 r12/ip: 0x00001958 r14/lr: 0x080c9027
os: xpsr: 0x41000000
os: Faulting instruction address (r15/pc): 0x0811ed26
os: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
os: Current thread: 0x20045100 (shell_telnet)
os: Halting system
Signed-off-by: Miika Karanki <miika.karanki@vaisala.com>
d9ff918 to
a78f44c
Compare
|
@rlubos and @jukkar : could you also take a look as you've been reviewing #66287 as well? I'm bit unsure is it ok as a principle to handle the buffer allocation problem this way as the root cause of the assertion failure this is addressing is running out of net buffers, but in our case this seems to solve the issue, nicely logging the buffer allocation error as well over the telnet if the shell logging is enabled. As the shell is often used as a debug/development tool, crashing due to the shell not being able to send data is bit inconvenient when assertions are enabled. |
rlubos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks ok (running out of buffers shouldn't be the reason to close the connection). However that assert at the shell level seems concerning to me, if I understand correctly it would hit as well if the peer simply closes the connection. Perhaps we should have some connection recovery mechanism at the telnet shell backend (just thinking aloud here, of course not asking to handle it in this PR).
Also at that point, probably the |
jukkar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It might make sense to have some limiter here, so if the system is out-of-buffers, we do not sleep forever. Just a suggestion for future PR.
You're right @rlubos ! Simply doing ... causes the assertion failure due to I'm also seeing another failure sometimes before the above happens: ... but I'm using Zephyr 3.4.0 here and there's lots changes since that in net_context.c so that might not be relevant anymore. Retesting with 3.5.0 requires a bit of work for me. |
|
I am not seeing errors with latest upstream in qemu_x86. Tried with this and with this command |
|
But with |
|
I created a bug report of the crash issue in #67637 |
If there's not enough networking buffers at certain moment, they might become available later. So instead of closing connection (and failing assertation) sleep and retry. This avoid the following assertion failure when there's much of data to send:
This is related to PR #66287 which implemented not closing the connection if write fails due to EAGAIN.