Description
I am currently developing firmware for the ESP that needs to serve the camera feed as an HTTP stream. For testing purposes, I reduced that further to serving a simple JPEG animation. I initially used edge-executor::LocalExecutor to run my custom async HTTP server, handling every connection in a separate future spawned on the same thread-local executor. One stream to one client is therefore one (possibly infinitely) running async task.
Unfortunately, I ran into some weird issues: when running multiple streams in parallel, any stream might freeze at random and never recover. Sometimes the whole ESP crashes with an unhandled exception. It does not seem to be the same exception every time, but here is one:
Guru Meditation Error: Core 0 panic'ed (Unhandled debug exception).
Debug exception reason: BREAK instr
Core 0 register dump:
PC : 0x420171d7 PS : 0x00060b36 A0 : 0x82010e96 A1 : 0x3fccf060
0x420171d7 - alloc::task::raw_waker::clone_waker
at ??:??
A2 : 0x3fcd1154 A3 : 0x3fcd1154 A4 : 0x00000000 A5 : 0x3fccf3a0
A6 : 0x3fc9ade8 A7 : 0x3fca12dc A8 : 0xfefefefe A9 : 0xffffffff
0x3fc9ade8 - xKernelLock
at ??:??
0x3fca12dc - pxReadyTasksLists
at ??:??
A10 : 0xfefefefe A11 : 0xfefefefe A12 : 0xfefefefe A13 : 0x00060b23
A14 : 0xfffffffe A15 : 0x0000cdcd SAR : 0x00000020 EXCCAUSE: 0x00000001
EXCVADDR: 0x00000000 LBEG : 0x40056f5c LEND : 0x40056f72 LCOUNT : 0xffffffff
Backtrace: 0x420171d4:0x3fccf060 0x42010e93:0x3fccf080 0x4201800a:0x3fccf9c0 0x4202bb7f:0x3fccf9f0 0x42030838:0x3fccfa10
0x420171d4 - alloc::task::raw_waker::clone_waker
at ??:??
0x42010e93 - std::sys_common::backtrace::__rust_begin_short_backtrace
at ??:??
0x4201800a - core::ops::function::FnOnce::call_once{{vtable.shim}}
at ??:??
0x4202bb7f - std::sys::pal::unix::thread::Thread::new::thread_start
at ??:??
0x42030838 - pthread_task_func
at ??:??
At some point I tried using a different executor. After replacing edge-executor with futures-executor::LocalPool, I could no longer reproduce these issues. The streams seem to keep running without problems. I then wrote my own version of LocalExecutor from scratch without using any third-party crates, and this also seems to work fine.
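For illustration, a std-only, single-threaded executor of this kind can be sketched roughly as follows. This is not the actual firmware code: the names LocalExecutor, TaskWaker, YieldNow, and demo are hypothetical, and the two "streams" are stand-ins that only log frame numbers instead of writing to a socket. The waker carries just a task ID and a shared ready queue, so it never touches the (non-Send) future directly.

```rust
use std::cell::RefCell;
use std::collections::{HashMap, VecDeque};
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type ReadyQueue = Arc<Mutex<VecDeque<usize>>>;

/// Waker that re-queues a task ID; it holds no reference to the future itself.
struct TaskWaker {
    id: usize,
    ready: ReadyQueue,
}

impl Wake for TaskWaker {
    fn wake(self: Arc<Self>) {
        self.wake_by_ref();
    }
    fn wake_by_ref(self: &Arc<Self>) {
        self.ready.lock().unwrap().push_back(self.id);
    }
}

/// Hypothetical single-threaded executor; spawned futures need not be `Send`.
struct LocalExecutor {
    tasks: HashMap<usize, Pin<Box<dyn Future<Output = ()>>>>,
    ready: ReadyQueue,
    next_id: usize,
}

impl LocalExecutor {
    fn new() -> Self {
        Self { tasks: HashMap::new(), ready: ReadyQueue::default(), next_id: 0 }
    }

    /// Stores the future and schedules it for a first poll.
    fn spawn(&mut self, fut: impl Future<Output = ()> + 'static) {
        let id = self.next_id;
        self.next_id += 1;
        self.tasks.insert(id, Box::pin(fut));
        self.ready.lock().unwrap().push_back(id);
    }

    /// Polls woken tasks until the ready queue is empty.
    /// Returns the number of tasks that are still pending.
    fn run_until_stalled(&mut self) -> usize {
        loop {
            // The lock guard is dropped at the end of this statement,
            // so a task may wake itself while it is being polled.
            let next = self.ready.lock().unwrap().pop_front();
            let Some(id) = next else { break };
            // A stale wake-up for an already finished task is ignored.
            let Some(task) = self.tasks.get_mut(&id) else { continue };
            let waker = Waker::from(Arc::new(TaskWaker { id, ready: self.ready.clone() }));
            let mut cx = Context::from_waker(&waker);
            let done = task.as_mut().poll(&mut cx).is_ready();
            if done {
                self.tasks.remove(&id);
            }
        }
        self.tasks.len()
    }
}

/// Future that returns `Pending` once (after waking itself), then completes.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Runs two stand-in "streams" that each emit three frames and yield in between.
fn demo() -> Vec<(u32, u32)> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut ex = LocalExecutor::new();
    for stream in 0..2u32 {
        let log = log.clone();
        ex.spawn(async move {
            for frame in 0..3u32 {
                log.borrow_mut().push((stream, frame));
                YieldNow(false).await;
            }
        });
    }
    ex.run_until_stalled();
    let frames = log.borrow().clone();
    frames
}
```

Calling demo() returns the (stream, frame) pairs in the order they were emitted. Because each task re-queues itself at the back of the ready queue when it yields, the two streams alternate frame by frame. A real executor would additionally block (or sleep) while the ready queue is empty instead of returning.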
This makes me believe that edge-executor has a bug... somewhere. I do not really have a clue where it is. I just found this open issue in async-task about tasks randomly not getting rescheduled. That would fit my observations, at least.
Sorry for not providing sample code. I am developing the firmware as an employee and therefore cannot just share our product's firmware. If you are interested, I can, however, try to build a minimal example when I find the time.