debug options change code behaviour on certain boards #6177
Comments
To summarize:
Some questions:
Comments:
The location is defined by flash size ( |
Q1: I have changed a lot of things since I used 2.4.2, and I really don't want to have to revert to it if I can avoid it. I am unable to comment re 2.4.2. As per the image included in the OP: your summary is broadly correct. (It's still an issue WITH debug mode, just not terminal!)
Still trying to understand PUYA, but if it helps: the flash chips on the Basics are XT BN25F08, the same as on the SVs and on my S20s (which I forgot to mention), which also work just fine.
So it is not a PUYA flash chip and this workaround is not needed (platform.local.txt is an Arduino IDE file). Are you able to run the WiFi examples on the board that is faulty with your sketch? Can you try with DOUT instead of QIO? (Sorry, I hadn't seen the image.)
Recompiled with DOUT: no change. Ran a totally different minimal WiFi sketch on the Basic: it runs perfectly. I'm going to revert to 2.4.2 and see what happens.
The 2.4.2 behaviour is different: We no longer get the STA_MODE_DISCONNECTED loop:
But the behaviour is equally bizarre: After a couple of reboots to confirm all was well, it stopped responding. My own debug messages show connected with an IP but pings timed out and no webserver visible. Then it went into a slower version of the 2.5.2 behaviour:
While actually getting an IP address in between each, unlike 2.5.2. But when compiled with debug + WiFi, we get
For the record, still on 2.4.2: with the exact same code as above, debug + WiFi options removed, and recompiled onto an ESP-01S, the webUI response is practically instantaneous, i.e. the code works 100% as expected. What is it about the Basic that causes such WiFi problems and is also different between 2.4.2 and 2.5.2?
Further info: I compiled it for a generic esp8285 with all the rest of the settings the same and it works fine. If I compile it as a generic esp8266 it runs like a dog, fails, locks up, wdt resets, etc. I'm now going through the compiler options with a fine-tooth comb to see what might be different.
esp8285 is supposed to be an esp8266 with DOUT forced, only 1MB (no choice), default fw (2.2.1, no choice), and flash frequency forced to 40 MHz. What is your esp01s flash size as shown by esptool.py?
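For readers checking their own boards: the size that esptool reports is derived from the capacity byte of the chip's JEDEC flash ID, which by common convention is the base-2 logarithm of the size in bytes. Below is a minimal Python sketch of that decoding. The function name is hypothetical, the example byte is illustrative rather than read from any board in this thread, and some vendors deviate from the convention, so treat the result as a hint only.

```python
def flash_size_from_capacity(capacity_byte: int) -> int:
    """Return the flash size in bytes for a JEDEC capacity byte.

    Assumption: the capacity byte is log2(size in bytes), as is
    conventional for SPI NOR flash (e.g. 0x14 -> 1 MiB, 0x16 -> 4 MiB).
    """
    return 1 << capacity_byte


# Illustrative capacity byte, not taken from a real board in this thread.
size = flash_size_from_capacity(0x14)
print(f"{size // (1024 * 1024)} MiB")  # prints "1 MiB"
```

If the decoded size disagrees with the flash size selected in the IDE, that mismatch is worth ruling out before looking further.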
I can confirm similar behaviour.
However, the sketch behaves normally (nicely catching the website exception without crashing the microcontroller) if I enable a debug option, even if it is one that I don't use (e.g. OTA). Compiling for ESP8285 also solves my problem. Hardware: ESP12F. Settings when problem occurs:
It sounds a bit like something has been optimized in a way that may put a higher strain on the flash. About the crash on failing GET requests.
I tried the timeout and the delay already, but they don't help; the exception is just delayed. How can I change the flash frequency?
I guess that's probably a bug on its own. About the flash frequency.
I will try to find another flash frequency after my holidays.
@mindstormsking can you reproduce the bug with the git master version of the core?
Debug mode reduces the free heap somewhat, due to the memory manager adding poison blocks around each internal memory segment. So it's very possible for a sketch to run out of memory and crash if allocations aren't checked.
@earlephilhower I understand, but the thing is that the debug mode actually solves the problem. Compiling for an ESP8285 also solves the problem... @d-a-v I'll see what I can do, after my holidays.
@TD-er: The problem occurs with a flash frequency of 40MHz as well as 80MHz. @d-a-v: When I enable OOM, the problem does not occur anymore. The problem does occur when I use the Git master branch, so that doesn't solve the problem. I compared the verbose output of compiling for an 8266 and an 8285 and noticed that different build options are used, although they are not changed in the Arduino GUI. Could the problem lie there?
Versus:
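One way to compare two verbose build logs like the ones described above is to diff the compiler flags mechanically rather than by eye. This is a minimal Python sketch of that idea. The helper name `diff_flags` is hypothetical, and the two command lines are made-up placeholders, not the actual output from this thread; paste the real command lines from the Arduino IDE's verbose compile output in their place.

```python
import shlex


def diff_flags(cmd_a: str, cmd_b: str):
    """Return (only_in_a, only_in_b): sorted tokens unique to each command line."""
    tokens_a = set(shlex.split(cmd_a))
    tokens_b = set(shlex.split(cmd_b))
    return sorted(tokens_a - tokens_b), sorted(tokens_b - tokens_a)


# Placeholder command lines, NOT copied from a real build log.
generic_8266 = "xtensa-lx106-elf-g++ -Os -DFLASHMODE_QIO -DF_CPU=80000000L sketch.cpp"
generic_8285 = "xtensa-lx106-elf-g++ -Os -DFLASHMODE_DOUT -DF_CPU=80000000L sketch.cpp"

only_8266, only_8285 = diff_flags(generic_8266, generic_8285)
print("only in 8266 build:", only_8266)
print("only in 8285 build:", only_8285)
```

The comparison is set-based, so it ignores flag order and repeated flags, and it treats a flag and a separate argument (such as `-o file`) as independent tokens. That is usually enough to spot a differing `-D` define or linker script.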
Hello, I also have the same problem on an ESP8285. If I compile for the chip with 2.5.0, it works without problems. If I compile with a later version, it doesn't work. I have returned to version 2.5.0.
I was using 2.6.2 when I got this error as well with an ESP8622E. Switching back to 2.5.0 also fixed the restarting issue with this exception for me.
Still hoping that someone has a brilliant idea to fix this issue. I am having the same problem as mentioned above, i.e. I get crashes with a stacktrace showing
that happen on rare occasions and seem to happen mostly when the WiFi connection is disturbed and/or the server to which I am trying to send data (using HTTPClient) does not respond in time and/or correctly.
@philbowles is this problem still present in 2.6.3?
Sorry for the delay. It came down eventually (I think) to using an incompatible flash mode in the build options: either some of my boards are "fake"-ish and use e.g. DOUT when the genuine article and the standard definitions state DIO or QIO, or that is irrelevant and they simply had dodgy flash chips. During that phase of my testing, I just set everything to DOUT and accepted the performance "hit" (not that it was noticeable). Oddly, it seemed to happen more often using lwIP v2 Lower Memory. Either way, it appears to be a "dodgy flash" issue: it was working but "randomly" (these things rarely are :) ) running really, really slowly, then on the next boot it worked fine... for a while, etc. I haven't seen anything like it for a long while (I've been on 2.6.3 since it was released), but I now always use lwIP v2 Higher Bandwidth (no features)... so who knows? If it reappears and I can pin it down, I'll come back.
Thanks for the answer!
Basic Infos
Platform
Hardware: Various (see text)
Core Version: SDK:2.2.1(cfd48f3)/Core:2.5.2=20502000/lwIP:STABLE-2_1_2_RELEASE/glue:1.1-7-g82abda3/BearSSL:a143020
Development Env: Arduino IDE 1.8.9
Operating System: Windoze10
Settings in IDE
Various (see text) but "working" example is:
Problem Description
I have come across some very "odd" behaviour which may be related to / shed light on e.g. #5784 and/or #5736
I have a body of code designed to run on a variety of boards. It is 4500+ lines and therefore impractical to include. However, I don't think it is all that relevant, since it runs fine on:
The code relies on a previous successful connect (it goes into AP mode if there is none) and monitors WiFi events to "bring up" the webserver and MQTT client on "got IP". It has various #defines to tailor various H/W differences, but the core WiFi / webserver / MQTT functionality is (should be!) the same for all devices.
Since upgrading to 2.5.0 (and through 2.5.2) I have had serious problems with the exact same code on SONOFF Basic(1M/128K).
If I compile it with the options as shown above, it runs, but incredibly slowly. The home webpage, which takes about 1-2s max on all other devices, can take 30-40 seconds to load, causing other dependent and/or time-critical functions to fail. One time out of three or four it will manage to load the webUI to the point where it is functional, but slower than drying paint...
If I compile it WITHOUT the debug options a few seconds later (i.e. everything including the ambient temperature is identical) it simply refuses to connect:
(In my debug printf, T=millis() FH=ESP.getFreeHeap())
If I copy this compiled binary and manually upload it to the SONOFF SV, it runs flawlessly. If I compile it for the SV, an ESP-01S or any other device, it runs flawlessly.
It is only the Basic(old-style V1) that exhibits this problem. I have 3 and they all behave exactly the same way, having previously been installed with earlier versions of my software and working fine for well over a year.
Interestingly, it is not always the WIFI_EVENT_STAMODE_DISCONNECTED loop that prevents connection. On other occasions it appears not to receive the ...GOT_IP event. On others it kept getting "no LaPique found reconnect after 1s" when the SSID was indeed present and functional
(the only half-decent explanation I could find for that was https://bbs.espressif.com/viewtopic.php?t=10173).
Other times it was:
bcn_timout,ap_probe_send_start
ap_probe_send over, rest wifi status to disassoc
I found #5083 and set WiFi.mode(WIFI_NONE_SLEEP) but it made no difference.
Thus, although the exact behaviour is non-deterministic, the one thing it WILL NOT do is make a valid connection. This happens with both SDK 2.2.1 and SDK 2.2.2.
It also happens with all variations of lwIP v2 (IPv4).
I have tried all permutations of flash erasing. I even found a blank_1M.bin and manually uploaded that to clear the flash... nothing makes any difference: the code simply will not run on SONOFF_BASIC without the debug messages in, and even then it's unusable.
My experience tells me this has to be some kind of timing problem: what else can that debug code be changing? But why is it only apparent on the "basic"? Are its RF calibration bytes in a different place in RAM?
Needless to say, I am at my wits' end, having spent weeks trying to track this down...