Enable UMM_POISON in debug mode only #1800

Closed
luc-github opened this issue Mar 20, 2016 · 25 comments

Comments

@luc-github
Contributor

Hi, I have a project that works very well on the 2.0.0 core: the web server is responsive in both AP and STA modes.
When I moved to 2.1.0, display became very slow in STA mode, and so slow in AP mode that nothing is displayed. My home page ('/') is around 10 KB. I build it in parts of 1.2 KB and send each part before building the next one, to avoid running out of memory. This works very well on 2.0.0.

I saw that the web server has been rewritten for 2.1.0. Is there any new parameter or setting I have missed that needs tuning to get the same speed as 2.0.0?

Thanks

@mkeyno

mkeyno commented Mar 21, 2016

Hi @luc-github, why don't you use the @Links2004 WebSocket library with a single connection, or even AJAX? I presume you tried that before reporting this, and that you are using plain TCP/IP only.

@luc-github
Contributor Author

Because I need a web server to display web pages; this is not the same use case as WebSocket.

@luc-github
Contributor Author

I should add that this is not related to #1661,
as I do a WiFi.disconnect() when I am in AP mode.
Display is also very slow in STA mode compared to 2.0.0, for the same code and the same device.

@luc-github
Contributor Author

I tried the git version today with the same result: very slow. It seems 2.1.0 sends data, but very slowly, on my project (https://github.com/luc-github/ESP8266) with my NodeMCU 1.0, while it is fine on 2.0.0.

Any idea what changes between 2.0.0 and 2.1.0 could cause this?
Has no one else experienced the same issue with 10 KB+ pages?

I am willing to troubleshoot more, but there is no crash, so I do not know which part of the code to check.
The issue happens on big pages but not on small ones. I guess it may be some new timeout, but I have not found it yet, so I am stuck on 2.0.0.

@mkeyno

mkeyno commented Mar 22, 2016

I wonder why every new release loses some feature or breaks a previous one. I hope @igrr and @Links2004 can finally release a last stable version before new chips like the ESP32 come to market.

@TheAustrian

If you want a static version of the library why not just use the old 2.0.0 that has what you want and worked for your needs?

@luc-github
Contributor Author

@mkeyno it is normal to have some misses and errors. This is a big project managed by people in their free time, so full testing is not always possible, and regression testing can take ages if you do not have enough people to do it. @igrr and the other great contributors do a tremendous job; they have to handle issues and new requests all the time. Try it and you will see.

One way to contribute before asking for new features is to help check that the current ones work: give as much information as possible to reproduce an issue, and always try to narrow the issue down in code to save the developers time. If I can fix something myself, I do, and I open a PR (unfortunately I have not done many...).

@igrr
Member

igrr commented Mar 22, 2016

@luc-github could you please share the piece of code which constructs and sends the page?
I looked through the change log and see a couple of changes which might cause this, but not sure yet.

@luc-github
Contributor Author

@igrr
Yes, it is here: https://github.com/luc-github/ESP8266/blob/master/esp8266/webinterface.cpp#L339-L534
I do a dry run to calculate the page size and build the header, then rebuild and send the parts.

Thanks for looking into it
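
The two-pass approach described here (a dry run to compute the total size for the header, then a rebuild that sends bounded parts) can be sketched in plain C++ as follows. This is a minimal host-side illustration, not the code from webinterface.cpp: `PageBuilder`, `measurePage`, and `sendPage` are hypothetical names, and the `send` callback stands in for whatever the web server uses to write to the client.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical types: a page builder calls `emit` once per fragment it produces.
using Emit = std::function<void(const std::string&)>;
using PageBuilder = std::function<void(const Emit&)>;

// Pass 1 (dry run): walk the page and only count bytes, storing nothing.
std::size_t measurePage(const PageBuilder& build) {
    std::size_t total = 0;
    build([&total](const std::string& fragment) { total += fragment.size(); });
    return total;
}

// Pass 2: rebuild the page, never buffering more than `maxPart` bytes.
void sendPage(const PageBuilder& build, std::size_t maxPart, const Emit& send) {
    assert(maxPart > 0);  // a zero-sized part would never make progress
    std::string buffer;
    buffer.reserve(maxPart);
    build([&](const std::string& fragment) {
        std::size_t offset = 0;
        while (offset < fragment.size()) {
            const std::size_t room = maxPart - buffer.size();
            const std::size_t n = std::min(room, fragment.size() - offset);
            buffer.append(fragment, offset, n);
            offset += n;
            if (buffer.size() == maxPart) {  // part is full: send it and reuse the buffer
                send(buffer);
                buffer.clear();
            }
        }
    });
    if (!buffer.empty()) send(buffer);  // flush the final partial part
}
```

With `maxPart` set to 1200, peak buffer use stays near 1.2 KB regardless of page size, at the cost of building the page twice.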

@igrr
Member

igrr commented Mar 22, 2016

Could that be SPIFFS related slowdown? We have updated to a new version of SPIFFS. Is there a way to rule out that possibility?

@luc-github
Contributor Author

OK, I will check by building a fake 10 KB page without using SPIFFS; it should be easy to do.
I will keep you posted. Thanks

@luc-github
Contributor Author

Hmm, that seems to be the issue: if I do not use SPIFFS, display is immediate.
It seems the new SPIFFS has bad performance 😢

@luc-github
Contributor Author

I will try reverting to the previous version of SPIFFS to double-check

@luc-github
Contributor Author

SPIFFS may not be the only root cause, as reverting to the version before the new SPIFFS still shows some performance issues. Starting from 2.0.0, I will find which commit first shows the problem and report back.

@igrr
Member

igrr commented Mar 22, 2016

We also have a new memory allocator which runs extra checks on every malloc/free. This might also have some impact.

@luc-github
Contributor Author

OK, noted, thanks a lot. I will check the commit that does this. I have been busy these last months and did not follow all the changes 😞

@luc-github
Contributor Author

This commit brings a big performance issue: https://github.com/esp8266/Arduino/tree/339140c756cea6acc9c0c023477f054c6f9990bb

Before it, performance is the same as 2.0.0.

@igrr
Member

igrr commented Mar 23, 2016

Could you please try with this config line commented out?

@luc-github
Contributor Author

@igrr thanks for the coaching
I commented the line out on https://github.com/esp8266/Arduino/tree/339140c756cea6acc9c0c023477f054c6f9990bb and got faster display, but also some random failures to display before the page finally displayed, and one warning for malloc.

So I updated to the complete 2.1.0 (with the same line commented out): no more failures, no warning, and speed is even faster than 2.0.0.

To give an idea of the performance, here is how far I can count while a page displays in STA mode:
3 with 2.0.0
16 with 2.1.0
2 with poison commented out
(I know it is not precise, but it gives an idea of the proportions 😄)
So I am no longer so sure that SPIFFS has such a big impact.
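
Counting by hand gives the proportions; a small timer would make the comparison repeatable. The sketch below is a host-side illustration using `std::chrono`; on the ESP8266 itself, the Arduino `millis()` function would be the usual clock. `elapsedMs` and `slowdownFactor` are hypothetical helper names, not part of the core.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Run `work` once and return the wall-clock time it took, in milliseconds.
template <typename Fn>
long long elapsedMs(Fn&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Ratio of a measurement to a baseline; values above 1 mean "slower than baseline".
double slowdownFactor(double baseline, double measured) {
    assert(baseline > 0.0);
    return measured / baseline;
}
```

Applied to the counts above, `slowdownFactor(3, 16)` is about 5.3 for stock 2.1.0, and `slowdownFactor(3, 2)` is about 0.67 with poisoning commented out.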

Another effect of commenting it out: my free memory went from 25648 bytes to 29072 bytes.

I read that UMM_POISON is there to detect heap corruption. Maybe it should be enabled only in DEBUG mode, if it consumes so many resources?

I will test in AP mode and report later.

Thanks again for this great project

@igrr
Member

igrr commented Mar 23, 2016

Yes, I think it would make sense to enable poisoning only in debug mode.
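
To illustrate why heap poisoning costs both time and memory, and how it can be limited to debug builds, here is a minimal sketch of the general technique. It is not the umm_malloc implementation: `guardedMalloc`, `guardsIntact`, `guardedFree`, `appMalloc`, `appFree`, and the `APP_DEBUG_HEAP` macro are all hypothetical names, and alignment handling is simplified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr std::size_t kHeader = sizeof(std::size_t);  // stores the user-visible size
constexpr std::size_t kGuard = sizeof(std::size_t);   // guard bytes on each side
constexpr unsigned char kPoison = 0xA5;               // arbitrary fill pattern

// Layout of one block: [size][front guard][user bytes][back guard]
void* guardedMalloc(std::size_t size) {
    auto* raw = static_cast<unsigned char*>(std::malloc(kHeader + 2 * kGuard + size));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, kHeader);
    unsigned char* user = raw + kHeader + kGuard;
    std::memset(user - kGuard, kPoison, kGuard);  // poison before the user area
    std::memset(user + size, kPoison, kGuard);    // poison after the user area
    return user;
}

// Returns false if any guard byte around `ptr` has been overwritten.
bool guardsIntact(const void* ptr) {
    const auto* user = static_cast<const unsigned char*>(ptr);
    std::size_t size = 0;
    std::memcpy(&size, user - kGuard - kHeader, kHeader);
    const unsigned char* front = user - kGuard;
    const unsigned char* back = user + size;
    for (std::size_t i = 0; i < kGuard; ++i) {
        if (front[i] != kPoison || back[i] != kPoison) return false;
    }
    return true;
}

// Checks the guards, releases the block, and reports whether the guards were intact.
bool guardedFree(void* ptr) {
    if (ptr == nullptr) return true;
    const bool ok = guardsIntact(ptr);
    std::free(static_cast<unsigned char*>(ptr) - kGuard - kHeader);
    return ok;
}

// Only debug builds (compiled with -DAPP_DEBUG_HEAP) pay for the checks.
#ifdef APP_DEBUG_HEAP
inline void* appMalloc(std::size_t n) { return guardedMalloc(n); }
inline void appFree(void* p) {
    const bool ok = guardedFree(p);
    assert(ok && "heap guard bytes were overwritten");
    (void)ok;
}
#else
inline void* appMalloc(std::size_t n) { return std::malloc(n); }
inline void appFree(void* p) { std::free(p); }
#endif
```

In this sketch every allocation carries extra guard bytes and every free scans them, which is the kind of per-call overhead that a release build can skip.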

@igrr changed the title from "Webserver 2.1.0 is very slow compare to 2.0.0" to "Enable UMM_POISON in debug mode only" on Mar 23, 2016
@igrr
Member

igrr commented Mar 23, 2016

Another option would be to introduce an additional menu item, but I think we have enough of them already.

@mkeyno

mkeyno commented Mar 23, 2016

Dear @igrr, should we do the same in our libraries? I mean, always comment out the mentioned line for every example?

@igrr
Member

igrr commented Mar 23, 2016

This line is not part of examples, it is part of the core. As a temporary solution you may comment that line out.

@luc-github
Contributor Author

It seems there is no improvement in AP mode; I still get
ERR_CONNECTION_TIMED_OUT
I will redo the regression test in AP mode, identify which commit brings this, and open another issue to keep this thread clean.

@luc-github
Contributor Author

In the end I won't open a new issue, unless someone thinks it is necessary. Here are the results of my tests on why AP mode was not responding:
1 - It happens starting with the SDK 1.5 commit, not before.
2 - It seems linked to the initialization sequence (TBC): on 2.0.0 I set up the AP and then set the static IP; on 2.1.0 it seems necessary to set the static IP and then set up the AP. I followed the same sequence as the captive portal sample and it works again now.
3 - There seem to be some random issues when using SSDP, mDNS and the captive portal all at once in AP mode. There are no issues on 2.0.0, and no issues on 2.1.0 if they are not all used at once.

So to sum up, everything works again if I modify my code, so it may not be an issue but a limitation.
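
The ordering from point 2 (static IP first, then AP) can be captured in a small helper. This is a sketch under assumptions: `startAccessPoint` is a hypothetical name, and it assumes the Wi-Fi object offers `softAPConfig(ip, gateway, subnet)` and `softAP(ssid, password)` that both return `bool`, as the `WiFi` object from `ESP8266WiFi.h` does in recent core versions. It is written as a template so the call order can also be checked on a host with a stand-in object.

```cpp
#include <cassert>

// Configure the static IP first, then bring up the access point.
// Returns false as soon as either step reports failure.
template <typename WiFiLike, typename Address>
bool startAccessPoint(WiFiLike& wifi, const Address& ip, const Address& gateway,
                      const Address& subnet, const char* ssid, const char* password) {
    assert(ssid != nullptr);
    if (!wifi.softAPConfig(ip, gateway, subnet)) return false;  // step 1: static IP
    return wifi.softAP(ssid, password);                         // step 2: start the AP
}
```

On the device this would be called as, for example, `startAccessPoint(WiFi, IPAddress(192, 168, 0, 1), IPAddress(192, 168, 0, 1), IPAddress(255, 255, 255, 0), "my-ssid", "my-password")`, where the addresses and credentials are placeholders.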
