Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HNAP calls on dhcp-server in AP mode causes an exception #7601

Open
ghost opened this issue Sep 16, 2020 · 35 comments
Open

HNAP calls on dhcp-server in AP mode causes an exception #7601

ghost opened this issue Sep 16, 2020 · 35 comments
Assignees
Labels
Milestone

Comments

@ghost
Copy link

ghost commented Sep 16, 2020

Platform

Hardware: ESP8266 (original AI-Thinker ESP12-F)
Core Version: 2.7.4
Development Env: Arduino IDE 1.8.13
Operating System: Mac OS (Catalina)

Settings in IDE

  • Module: Generic ESP8266 Module
  • Flash Mode: qio
  • Flash Size: 4MB/1MB
  • lwip Variant: v2 Lower Memory
  • Reset Method: nodemcu
  • Flash Frequency: 40Mhz
  • CPU Frequency: 160MHz
  • Upload Using: SERIAL
  • Upload Speed: 460800

Problem Description

Hi everybody.

Exception (3) occurs on the standard example every time when Safari makes sudden HNAP calls(Home Network Administration Protocol) on the server. These HTTP calls made by Safari are a little bit hectic and sometimes we need to wait several minutes before receiving them. Chrome doesn't seem to send these kind a HTTP calls.

Thanks in advance for the help.

Sketch

See standard example : ESP8266WiFi/examples/WiFiAccessPoint/WiFiAccessPoint.ino

Debug Messages

17:26:33.740 -> SDK:2.2.1(cfd48f3)/Core:2.7.3-3-g2843a5ac=20703003/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-30-g92add50/BearSSL:5c771be
17:26:33.740 ->
17:26:33.740 -> Configuring access point...bcn 0
17:26:33.852 -> del if1
17:26:33.852 -> usl
17:26:33.852 -> add if1
17:26:33.852 -> dhcp server start:(ip:192.168.4.1,mask:255.255.255.0,gw:192.168.4.1)
17:26:33.852 -> bcn 100
17:26:33.852 -> AP IP address: 192.168.4.1
17:26:33.852 -> HTTP server started
17:27:59.594 -> add 1
17:27:59.628 -> aid 1
17:27:59.628 -> station: 3c:22:fb:3f:d0:0f join, AID = 1
17:28:41.619 -> New client
17:28:41.619 -> request: GET /this/is/a/test/ HTTP/1.1
17:28:41.619 -> method: GET url: /this/is/a/test/ search:
17:28:41.619 -> headerName: Host
17:28:41.619 -> headerValue: 192.168.4.1
17:28:41.619 -> headerName: Upgrade-Insecure-Requests
17:28:41.619 -> headerValue: 1
17:28:41.619 -> headerName: Accept
17:28:41.619 -> headerValue: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
17:28:41.619 -> headerName: User-Agent
17:28:41.619 -> headerValue: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.2 Safari/605.1.15
17:28:41.654 -> headerName: Accept-Language
17:28:41.654 -> headerValue: fr-fr
17:28:41.654 -> headerName: Accept-Encoding
17:28:41.654 -> headerValue: gzip, deflate
17:28:41.654 -> headerName: Connection
17:28:41.654 -> headerValue: keep-alive
17:28:41.654 -> args:
17:28:41.654 -> args count: 0
17:28:41.654 -> args:
17:28:41.654 -> args count: 0
17:28:41.654 -> Request: /this/is/a/test/
17:28:41.654 -> Arguments:
17:28:41.654 -> final list of key/value pairs:
17:28:41.654 -> request handler not found
17:28:41.724 -> New client
17:28:41.724 -> request: GET /favicon.ico HTTP/1.1
17:28:41.724 -> method: GET url: /favicon.ico search:
17:28:41.757 -> headerName: Host
17:28:41.757 -> headerValue: 192.168.4.1
17:28:41.757 -> headerName: Connection
17:28:41.757 -> headerValue: keep-alive
17:28:41.757 -> headerName: Accept
17:28:41.757 -> headerValue: /
17:28:41.757 -> headerName: User-Agent
17:28:41.757 -> headerValue: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.2 Safari/605.1.15
17:28:41.757 -> headerName: Accept-Language
17:28:41.757 -> headerValue: fr-fr
17:28:41.757 -> headerName: Referer
17:28:41.795 -> headerValue: http://192.168.4.1/this/is/a/test/
17:28:41.795 -> headerName: Accept-Encoding
17:28:41.795 -> headerValue: gzip, deflate
17:28:41.795 -> args:
17:28:41.795 -> args count: 0
17:28:41.795 -> args:
17:28:41.795 -> args count: 0
17:28:41.795 -> Request: /favicon.ico
17:28:41.795 -> Arguments:
17:28:41.795 -> final list of key/value pairs:
17:28:41.795 -> request handler not found
17:28:50.713 -> New client
17:28:50.713 -> request: GET / HTTP/1.1
17:28:50.713 -> method: GET url: / search:
17:28:50.713 -> headerName: Host
17:28:50.713 -> headerValue: 192.168.4.1
17:28:50.713 -> headerName: Accept
17:28:50.713 -> headerValue: /
17:28:50.713 -> headerName: Accept-Encoding
17:28:50.713 -> headerValue: deflate, gzip
17:28:50.713 -> args:
17:28:50.713 -> args count: 0
17:28:50.713 -> args:
17:28:50.713 -> args count: 0
17:28:50.713 -> Request: /
17:28:50.713 -> Arguments:
17:28:50.713 -> final list of key/value pairs:
17:28:50.713 -> New client
17:28:50.747 -> request: GET /HNAP1/ HTTP/1.1
17:28:50.747 -> method: GET url: /HNAP1/ search:
17:28:50.747 -> headerName: Host
17:28:50.747 -> headerValue: 192.168.4.1
17:28:50.747 -> headerName: Accept
17:28:50.747 -> headerValue: /
17:28:50.747 -> headerName: Accept-Encoding
17:28:50.747 -> headerValue: deflate, gzip
17:28:50.747 -> headerName: Connection
17:28:50.747 -> headerValue: Close
17:28:50.747 -> args:
17:28:50.747 -> args count: 0
17:28:50.747 -> args:
17:28:50.747 -> args count: 0
17:28:50.747 -> Request: /HNAP1/
17:28:50.747 -> Arguments:
17:28:50.747 -> final list of key/value pairs:
17:28:50.747 -> request handler not found
17:29:08.745 -> New client
17:29:08.745 -> request: GET / HTTP/1.1
17:29:08.745 -> method: GET url: / search:
17:29:08.745 -> headerName: Host
17:29:08.745 -> headerValue: 192.168.4.1
17:29:08.745 -> headerName: Accept
17:29:08.780 -> headerValue: /
17:29:08.780 -> headerName: Accept-Encoding
17:29:08.780 -> headerValue: deflate, gzip
17:29:08.780 -> args:
17:29:08.780 -> args count: 0
17:29:08.780 -> args:
17:29:08.780 -> args count: 0
17:29:08.780 -> Request: /
17:29:08.780 -> Arguments:
17:29:08.780 -> final list of key/value pairs:
17:29:08.780 -> New client
17:29:08.780 -> request: GET /HNAP1/ HTTP/1.1
17:29:08.780 -> method: GET url: /HNAP1/ search:
17:29:08.780 -> headerName: Host
17:29:08.780 -> headerValue: 192.168.4.1
17:29:08.780 -> headerName: Accept
17:29:08.780 -> headerValue: /
17:29:08.780 -> headerName: Accept-Encoding
17:29:08.813 -> headerValue: deflate, gzip
17:29:08.813 -> headerName: Connection
17:29:08.813 -> headerValue: Close
17:29:08.813 -> args:
17:29:08.813 -> args count: 0
17:29:08.813 -> args:
17:29:08.813 -> args count: 0
17:29:08.813 -> Request: /HNAP1/
17:29:08.813 -> Arguments:
17:29:08.813 -> final list of key/value pairs:
17:29:08.813 -> request handler not found
17:29:10.932 -> Fatal exception 3(LoadStoreErrorCause):
17:29:10.932 -> epc1=0x4000deea, epc2=0x00000000, epc3=0x00000000, excvaddr=0x4023ef68, depc=0x00000000
17:29:10.932 ->
17:29:10.932 -> --------------- CUT HERE FOR EXCEPTION DECODER ---------------
17:29:12.197 -> --------------- CUT HERE FOR EXCEPTION DECODER ---------------
17:29:12.197 ->
17:29:12.197 -> ets Jan 8 2013,rst cause:2, boot mode:(3,0)
17:29:12.197 ->
17:29:12.231 -> load 0x4010f000, len 3584, room 16
17:29:12.231 -> tail 0
17:29:12.231 -> chksum 0xb0
17:29:12.231 -> csum 0xb0
17:29:12.231 -> v2843a5ac
17:29:12.231 -> ~ld
17:29:13.269 ->

Exception Decoder

Exception 3: LoadStoreError: Processor internal physical address or data error during load or store
PC: 0x4000deea
EXCVADDR: 0x4023ef68

Decoding stack results
0x4010028c: calloc(size_t, size_t) at /Users/admin/Library/Arduino15/packages/esp8266/hardware/esp8266/2.7.4/cores/esp8266/heap.cpp line 242
0x402300f0: handle_dhcp at glue-lwip/esp-dhcpserver.c line 626
0x402300b0: handle_dhcp at glue-lwip/esp-dhcpserver.c line 679
0x401008cd: check_poison_neighbors(uint16_t) at /Users/admin/Library/Arduino15/packages/esp8266/hardware/esp8266/2.7.4/cores/esp8266/umm_malloc/umm_local.c line 71
0x40100989: umm_malloc_core(size_t) at /Users/admin/Library/Arduino15/packages/esp8266/hardware/esp8266/2.7.4/cores/esp8266/umm_malloc/umm_malloc.cpp line 458
0x40213c40: udp_input at core/udp.c line 404
0x4022f7cc: pbuf_alloc at glue-esp/lwip-esp.c line 669
0x40218a70: ip4_input at core/ipv4/ip4.c line 1461
...

@mhightower83
Copy link
Contributor

Try your test running with Flash set to DIO instead of QIO.

I have several of these boards and they are only stable in DIO mode, in contrast, the Ai Thinker's specification for the ESP-12-F has a photo showing QIO. Check the back of your module's PCB Antenna area for silkscreen markings, mine is marked with ESP-12-F DIO L4. Ai Thinker's photo shows ESP-12-F QIO L4. As far as I can see, Ai Thinker specification's text does not indicate which mode to use.

@ghost
Copy link
Author

ghost commented Sep 16, 2020

Thanks for your reply.

Unfortunately, I did the test and all configurations failed : DOUT, DIO, QOUT and QIO.
My ESP-12F is marked "ESP-12-F L4". No QIO or DIO.

I'll continue to investigate with WireShark. It also fails in mode WIFI_AP_STA.

@d-a-v
Copy link
Collaborator

d-a-v commented Sep 16, 2020

Then HTTP captured protocol indeed will hopefully help.

We also have now an internal packet capture facility, check the provided Netdump library (git version, post 2.7.4).

@Jason2866
Copy link
Contributor

Jason2866 commented Sep 17, 2020

From which manufactor is the flash chip?
I have modules with PUYA chips. Some work, some not.
The issues i have with the bad ones are weird.
It took me a long time until i recognized everything is okay it is "just" the flash chip
Edit:
The upload speed you use is way too high for many modules. Better use 115200

@ghost
Copy link
Author

ghost commented Sep 17, 2020

This is ESP-12-F bought directly to AI-Thinker.
I'll try to upload slower and see what happens.

I'm currently adding the packet capture to the code to feed the discussion.

@ghost
Copy link
Author

ghost commented Sep 17, 2020

Uploading at lower speed did not affect the result.
See logs attached.

Logs_HNAP.txt

Thanks.

@Jason2866
Copy link
Contributor

Still not answered which flash chip is used in the module. From whom build or bought is not a info from interest.

@ghost
Copy link
Author

ghost commented Sep 18, 2020

This is a XMC - QH32BHIG - 4MB flash chip.

@devyte
Copy link
Collaborator

devyte commented Sep 18, 2020

cc @ChocolateFrogsNuts @Tech-TX

@mhightower83
Copy link
Contributor

Since you are buying bare AI-Thinker modules (not a DEV board), what do you use for a power source?

On new designs, I like to rule out possible power supply issues, before going crazy with a problem. Power issues can create a lot of different and unexpected symptoms that are difficult to associate with power. Part of the issue with the ESP8266 is it can go from sipping 60ma to gulping 300ma in an instant. That is a 400% load increase. A switch-mode power supply usually doesn't have that kind of Transient Load response. All of the IoT designs that I have seen, pass the switch-mode power supply output through an LDO regulator to feed the ESP8266.

@ghost
Copy link
Author

ghost commented Sep 19, 2020

I use a classic LDO regulator to power the module. The source can draw easily up to 1000 mA. I've already checked if a voltage drop can cause this issue but did not find one.

Monday, I'll bypass the LDO and use my laboratory power supply to see if it helps. In the meantime, I'll continue my investigations and keep you informed.

Thanks for the help.

@Jason2866
Copy link
Contributor

My bet is the XMC flash chip as trouble maker.

@ghost
Copy link
Author

ghost commented Sep 21, 2020

Still the same issue with power bypassed by a laboratory power supply. Tested on 10 modules (same batch coming from AI-Thinker). No glitches, all good on power supply.

Now, I'll remove the flash chip on some modules and compare. Any suggestion for a good flash ?
Thanks.

@Jason2866
Copy link
Contributor

Never had issues with winbond flash chips

@ghost
Copy link
Author

ghost commented Sep 21, 2020

OK, let's go with that. I'll keep you posted after the change.

In the meantime, I noticed that the problem happens every time on a DHCP incoming call after the HNAP calls.
It happens also on all the 10 modules. Whatever the flash mode and frequency.

@Jason2866
Copy link
Contributor

We will know more after you did swap the flash.

@devyte devyte added the waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. label Sep 22, 2020
@ghost
Copy link
Author

ghost commented Sep 26, 2020

I changed the flash chip but it we still have the same issue. after looking at a lot of crash logs, it always crashes at the same line in esp-dhcpserver.c (line 626).

A newer version of Safari was also released but it did not change this behavior.

Fortunately, no crashes with Chrome on MacOS because it does not implement HNAP.

I try now to make HNAP calls from another "more controlled" source than Safari to try some debugging.

@d-a-v
Copy link
Collaborator

d-a-v commented Sep 26, 2020

I understand that running standard example : ESP8266WiFi/examples/WiFiAccessPoint/WiFiAccessPoint.ino and accessing the esp using current macOS's Safari is enough to try and reproduce this issue.
Am I right ?

@ghost
Copy link
Author

ghost commented Sep 26, 2020

Correct.

Every time with this configuration. Sometimes it takes several minutes to get the HNAP calls, but they will arrive and crash the module.

@d-a-v
Copy link
Collaborator

d-a-v commented Sep 26, 2020

To restrict traffic to AP and have packet capture, I added in the example

#include <Netdump.h>
NetCapture::Netdump nd;
...
  // in setup():
  WiFi.mode(WIFI_AP);
  nd.printDump(Serial, NetCapture::Packet::PacketDetail::FULL, [](NetCapture::Packet p) { return true; });

Safari on iPad macOS 12.4.8 did not trigger this bug after several minutes.
I will try with an apple laptop.

@ghost
Copy link
Author

ghost commented Sep 26, 2020

I try and keep you informed.

No issue found on iOS (iPhone & iPad). Probably a different implementation of Safari.
I have a Mac Book Pro (2020) / Catalina (Version 10.156) / Safari (Version 14.0 - 15610.1.28.1.9, 15610).

@ghost
Copy link
Author

ghost commented Sep 26, 2020

Same crash...
See logs attached.

logs.txt

@d-a-v
Copy link
Collaborator

d-a-v commented Sep 27, 2020

I'm sorry to not be able to reproduce, using a macmini, updated to latest macOS and safari.

That's 580 seconds after esp boot:

request: GET / HTTP/1.1
method: GET url: / search:  keepAlive=: 1
headerName: Host
headerValue: 192.168.4.1
headerName: Upgrade-Insecure-Requests
headerValue: 1
headerName: Accept
headerValue: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
headerName: User-Agent
headerValue: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15
headerName: Accept-Language
headerValue: fr-fr
headerName: Accept-Encoding
headerValue: gzip, deflate
headerName: Connection
headerValue: keep-alive

Would there be another way to reproduce the HNAP request than by using safari ?

@d-a-v
Copy link
Collaborator

d-a-v commented Sep 27, 2020

dhcp server is brought to main repository in PR #6680.
Would you be able to test with this PR (that I just updated) ? You need first to use the git version of this core.
Or you can install the alpha release v0.0.2 with the arduino IDE, that already includes this PR.

@ghost
Copy link
Author

ghost commented Sep 28, 2020

Sometimes, it takes about 15 minutes to get the HNAP calls. I guess there is a cache somewhere.
The best solution I found to lower this time is to change the AP SSID name.

Unfortunately, I did not find tools that can emulates the HNAP calls.

I'll try this today and will give you an update.

@ghost
Copy link
Author

ghost commented Sep 28, 2020

Usually, when I change the SSID name on the module and cancel the WiFi configuration on my Mac, it occurs in less than a minute...

@ghost
Copy link
Author

ghost commented Sep 30, 2020

Same issue with the alpha release v0.0.2.

See logs attached.
logs.txt

I'll uninstall Avast Antivirus now because it looks like a security analysis of the access point.

@ghost
Copy link
Author

ghost commented Sep 30, 2020

Avast disabled. Sounds promising because no HNAP calls during these last 15 minutes...
I'll continue my tests to validate this.

@ghost
Copy link
Author

ghost commented Oct 2, 2020

Confirmed : HNAP calls are an Avast security analysis of the access point through Safari.

@ghost ghost closed this as completed Oct 15, 2020
@d-a-v
Copy link
Collaborator

d-a-v commented Oct 15, 2020

That should not be a reason to blow up the ESP !

Leaving this opened. Once dhcp server is in the main repository (that's #6680) and its code is revisited,
one may check this bug again.
In the meantime one / I should try to artificially extract and replay the request based on your above logs @Exocet22 .

@d-a-v d-a-v reopened this Oct 15, 2020
@d-a-v d-a-v self-assigned this Oct 15, 2020
@d-a-v d-a-v added component: network type: bug and removed waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. labels Oct 15, 2020
@ghost
Copy link
Author

ghost commented Oct 15, 2020

Sorry !

@d-a-v d-a-v added this to the 3.0.0 milestone Oct 27, 2020
@ghost
Copy link
Author

ghost commented Oct 28, 2020

I continued the tests : the problem does not occur when the module is in STA mode. It occurs only in AP mode.

@d-a-v
Copy link
Collaborator

d-a-v commented Oct 28, 2020

dhcp server is indeed enabled only in AP mode

@ghost
Copy link
Author

ghost commented Oct 28, 2020

Correct ! It's the reason why it does not crash !

@d-a-v d-a-v changed the title Exception (3) on standard example WiFiAccessPoint HNAP calls on dhcp-server in AP mode causes an exception Feb 28, 2021
@d-a-v d-a-v added the help wanted Help needed from the community label Feb 28, 2021
@d-a-v
Copy link
Collaborator

d-a-v commented Feb 28, 2021

Title is renamed, and milestone pushed.
Help is requested reproduce HNAP calls.

@d-a-v d-a-v modified the milestones: 3.0.0, 4.0.0 Feb 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants