This repository was archived by the owner on Feb 4, 2023. It is now read-only.

Captive Portal hanging depending on active core for AsyncTCP #100

@ZongyiYang

I've encountered an issue where the AP does not redirect me to the captive portal; it just hangs.

I've partly narrowed down the cause: it seems to depend on which core the AsyncTCP task is launched on by xTaskCreateUniversal.

To reproduce the issue, I created the following sample and uploaded it to an ESP32-CAM module.
My Espressif version is 2.0.4. I am using the latest AsyncTCP library with the recommended modified AsyncWebServer library, as indicated in the README for ESPAsync_WiFiManager.

#include "Arduino.h"
#include <ESPAsync_WiFiManager.h>

// default IP addresses
IPAddress APStaticIP  = IPAddress(192, 168, 100, 1);
IPAddress APStaticGW  = IPAddress(192, 168, 100, 1);
IPAddress APStaticSN  = IPAddress(255, 255, 255, 0);

IPAddress stationIP   = IPAddress(192, 168, 1, 232);
IPAddress gatewayIP   = IPAddress(192, 168, 1, 1);
IPAddress netMask     = IPAddress(255, 255, 255, 0);

IPAddress dns1IP      = gatewayIP;
IPAddress dns2IP      = IPAddress(8, 8, 8, 8);
    
void initAPIPConfigStruct(WiFi_AP_IPConfig &in_WM_AP_IPconfig)
{
  in_WM_AP_IPconfig._ap_static_ip   = APStaticIP;
  in_WM_AP_IPconfig._ap_static_gw   = APStaticGW;
  in_WM_AP_IPconfig._ap_static_sn   = APStaticSN;
}

void initSTAIPConfigStruct(WiFi_STA_IPConfig &in_WM_STA_IPconfig)
{
  in_WM_STA_IPconfig._sta_static_ip   = stationIP;
  in_WM_STA_IPconfig._sta_static_gw   = gatewayIP;
  in_WM_STA_IPconfig._sta_static_sn   = netMask;
  in_WM_STA_IPconfig._sta_static_dns1 = dns1IP;
  in_WM_STA_IPconfig._sta_static_dns2 = dns2IP;
}
    
void startApPortal()
{
  // default IP values
  WiFi_AP_IPConfig  WM_AP_IPconfig;
  WiFi_STA_IPConfig WM_STA_IPconfig;
  initAPIPConfigStruct(WM_AP_IPconfig);
  initSTAIPConfigStruct(WM_STA_IPconfig);

  // construct ESPAsync_wifiManager object
  DNSServer dnsServer;
  AsyncWebServer server(80);
  ESPAsync_WiFiManager ESPAsync_wifiManager(&server, &dnsServer, "AsyncESP32-FSWebServer");
  ESPAsync_wifiManager.setAPStaticIPConfig(WM_AP_IPconfig);
  ESPAsync_wifiManager.setMinimumSignalQuality(-1);
  ESPAsync_wifiManager.setConfigPortalChannel(0);
  ESPAsync_wifiManager.setSTAStaticIPConfig(WM_STA_IPconfig);
  // AP ssid and password
  String apSsid = "ESP_" + String((uint32_t)ESP.getEfuseMac(), HEX);
  const char* apPass = "12345678";
  
  Serial.println("Starting access point on SSID: " + apSsid);
  // Start AP 
  if (!ESPAsync_wifiManager.startConfigPortal(apSsid.c_str(), apPass))
  {
    Serial.println(F("Not connected to WiFi but continuing anyway."));
  }
  else
  {
    Serial.println(F("AP connection successful."));
    Serial.println("  Connected with ip: " + WiFi.localIP().toString());
  }
}

void setup() {
  Serial.begin(115200);
  while (!Serial) {}

  Serial.println("--start--");
  WiFi.mode(WIFI_STA);
  WiFi.begin("", "");
  vTaskDelay(4000 / portTICK_PERIOD_MS);
  WiFi.disconnect(); 
  vTaskDelay(4000 / portTICK_PERIOD_MS);

  // This mutex take is valid on its own, but it seems to trigger the captive portal bug.
  // The bug only appears if the AsyncTCP task runs on core 1,
  // i.e. with #define CONFIG_ASYNC_TCP_RUNNING_CORE 1 placed at line 34 of AsyncTCP.h.
  // The issue goes away if the running core is defined as 0.
  // Without modifications to AsyncTCP.h the bug is intermittent, since by default
  // AsyncTCP does not fix the running core (-1, any available core).
  SemaphoreHandle_t sem = xSemaphoreCreateMutex();
  if (sem != NULL)
    xSemaphoreTake(sem, portMAX_DELAY);
  
  startApPortal();

  if (sem != NULL)
    vSemaphoreDelete(sem);
}

void loop() {
}

In AsyncTCP.h, I added the following code at line 34:

// START ADDITIONAL CODE------
#define CONFIG_ASYNC_TCP_RUNNING_CORE 1
// END ADDITIONAL CODE------

#ifndef CONFIG_ASYNC_TCP_RUNNING_CORE
#define CONFIG_ASYNC_TCP_RUNNING_CORE -1 //any available core
#define CONFIG_ASYNC_TCP_USE_WDT 1 //if enabled, adds between 33us and 200us per event
#endif
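
Because the header assigns its default only inside an `#ifndef` guard, the same override can in principle be supplied from outside the library, for example as a compiler define, instead of editing AsyncTCP.h. The block below is a host-side illustration under that assumption, not library code: the first `#define` stands in for such an external define, and `asyncTcpCore()` and `wdtMacroDefined()` are hypothetical helper names. It also makes a side effect of the quoted guard visible: once the core macro is predefined, this block no longer defines `CONFIG_ASYNC_TCP_USE_WDT`.

```cpp
// Stand-in for an external define (e.g. a -D compiler flag); an assumption for illustration.
#define CONFIG_ASYNC_TCP_RUNNING_CORE 0

// Guard copied from the AsyncTCP.h excerpt above.
#ifndef CONFIG_ASYNC_TCP_RUNNING_CORE
#define CONFIG_ASYNC_TCP_RUNNING_CORE -1 //any available core
#define CONFIG_ASYNC_TCP_USE_WDT 1 //if enabled, adds between 33us and 200us per event
#endif

// Hypothetical helper: reports which core value survived preprocessing.
constexpr int asyncTcpCore() { return CONFIG_ASYNC_TCP_RUNNING_CORE; }

// Hypothetical helper: reports whether the guard defined the watchdog macro.
constexpr bool wdtMacroDefined() {
#ifdef CONFIG_ASYNC_TCP_USE_WDT
  return true;
#else
  return false;
#endif
}
```

As far as this excerpt goes, both test builds (core 0 and core 1) therefore differ from the unmodified default in the watchdog macro as well as in the core, unless that macro is defined somewhere else.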

The expected behavior is that, when the sketch runs, an AP is created that directs the user to a captive portal on joining. Note the semaphore code around the startApPortal() call. It technically does nothing, but it seems to be important for triggering the bug. Perhaps the mutex is interfering with task priorities, although I don't think it should. It could also just be timing-related.

To force the bug, set CONFIG_ASYNC_TCP_RUNNING_CORE in AsyncTCP.h to 1: the captive portal hangs. With CONFIG_ASYNC_TCP_RUNNING_CORE set to 0, the bug does not happen and the program works as expected. With the unmodified AsyncTCP.h, however, the value is -1 (any available core), so the core that handles TCP calls is not fixed. If the task lands on the wrong core due to load, the captive portal might hang.
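
The three settings described above can be summarized in a small host-side model. It is a sketch under one stated assumption: that the Arduino-ESP32 helper xTaskCreateUniversal pins the task only when the requested core is 0 or 1, and otherwise creates it without core affinity. `Placement`, `placementFor` and `canLandOnCore1` are hypothetical names used only for illustration.

```cpp
// Hypothetical model of where the AsyncTCP task may run for a given macro value.
enum class Placement { PinnedCore0, PinnedCore1, Unpinned };

// Assumption: core IDs 0 and 1 pin the task; any other value (such as -1)
// leaves the scheduler free to run it on either core.
constexpr Placement placementFor(int runningCore) {
  return runningCore == 0 ? Placement::PinnedCore0
       : runningCore == 1 ? Placement::PinnedCore1
                          : Placement::Unpinned;
}

// In this report the hang appears when the task runs on core 1,
// so every placement except "pinned to core 0" is potentially affected.
constexpr bool canLandOnCore1(int runningCore) {
  return placementFor(runningCore) != Placement::PinnedCore0;
}
```

Under this model, only a value of 0 rules out the failing placement; the default of -1 does not, which would match the intermittent behavior seen with the unmodified header.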

More generally, this bug is a problem when trying to use this library from a background loop running on one of the cores.

Labels: bug, enhancement
