Skip to content

USB serial jtag occasionally skipping characters if console task and USB_serial_jtag interrupt running on different core (IDFGH-12988) #13939

Closed
@congyue

Description

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

Hi Espressif team,

I have searched and read extensively about past issues on USB serial JTAG, (like #12518, #12628, #12605, #9318), most of them have been fixed in 5.2.2 but none of them directly related to this one.

First of all, here is my setup:

  • HW: ESP32-S3-DevKitM-1
  • IDF_version: 5.2.2 tag
  • Set CONFIG_ESP_CONSOLE_USB_SERIAL_JTAG=y in sdkconfig.default

I have prepared sample code to reproduce the issue:

#include <stdio.h>
#include <unistd.h>

#include "driver/usb_serial_jtag.h"
#include "esp_check.h"
#include "esp_log.h"
#include "esp_vfs_dev.h"
#include "esp_vfs_usb_serial_jtag.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "sdkconfig.h"

#define BUF_SIZE (512)
#define ECHO_TASK_STACK_SIZE (4096)

static void init_console(void)
{
    /* Minicom, screen, idf_monitor send CR when ENTER key is pressed */
    esp_vfs_dev_usb_serial_jtag_set_rx_line_endings(ESP_LINE_ENDINGS_CR);
    /* Move the caret to the beginning of the next line on '\n' */
    esp_vfs_dev_usb_serial_jtag_set_tx_line_endings(ESP_LINE_ENDINGS_CRLF);

    // Configure USB SERIAL JTAG
    usb_serial_jtag_driver_config_t usb_serial_jtag_config =
        USB_SERIAL_JTAG_DRIVER_CONFIG_DEFAULT();

    /* Install USB-SERIAL-JTAG driver for interrupt-driven reads and writes */
    usb_serial_jtag_driver_install(&usb_serial_jtag_config);

    /* Tell vfs to use usb-serial-jtag driver */
    esp_vfs_usb_serial_jtag_use_driver();

    ESP_LOGI("usb_serial_jtag echo", "USB_SERIAL_JTAG init done");
}

static void echo_task(void *arg)
{
    char *prompt = "# ";
    write(fileno(stdout), prompt, 2);
    fsync(fileno(stdout));

    int i = 0;
    char cmd[BUF_SIZE] = {0};
    while (1)
    {
        // Read and echo input back to console in real time
        char c;
        read(fileno(stdin), &c, 1);
        write(fileno(stdout), (const char *)&c, 1);

        // If newline or buffer full, process the buffer
        if ((c == '\n') || i == BUF_SIZE - 1)
        {
            if (i > 1)
            {
                // If command is valid, print simulated output
                char output[BUF_SIZE + 16] = {0};
                snprintf(output, BUF_SIZE + 16, "Run cmd: [%s] ...\n", cmd);
                write(fileno(stdout), output, strlen(output));
            }

            write(fileno(stdout), prompt, 2);
            fsync(fileno(stdout));

            memset(cmd, 0, BUF_SIZE);
            i = 0;
        }
        // Otherwise add to command buffer
        else
        {
            cmd[i] = c;
            i++;
        }
    }
}
void app_main(void)
{
    init_console();
    xTaskCreatePinnedToCore(&echo_task, "USB SERIAL JTAG_echo_task", ECHO_TASK_STACK_SIZE, NULL, 2,
                            NULL, APP_CPU_NUM);
}

It can be considered as a refactor of the usb_serial_jtag_echo project to use read()/write() VFS interface instead of the driver API.

Expected behavior

The firmware suppose to simulate a shell and taking commands from input, then print a simulated output to the screen, e.g.

# version
Run cmd: [version] ...
# help
Run cmd: [help] ...

Most of the time this is fine and the issue is hard to be noticed when typing manually.

To make the issue easier to reproduce, I'm using this host script to stress testing the USB_serial_jtag input/output

while true; do echo "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890" > /dev/cu.usbmodem114301; sleep .1; done

Actual behavior

With the help of above script, sometimes I can see the output would randomly skip one byte:

# 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
Run cmd: [12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890] ...
# 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
Run cmd: [1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789] ...  // --> This line is missing a '0' as the last digit!
# 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
Run cmd: [12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890] ...
# 12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
Run cmd: [1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123457890] ... // --> This line is missing a '6' in the last sequence

Findings

  • I have added debug code to confirm that when the issue happens, the write() function call is passing the correct string length and content to the stdout (No missing bytes from this level).
  • In my real production shell code, I also noticed that adding more fsync(fileno(stdout)) would make the situation worse

I also noticed that below modification would cause the issue disappear:

  • Create the echo_task on PRO_CPU_NUM instead.
  • Or, move the init_console(); function to the beginning of echo_task()
    Both approaches indicate the code perform better when the writing thread and interrupt both registered on the same core.

This symptom is hard to find and it could lead to hard-to-debug issues on factory provisioning and automation test when relying on the USB port. It would be great if it can be root caused and fixed in future version!

@igrr @mythbuster5

Metadata

Assignees

No one assigned

    Labels

    Resolution: NAIssue resolution is unavailableStatus: DoneIssue is done internally

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions