[Bug] [Ethos] apps/microtvm/ethosu/run_demo.sh gets wrong inference result #15643

@toyowata

Description

The run_demo.sh script in apps/microtvm/ethosu builds the microTVM executable for Ethos-U and launches it on the FVP, but the inference result is wrong.

For example, a cat image is classified as 'goldfinch'.

Expected behavior

The inference result is 'tabby'.

Actual behavior

The inference result is 'goldfinch'.

FVP output:

telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal2: Listening for serial connection on port 5002
telnetterminal5: Listening for serial connection on port 5003

    Ethos-U rev fa6e5f88 --- Aug  9 2021 09:54:41
    (C) COPYRIGHT 2019-2021 Arm Limited
    ALL RIGHTS RESERVED

Starting Demo
I: Initializing NPU: base_address=0x48102000, fast_memory=0x0, fast_memory_size=0, secure=1, privileged=1
I: Soft reset NPU
I: New NPU driver registered (handle: 0x0x20000de0, NPU: 0x0x48102000)
Running inference
I: Acquiring NPU driver handle
D: ethosu_reserve_driver(): NPU driver handle 0x20000de0 reserved
D: ethosu_invoke_async(): OPTIMIZER_CONFIG
I: Optimizer release nbr: 0 patch: 1
I: Optimizer config. product=0, cmd_stream_version=0, macs_per_cc=8, shram_size=48, custom_dma=0
I: Optimizer config. arch version: 1.0.6
I: Ethos-U config. product=0, cmd_stream_version=0, macs_per_cc=8, shram_size=48, custom_dma=1
I: Ethos-U. arch version=1.1.0
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): COMMAND_STREAM
I: handle_command_stream: cmd_stream=0x60025be0, cms_length 2663
I: Soft reset NPU
D: ethosu_dev_set_clock_and_power(): CMD=0x00000000
D: ethosu_dev_run_command_stream(): QBASE=0x0000000060025be0, QSIZE=10652, cmd_stream_ptr=0x60025be0
D: ethosu_dev_run_command_stream(): BASEP0=0x0000000060028580
D: ethosu_dev_run_command_stream(): BASEP1=0x0000000021000000
D: ethosu_dev_run_command_stream(): BASEP2=0x0000000000000000
D: ethosu_dev_run_command_stream(): BASEP3=0x0000000060000fb0
D: ethosu_dev_run_command_stream(): BASEP4=0x00000000200009d0
D: ethosu_dev_run_command_stream(): BASEP5=0x0000000000000000
D: ethosuD: ethosu_irq_handler(): Got interrupt from Ethos-U
_irq_handler(): Got interrupt from Ethos-UD: ethosu_dev_set_clock_and_power(): CMD=0x0000000c
D: ethosu_wait(): Inference finished successfully...
D: ethosu_release_driver(): NPU driver handle 0x20000de0 released
The image has been classified as 'goldfinch'

Environment

OS : Ubuntu 22.04.2 LTS
tvm version : v0.13.0
tvmc version : 0.13.dev295

Steps to reproduce

# create docker image
$ docker pull tlcpack/ci-cortexm:20230710-060128-a60cd0fec
$ docker images
$ docker run -it tlcpack/ci-cortexm:20230710-060128-a60cd0fec

# clone the tvm repo
git clone -b v0.13.0 https://github.com/apache/tvm
cd ./tvm/apps/microtvm/ethosu

# install python modules
pip install apache-tvm==0.13.dev295
pip uninstall tensorflow flatbuffers protobuf
pip uninstall tensorboard onnxruntime onnx
pip install -r requirements.txt

# run demo
./run_demo.sh

Triage

vert:micro

Root cause

The output_data_sec section is not explicitly placed in the linker script file, so, according to the map file, it is mapped into the DTCM area.

.data           0x0000000020000000      0x9c4 load address 0x0000000000010134
                0x0000000020000000                __data_start__ = .

(snip)

 *(.jcr*)
                0x00000000200009c4                . = ALIGN (0x4)
                0x00000000200009c4                __data_end__ = .

.igot.plt       0x00000000200009c4        0x0 load address 0x0000000000010af8
 .igot.plt      0x00000000200009c4        0x0 /opt/arm/gcc-arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/thumb/v8-m.main+dp/hard/crtbegin.o
output_data_sec
                0x00000000200009d0      0x3ea load address 0x0000000000010b10
 output_data_sec
                0x00000000200009d0      0x3ea /tmp/ccInXluG.o
                0x00000000200009d0                output
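To locate a section such as output_data_sec without reading the map file by hand, a small script can extract its address and size. The following is a minimal sketch, not part of the TVM repository. The function name `find_output_section` and the embedded `MAP_EXCERPT` are hypothetical, and the regular expression assumes the GNU ld map layout shown above, where a long section name sits alone on one line and its address and size wrap onto the next.

```python
import re
from typing import Optional, Tuple

# Excerpt copied from the map file shown above (hypothetical constant name).
MAP_EXCERPT = """\
.igot.plt       0x00000000200009c4        0x0 load address 0x0000000000010af8
output_data_sec
                0x00000000200009d0      0x3ea load address 0x0000000000010b10
 output_data_sec
                0x00000000200009d0      0x3ea /tmp/ccInXluG.o
                0x00000000200009d0                output
"""


def find_output_section(map_text: str, name: str) -> Optional[Tuple[int, int]]:
    """Return (address, size) of an output section in a GNU ld map, or None.

    Output section headers start in column 0. The address and size follow
    either on the same line or, for long names, on the next line, so the
    whitespace match is allowed to span a newline.
    """
    pattern = re.compile(
        r"^" + re.escape(name) + r"\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)",
        re.MULTILINE,
    )
    match = pattern.search(map_text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    found = find_output_section(MAP_EXCERPT, "output_data_sec")
    if found is None:
        print("section not found")
    else:
        address, size = found
        print(f"output_data_sec at {address:#010x}, size {size:#x}")
```

On a real build, the text of the generated map file would be passed in instead of the excerpt.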

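The NPU cannot access the DTCM area, yet in the failing run BASEP4=0x00000000200009d0 points at the output buffer placed there. The memory table in the linker script shows the access rules: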

/*------------------ Reference System Memories -------------
  +===================+============+=======+============+============+
  | Memory            | Address    | Size  | CPU Access | NPU Access |
  +===================+============+=======+============+============+
  | ITCM              | 0x00000000 | 512KB | Yes (RO)   | No         |
  +-------------------+------------+-------+------------+------------+
  | DTCM              | 0x20000000 | 512KB | Yes (R/W)  | No         |
  +-------------------+------------+-------+------------+------------+
  | SSE-300 SRAM      | 0x21000000 |   2MB | Yes (R/W)  | Yes (R/W)  |
  +-------------------+------------+-------+------------+------------+
  | Data SRAM         | 0x01000000 |   2MB | Yes (R/W)  | Yes (R/W)  |
  +-------------------+------------+-------+------------+------------+
  | DDR               | 0x60000000 |  32MB | Yes (R/W)  | Yes (R/W)  |
  +-------------------+------------+-------+------------+------------+ */
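The address check implied by the table can be sketched in a few lines. This is an illustrative helper, not code from the demo. The names `Region`, `REGIONS`, `find_region`, and `npu_can_access` are hypothetical, the region values are transcribed from the table above, and the two sample addresses are the BASEP4 values from the failing and the fixed FVP logs in this report.

```python
from typing import NamedTuple, Optional

KB = 1024
MB = 1024 * KB


class Region(NamedTuple):
    name: str
    base: int
    size: int
    npu_access: bool


# Transcribed from the "Reference System Memories" table in the linker script.
REGIONS = [
    Region("ITCM", 0x00000000, 512 * KB, False),
    Region("DTCM", 0x20000000, 512 * KB, False),
    Region("SSE-300 SRAM", 0x21000000, 2 * MB, True),
    Region("Data SRAM", 0x01000000, 2 * MB, True),
    Region("DDR", 0x60000000, 32 * MB, True),
]


def find_region(addr: int) -> Optional[Region]:
    """Return the memory region containing addr, or None if unmapped."""
    for region in REGIONS:
        if region.base <= addr < region.base + region.size:
            return region
    return None


def npu_can_access(addr: int) -> bool:
    """True only if addr lies in a region the table marks as NPU-accessible."""
    region = find_region(addr)
    return region is not None and region.npu_access


if __name__ == "__main__":
    # BASEP4 values taken from the two FVP logs in this report.
    for label, addr in (("failing run", 0x200009D0), ("fixed run", 0x60025BC0)):
        region = find_region(addr)
        name = region.name if region else "unmapped"
        print(f"{label}: {addr:#010x} -> {name}, NPU access: {npu_can_access(addr)}")
```

Running the same check over every BASEP value printed by the driver is a quick way to spot other buffers that end up in a TCM.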

Workaround

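Adding the output_data_sec section to the linker script file, next to ethosu_scratch, fixes the problem. With this change the output buffer moves to DDR (BASEP4=0x0000000060025bc0 in the log below), which the NPU can access.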

diff --git a/apps/microtvm/ethosu/corstone300.ld b/apps/microtvm/ethosu/corstone300.ld
index fb670d45c..d073ea329 100644
--- a/apps/microtvm/ethosu/corstone300.ld
+++ b/apps/microtvm/ethosu/corstone300.ld
@@ -138,6 +138,7 @@ SECTIONS
   {
     . = ALIGN(16);
     *(ethosu_scratch)
+    *(output_data_sec)
     . = ALIGN (16);
     *(.rodata.tvm)
     . = ALIGN (16);

FVP output:

telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal2: Listening for serial connection on port 5002
telnetterminal5: Listening for serial connection on port 5003

    Ethos-U rev fa6e5f88 --- Aug  9 2021 09:54:41
    (C) COPYRIGHT 2019-2021 Arm Limited
    ALL RIGHTS RESERVED

Starting Demo
I: Initializing NPU: base_address=0x48102000, fast_memory=0x0, fast_memory_size=0, secure=1, privileged=1
I: Soft reset NPU
I: New NPU driver registered (handle: 0x0x200009e8, NPU: 0x0x48102000)
Running inference
I: Acquiring NPU driver handle
D: ethosu_reserve_driver(): NPU driver handle 0x200009e8 reserved
D: ethosu_invoke_async(): OPTIMIZER_CONFIG
I: Optimizer release nbr: 0 patch: 1
I: Optimizer config. product=0, cmd_stream_version=0, macs_per_cc=8, shram_size=48, custom_dma=0
I: Optimizer config. arch version: 1.0.6
I: Ethos-U config. product=0, cmd_stream_version=0, macs_per_cc=8, shram_size=48, custom_dma=1
I: Ethos-U. arch version=1.1.0
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): NOP
D: ethosu_invoke_async(): COMMAND_STREAM
I: handle_command_stream: cmd_stream=0x60025fd0, cms_length 2663
I: Soft reset NPU
D: ethosu_dev_set_clock_and_power(): CMD=0x00000000
D: ethosu_dev_run_command_stream(): QBASE=0x0000000060025fd0, QSIZE=10652, cmd_stream_ptr=0x60025fd0
D: ethosu_dev_run_command_stream(): BASEP0=0x0000000060028970
D: ethosu_dev_run_command_stream(): BASEP1=0x0000000021000000
D: ethosu_dev_run_command_stream(): BASEP2=0x0000000000000000
D: ethosu_dev_run_command_stream(): BASEP3=0x0000000060000fb0
D: ethosu_dev_run_command_stream(): BASEP4=0x0000000060025bc0
D: ethosu_dev_run_command_stream(): BASEP5=0x0000000000000000
D: ethosuD: ethosu_irq_handler(): Got interrupt from Ethos-U
_irq_handler(): Got interrupt from Ethos-UD: ethosu_dev_set_clock_and_power(): CMD=0x0000000c
D: ethosu_wait(): Inference finished successfully...
D: ethosu_release_driver(): NPU driver handle 0x200009e8 released
The image has been classified as 'tabby'
