MNIST inference on STM32F746 using TensorFlow Lite for Microcontrollers

In this project you can evaluate the MNIST database or your own hand-written digits (using the included Jupyter notebook) on the STM32F746. This example was tested on the STM32F7 discovery kit. If you have a different board, you need to make the appropriate changes.

The base project is derived from my CMake template for the STM32F7xx.

Note: This project is derived from a blog post, which is itself an update of an earlier post in a longer series.

This repo has the following tags:

v1.14.0: uses TensorFlow Lite for Microcontrollers v1.14.0
v2.1.0: uses TensorFlow Lite for Microcontrollers v2.1.0

Usage

First you need to build the project and upload it to the STM32F7. To do that, follow the instructions in the Build section. After that you can use the Jupyter notebook to hand-draw a digit, upload it to the STM32F7 and get the prediction back. Please follow the guide inside the notebook.

In order to run the notebook, you need Python 3, TensorFlow and PySerial. I've used Ubuntu 18.04 and miniconda, but conda is not really needed. In any case it's good to run the following commands in a virtual environment.

Example for conda

conda create -n stm32f7-nn-env python
conda activate stm32f7-nn-env
conda install -c conda-forge numpy
conda install -c conda-forge jupyter
conda install -c conda-forge tensorflow-gpu
jupyter notebook

Then browse to jupyter_notebook/MNIST-TensorFlow.ipynb and run the notebook.

Build

To select which libraries you want to use, you need to provide cmake with the proper options. By default all the options are set to OFF. The supported options are:

USE_CORTEX_NN: If set to ON then the project will build using the DSP/NN libs
USE_HAL_DRIVER: If set to ON enables the HAL Driver library
USE_FREERTOS: If set to ON enables FreeRTOS

If you don't use the docker-build.sh script to build the code, then you also need to provide the path of the toolchain in CMAKE_TOOLCHAIN.

You can build two different versions of this code. One uses the default depthwise_conv function and the other uses the CMSIS-NN version. You select which version to build with the USE_CORTEX_NN cmake option.

To build the binary using CMSIS-NN and CMSIS-DSP, run the following command:

CLEANBUILD=true USE_CORTEX_NN=ON SRC=src ./build.sh

Finally, I've added two models. The default model, without optimized weights, is located in source/src/inc/model_data_uncompressed.h and is 2.1MB. The other model has compressed weights, is located in source/src/inc/model_data_compressed.h and is ~614KB. You can select which model to use while building the code with the USE_COMP_MODEL cmake flag like this:

CLEANBUILD=true USE_OVERCLOCK=OFF USE_CMSIS_NN=OFF USE_COMP_MODEL=ON SRC=src ./build.sh

The default for USE_COMP_MODEL is OFF.

Warning: for some reason the compressed model doesn't work properly and the MCU hangs.

Note: CLEANBUILD=true is only needed for a clean build; otherwise you can skip it. When it's used, the build takes quite some time (depending on your machine), because all the DSP and NN library files are rebuilt.

Build with docker

If you want the same build environment as the one I've used, then you can use my CDE image for stm32 with docker like this:

./docker-build.sh "CLEANBUILD=true USE_OVERCLOCK=OFF USE_CMSIS_NN=OFF USE_COMP_MODEL=ON SRC=src ./build.sh"
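The build commands above all follow the same pattern: a list of KEY=VALUE options placed in front of ./build.sh, optionally passed as a single quoted argument to ./docker-build.sh. As a minimal sketch of that pattern (the helper name build_command is hypothetical and not part of this repo), a small Python function can compose the command line from a dictionary of options:

```python
import shlex


def build_command(options, script="./build.sh", use_docker=False):
    """Compose 'KEY=VALUE ... ./build.sh' from a dict of build options.

    build_command is a hypothetical helper for illustration only.
    With use_docker=True the whole command is passed as one quoted
    argument to ./docker-build.sh, as in the docker example above.
    """
    env = " ".join(f"{key}={value}" for key, value in options.items())
    command = f"{env} {script}".strip()
    if use_docker:
        return f"./docker-build.sh {shlex.quote(command)}"
    return command


# Option names and values copied from the commands shown in this README.
options = {
    "CLEANBUILD": "true",
    "USE_OVERCLOCK": "OFF",
    "USE_CMSIS_NN": "OFF",
    "USE_COMP_MODEL": "ON",
    "SRC": "src",
}
print(build_command(options))
print(build_command(options, use_docker=True))
```

Note that shlex.quote wraps the command in single quotes rather than the double quotes used above; for this command the shell treats both the same way.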

Overclocking

I've added an overclocking flag that overclocks the CPU @ 280 MHz. That may be too high for some CPUs, while yours may be able to clock even higher. To control the overclocking amount, look for these lines in source/src/main.cpp:

#ifdef OVERCLOCK
    RCC_OscInitStruct.PLL.PLLN = 288; // Overclock
#endif

You can change that value (the PLL multiplier, PLLN) to get the frequency you like. Then you need to build with the USE_OVERCLOCK flag, like this:

CLEANBUILD=true USE_OVERCLOCK=ON USE_HAL_DRIVER=ON USE_CMSIS_NN=ON  ./build.sh

Warning: Any overclocking may be the source of unknown issues. In my case I was able to OC up to 285MHz, but sometimes the flatbuffers API was failing at that high frequency! In particular, avoid developing with OC enabled.
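To see how a PLLN value translates into a core clock, the standard STM32F7 main PLL relation SYSCLK = (f_in / PLLM) * PLLN / PLLP can be evaluated before flashing anything. The following is a minimal Python sketch. The input clock, PLLM and PLLP values are assumptions chosen for illustration and are not read from this project's main.cpp, so check your own clock configuration before relying on the numbers.

```python
def pll_sysclk_hz(f_in_hz, pllm, plln, pllp):
    """SYSCLK produced by the main PLL: (f_in / PLLM) * PLLN / PLLP."""
    if pllp not in (2, 4, 6, 8):
        raise ValueError("PLLP must be 2, 4, 6 or 8")
    vco_input_hz = f_in_hz / pllm      # PLL (VCO) input clock
    vco_output_hz = vco_input_hz * plln  # PLLN multiplies the VCO input
    return vco_output_hz / pllp        # PLLP divides down to SYSCLK


# Assumed example values (NOT taken from source/src/main.cpp):
F_IN_HZ = 16_000_000  # assumed PLL input, e.g. the 16 MHz internal HSI oscillator
PLLM = 8              # assumed input divider
PLLP = 2              # assumed output divider

for plln in (216, 288):
    mhz = pll_sysclk_hz(F_IN_HZ, PLLM, plln, PLLP) / 1e6
    print(f"PLLN={plln} -> SYSCLK={mhz:.0f} MHz")
```

With these assumed dividers each PLLN step is 1 MHz, so PLLN = 216 corresponds to the rated 216 MHz maximum of the STM32F746 and anything above that is overclocking. With other PLLM or PLLP values the same PLLN gives a different frequency.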

Using CubeMX

It's usually more convenient to create your project with CubeMX. After you set up all the hardware and peripherals you can generate the code (I prefer SW4STM32, but it doesn't really matter in this case). After the code is exported, you just need to copy the files that CubeMX customizes for your setup.

The files that you usually need to copy into your source/src folder are:

main.h
main.c
stm32f7xx_hal_conf.h
stm32f7xx_hal_msp.c
stm32f7xx_it.h
stm32f7xx_it.c
system_stm32f7xx.c (in case you have custom clocks)

In your case there might be more files. Usually these are the files in the exported Inc and Src folders.

Serial ports

The code uses two serial ports, UART6 and UART7. UART6 is the debug port that you can use to run commands from the terminal. UART7 is used by the Jupyter notebook for transferring data serialized with flatbuffers.

For the STM32F7-disco board this is the UART6 and UART7 pinout:

UART	Tx	Rx
UART6	D0	D1
UART7	A5	A4

Cloning the code

Because this repo depends on other submodules, use the following command to fetch it:

git clone --recursive -j8 git@bitbucket.org:dimtass/stm32f746-tflite-micro-mnist.git

# or over HTTPS
git clone --recursive -j8 https://dimtass@bitbucket.org/dimtass/stm32f746-tflite-micro-mnist.git

Flash

To flash the firmware in Linux you need the texane/stlink tool. Then you can use the flash script like this:

./flash.sh

Otherwise you can build the firmware and then use any programmer you like. The elf, hex and bin firmware files are located in the build-stm32 folder:

./build-stm32/*/stm32f7-mnist-tflite.bin
./build-stm32/*/stm32f7-mnist-tflite.hex
./build-stm32/*/stm32f7-mnist-tflite.elf

To flash the HEX file in Windows, use the ST-LINK Utility like this:

"C:\Program Files (x86)\STMicroelectronics\STM32 ST-LINK Utility\ST-LINK Utility\ST-LINK_CLI.exe" -c SWD -p build-stm32\src_\stm32f7-mnist-tflite.hex -Rst

To flash the bin in Linux:

st-flash --reset write build-stm32/src/stm32f7-mnist-tflite.bin 0x8000000
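Since the name of the sub-folder inside build-stm32 depends on how the project was built, it can help to locate the .bin image first and only then assemble the st-flash command above. The following is a minimal Python sketch of that idea. The helper names find_firmware and st_flash_command are hypothetical and not part of this repo, and the sketch only prints the command instead of running it.

```python
import glob
import os
import shlex

FLASH_BASE = "0x8000000"  # flash address used in the st-flash command above


def find_firmware(build_dir="build-stm32", name="stm32f7-mnist-tflite.bin"):
    """Return all build-stm32/*/<name> images, most recently modified first."""
    matches = glob.glob(os.path.join(build_dir, "*", name))
    return sorted(matches, key=os.path.getmtime, reverse=True)


def st_flash_command(bin_path, address=FLASH_BASE):
    """Build the argument list for flashing a .bin image with st-flash."""
    return ["st-flash", "--reset", "write", bin_path, address]


images = find_firmware()
if images:
    # Print the command so it can be reviewed before it is executed.
    print(shlex.join(st_flash_command(images[0])))
else:
    print("no firmware found; build the project first")
```

The returned argument list can be passed to subprocess.run once the printed command looks right.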

Flatbuffers

You might need Google's flatbuffers if you want to experiment with the serial commands from the Python notebook to the STM32F7. These are the commands to build flatbuffers from source and install it (I've used Ubuntu 18.04).

git clone https://github.com/google/flatbuffers.git
cd flatbuffers
cmake -G "Unix Makefiles" .
make -j8
sudo make install

The schema file is located in source/schema. To build it, run:

source/schema/create-header.sh

The Python serial port client is in the jupyter_notebook/STM32F7Comm folder. In order to build the schema for Python, run:

flatc --python -o jupyter_notebook/ ./source/schema/schema.fbs

FW details

CMSIS version: 5.0.4
CMSIS-NN version: V.1.0.0
CMSIS-DSP version: V1.7.0
HAL Driver Library version: 1.2.6

License

The license is MIT and you can use the code however you like.

Author

Dimitris Tassopoulos dimtass@gmail.com
