
Using LeakyReLU/ReLU breaks the model when exporting to tflite #12489

Closed
@Corallo

Description

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Training, Export

Bug

If you select the LeakyReLU activation function (or just a plain ReLU) in the model .yaml, training goes well, but the exported tflite model is broken:

Running:
python3 train.py --data coco.yaml --epochs 50 --weights '' --cfg ./hub/yolov5n-LeakyReLU.yaml --batch-size 204
where yolov5n-LeakyReLU.yaml is the same as yolov5s-LeakyReLU.yaml, except for:
width_multiple: 0.25 # layer channel multiple
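
For anyone reproducing this, the "n" config can be derived from the "s" config by changing that one value. Below is a minimal sketch of that step, assuming the config contains exactly one top-level width_multiple line. The helper name set_width_multiple and the sample text are hypothetical and only stand in for the real file:

```python
import re


def set_width_multiple(yaml_text: str, value: float) -> str:
    """Return yaml_text with the value of its width_multiple line replaced."""
    pattern = re.compile(r"^(width_multiple:\s*)[0-9.]+", flags=re.MULTILINE)
    # A callable replacement avoids any ambiguity between group refs and digits.
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", yaml_text)
    if count != 1:
        raise ValueError(f"expected exactly one width_multiple line, found {count}")
    return new_text


# Illustrative stand-in for the head of a model config (not the real file).
sample = (
    "depth_multiple: 0.33 # model depth multiple\n"
    "width_multiple: 0.50 # layer channel multiple\n"
)
print(set_width_multiple(sample, 0.25))
```

Only the number is rewritten; the trailing comment and every other line, including the activation setting, are left untouched.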
The metrics I get after training are the following, all good:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.206
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.359
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.209
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.106
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.265
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.211
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.374
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.430
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.240
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.477
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.561

After exporting:
python3 export.py --weights runs/train/exp20/weights/best.pt --include tflite --int8
and testing the tflite model with:
python3 val.py --weights runs/train/exp20/weights/best-int8.tflite

I get these results:

                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 128/128 [00:16<00:00,7.54it/s]
                   all        128        929      0.104     0.0495    0.00417   0.000997

I know quantization should reduce accuracy somewhat, but here it is somehow breaking the network.
Exporting to tflite in floating point, or to ONNX, doesn't hurt the model.

Any idea what is going on?
In particular, looking at the network output, I see that the values for the box width and height are always zero after conversion to tflite. The rest of the output seems okay.
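
The observation above can be checked column by column. The following is a rough sketch, not part of YOLOv5: it assumes each prediction row is laid out as [x, y, w, h, objectness, ...] and that the output tensor uses a single per-tensor scale and zero point, so a real value is recovered as scale * (q - zero_point). The function names and the sample numbers are made up for illustration:

```python
def dequantize(q_value: int, scale: float, zero_point: int) -> float:
    """Map a quantized integer back to a real value: scale * (q - zero_point)."""
    return scale * (q_value - zero_point)


def zero_fraction_per_column(rows, scale, zero_point, columns=(0, 1, 2, 3)):
    """For each selected column, return the fraction of rows that dequantize to 0."""
    if not rows:
        raise ValueError("rows is empty")
    result = {}
    for c in columns:
        zeros = sum(
            1 for row in rows if dequantize(row[c], scale, zero_point) == 0.0
        )
        result[c] = zeros / len(rows)
    return result


# Made-up int8 predictions, assumed layout: [x, y, w, h, objectness].
scale, zero_point = 0.0039, -128
rows = [
    [-20, 5, -128, -128, 90],
    [40, -60, -128, -128, -100],
    [0, 17, -128, -128, 12],
]
print(zero_fraction_per_column(rows, scale, zero_point))
```

With a real model, the raw rows would come from the tflite interpreter's output tensor, and the scale and zero point from the "quantization" entry of tf.lite.Interpreter.get_output_details(). A fraction of 1.0 for columns 2 and 3 alongside low fractions for the other columns would match the symptom described here.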

Environment

YOLOv5 latest
Ubuntu 22.04
Python 3.10
Nvidia A10G Driver Version: 535.129.03 CUDA Version: 12.2
tensorflow-cpu==2.15.0
torch==2.1.1
torchvision==0.16.1

Minimal Reproducible Example

(I did it on yolov5n, but it is the same on "s")
python3 train.py --data coco.yaml --epochs 30 --weights '' --cfg ./hub/yolov5s-LeakyReLU.yaml --batch-size 128
(replace exp20 with your folder)
python3 export.py --weights runs/train/exp20/weights/best.pt --include tflite --int8
python3 val.py --weights runs/train/exp20/weights/best-int8.tflite

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!

Labels: Stale, bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions