Add TensorFlow and TFLite export #1127
@zldrobit thanks for the PR! Can you explain the role of |
@zldrobit oh also, is there a tensorflow version requirement here? We should probably add it to requirements.txt under export: Lines 19 to 24 in 00917a6
|
@glenn-jocher Of course.
|
@zldrobit ah, I see. That's interesting. Yes, we use rectangular inference (height != width) in PyTorch and in CoreML in our iDetection app. This speeds up inference significantly, in proportion to the smaller area. For example, 640x320 inference typically takes half the time of 640x640 inference. We resize to --img-size first and pad the shorter dimension as required to reach a multiple of 32, which is the max stride of the current YOLOv5 models. Do you know where the square inference requirement is coming from in TensorFlow? |
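The resize-then-pad rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's `letterbox` implementation: the function name `rect_inference_shape` is hypothetical, and it assumes the longer side is scaled to `--img-size` and both sides are then rounded up to a multiple of the max stride (32).

```python
import math


def rect_inference_shape(height, width, img_size=640, stride=32):
    """Scale the longer side to img_size, then pad both sides up to a stride multiple."""
    scale = img_size / max(height, width)
    new_h = round(height * scale)
    new_w = round(width * scale)
    padded_h = math.ceil(new_h / stride) * stride
    padded_w = math.ceil(new_w / stride) * stride
    return padded_h, padded_w


# Landscape 1080p frame at the default size: the short side is padded from 360 to 384.
print(rect_inference_shape(1080, 1920))       # (384, 640)
# Vertical 1080p video at img_size=320 gives the 320x192 shape mentioned for iDetection.
print(rect_inference_shape(1920, 1080, 320))  # (320, 192)
```

Under these assumptions the padded area of a 16:9 frame is well below the square 640x640 area, which is where the speed-up comes from.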
@glenn-jocher Thanks for your explanation. I didn't express my idea correctly. Line 718 in 4d3680c
Thus, the input image sizes after preprocess are the same. The input size is fixed while exporting TensorFlow and TFLite: Lines 394 to 395 in 23fe35e
Take COCO dataset for example, in int8 calibration, if one use |
@zldrobit Thanks for this great work. I tested it briefly, but unfortunately it failed with the following error. I ran it in an Android emulator with your libraries specified in Gradle. The app starts successfully, but on model load the following exception is thrown:
E/tensorflow: CameraActivity: Exception!
java.lang.RuntimeException: java.lang.IllegalStateException: Internal error: Unexpected failure when preparing tensor allocations: tensorflow/lite/kernels/add.cc:86 NumInputs(node) != 2 (1 != 2) Node number 5 (ADD) failed to prepare.
    at org.tensorflow.lite.examples.detection.tflite.YoloV5ClassifierDetect.create(YoloV5ClassifierDetect.java:116)
    at org.tensorflow.lite.examples.detection.tflite.DetectorFactory.getDetector(DetectorFactory.java:49)
    at org.tensorflow.lite.examples.detection.DetectorActivity.onPreviewSizeChosen(DetectorActivity.java:83)
    at org.tensorflow.lite.examples.detection.CameraActivity.onPreviewFrame(CameraActivity.java:253)
    at android.hardware.Camera$EventHandler.handleMessage(Camera.java:1209)
    at android.os.Handler.dispatchMessage(Handler.java:106)
    at android.os.Looper.loop(Looper.java:193)
    at android.app.ActivityThread.main(ActivityThread.java:6669)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)
Caused by: java.lang.IllegalStateException: Internal error: Unexpected failure when preparing tensor allocations: tensorflow/lite/kernels/add.cc:86 NumInputs(node) != 2 (1 != 2) Node number 5 (ADD) failed to prepare.
    at org.tensorflow.lite.NativeInterpreterWrapper.allocateTensors(Native Method)
    at org.tensorflow.lite.NativeInterpreterWrapper.init(NativeInterpreterWrapper.java:87)
    at org.tensorflow.lite.NativeInterpreterWrapper.<init>(NativeInterpreterWrapper.java:63)
    at org.tensorflow.lite.Interpreter.<init>(Interpreter.java:266)
    at org.tensorflow.lite.examples.detection.tflite.YoloV5ClassifierDetect.create(YoloV5ClassifierDetect.java:114)
    ... 10 more
Tested on your tf-android branch. |
@thhart Thanks for using the code. |
@zldrobit I checked with 2.4.0-dev20201011 (tf-nightly); I have now switched back to 2.3.0 and it works better. It is extremely slow in the emulator, but I will check on a real device soon... |
@zldrobit Checked fp16 and int8: 450 ms inference time on a Samsung Note 10 (GPU). This is with the full 640 size, which might be a bit too much, but that is out of scope for now of course. Further side notes and a question, maybe for a later enhancement: |
@thhart Thanks for your suggestion. I am now keeping all the TensorFlow and TFLite related code in https://github.com/zldrobit/yolov5/tree/tf-android
to generate TF and TFLite models.
to detect objects. Or put the TFLite models to yolov5/android/app/src/main/java/org/tensorflow/lite/examples/detection/tflite/DetectorFactory.java Lines 32 to 33 in eb626a6
and yolov5/android/app/src/main/java/org/tensorflow/lite/examples/detection/tflite/DetectorFactory.java Lines 42 to 43 in eb626a6
with
to build and run the Android project.
|
You can uncomment Lines 434 to 440 in eb626a6
to generate yolov5s.tflite. Since the default inference precision in TFLite Android is fp16, yolov5s-fp16.tflite is enough for inference. I commented out the yolov5s.tflite generation code and left it as a note.
I think it's a TensorFlow issue that breaks backward compatibility. |
@zldrobit can you add tensorflow and any other required dependencies to the requirements.txt export section? Thanks for the explanation; it's great news that square inference is not required. The actual inference size for the iDetection iOS app is 320 vertical by 192 horizontal, to accommodate vertical video in any of the 16:9 aspect ratio formats like 4k, 1080p etc. Is it possible to export to TFLite in a similar shape? Yes, I see about the auto resizing in the dataloader. I'll think about that for a bit. |
Hello, I have some trouble using your branch. I put the code directly below:
import argparse
def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
coco_names = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
def wrap_frozen_graph(graph_def, inputs, outputs, print_graph=False):
device = select_device("0")
graph = tf.Graph()
img = torch.zeros((1, 3, 640, 640), device=device) # init img
pred = frozen_func(x=tf.constant(img.permute(0, 2, 3, 1).cpu().numpy()))
pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,256,40,40] vs. shape[1] = [1,256,24,40]
I don't know why. Can you help? |
@zjutlzt It seems that you changed the input size of the model. A suggestion: you could surround code pasted into a GitHub issue with code fences for better illustration:
|
Sorry, my fault. I did not change any export code. I want to write minimal code that uses yolov5s.pb for detection, so I wrote this code:
But something went wrong: tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,256,40,40] vs. shape[1] = [1,256,24,40] Function call stack: |
@glenn-jocher Sure. I have updated Yes. Run
to export a TFLite model with a 320 vertical by 192 horizontal input, and run one of
with tf-android branch to detect. |
@zjutlzt you could change
to
The |
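One possible reading of the two shapes in the ConcatOp error, offered as a hedged sketch rather than a confirmed diagnosis: if the graph was exported for a fixed 640x640 input and the concatenation happens at stride 16 (an assumption based on the usual YOLOv5 strides of 8, 16 and 32), then 40 = 640 / 16 is what the fixed graph expects, while 24 = 384 / 16 is what a letterboxed 384x640 image would produce. The helper name `feature_map_size` is hypothetical.

```python
def feature_map_size(height, width, stride):
    """Spatial size of a feature map for an input whose sides are multiples of stride."""
    return height // stride, width // stride


# Compare a fixed square input with a rectangular (auto-padded) one.
for shape in [(640, 640), (384, 640)]:
    sizes = [feature_map_size(*shape, s) for s in (8, 16, 32)]
    print(shape, sizes)
```

If this reading is right, feeding the frozen graph an input of exactly the exported size should make the two branches agree again.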
@zldrobit |
@idenc From the two images, my first guess is the anchors have been changed.
as in #447 (comment). If The following code changes anchors by Lines 192 to 193 in 83deec1
Lines 109 to 110 in 83deec1
That's the reason why PS: Reference for auto anchor generation: |
@zldrobit @idenc yes the model anchors may no longer be equal to the yaml definition for a variety of reasons. To access current model anchors:
import torch
model = torch.load('yolov5s.pt')['model']
m = model.model[-1] # Detect()
m.anchors # in stride units
m.anchor_grid # in pixel units
print(m.anchor_grid.view(-1,2))
tensor([[ 10., 13.],
[ 16., 30.],
[ 33., 23.],
[ 30., 61.],
[ 62., 45.],
[ 59., 119.],
[116., 90.],
[156., 198.],
[373., 326.]])
/rebase |
Thanks guys, it was due to the anchors changing from auto anchors. Substituting the models anchors into the config.yaml fixed the detection. |
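The fix described here (copying the model's anchors into the config yaml) can be sketched as follows. This is an illustration only: `anchors_to_yaml_rows` is a hypothetical helper, the pixel values are the ones printed from `m.anchor_grid` above, and the output layout assumes the yaml lists three width,height pairs per detection layer.

```python
def anchors_to_yaml_rows(anchor_pairs, per_layer=3):
    """Group (width, height) pixel anchors into yaml list rows, one row per detection layer."""
    rows = []
    for i in range(0, len(anchor_pairs), per_layer):
        layer = anchor_pairs[i:i + per_layer]
        flat = ", ".join(f"{round(w)},{round(h)}" for w, h in layer)
        rows.append(f"  - [{flat}]")
    return rows


# Values taken from the m.anchor_grid printout earlier in the thread.
anchor_grid = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
               (59, 119), (116, 90), (156, 198), (373, 326)]
print("anchors:")
print("\n".join(anchors_to_yaml_rows(anchor_grid)))
```

For a model whose anchors were changed by auto anchor, the input list would come from that model's own `anchor_grid` instead.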
@zldrobit could you read in anchors from the model.pt rather than the yaml? In YOLOv5 we only use the yaml for training, but not afterwards (so we don't run into this situation). |
@glenn-jocher Thanks for your explanation of anchors and yaml files. |
best.pt size = 15.7 MB. Hello! After converting, I get an int8 file of about 8 MB. Is that a normal size? I have attached the config files: cfg.zip |
@Sergey-sib Hello! It is normal. |
I thought that if the full YOLOv5s model is float32 = 15 MB, conversion to float16 would give ~7.5 MB, and conversion to int8 ~3.7 MB. |
@Sergey-sib I guess you mean the |
Is it possible to use the latest repository https://github.com/ultralytics/yolov5 for training and then convert to TFLite using this repository https://github.com/zldrobit/yolov5? |
@Sergey-sib Yes. As long as you are using YOLOv5 of version v2/v3. |
@thaingoc2604 Running https://github.com/zldrobit/yolov5/tree/tf-android on a smartphone would give you the inference delay on the screen.
Just set MINIMUM_CONFIDENCE_TF_OD_API to a lower value (e.g. 0.01~0.1). That way you get more bounding boxes, but also more false positive detections. |
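The trade-off behind the confidence threshold can be shown with a small Python sketch (the Android demo itself is Java; this is only an illustration). The function `filter_by_confidence` and the detection values are hypothetical.

```python
def filter_by_confidence(detections, min_confidence):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]


# Made-up detections for illustration only.
detections = [
    {"label": "person", "confidence": 0.82},
    {"label": "dog", "confidence": 0.27},
    {"label": "kite", "confidence": 0.04},
]

# Lower thresholds keep more boxes, including the low-confidence ones.
for threshold in (0.3, 0.1, 0.01):
    kept = filter_by_confidence(detections, threshold)
    print(threshold, [d["label"] for d in kept])
```

Lowering the threshold never removes a box, so any extra boxes it lets through are exactly the low-confidence ones that are most likely to be false positives.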
@axlecky Maybe you could try switching to GPU inference with the TFLite Android demo. GPU mode is much faster than CPU mode. Please be aware of the input size. I chose 320x320 input resolution because the 640x640 model runs very slowly.
Maybe you will have to figure out the details yourself, as I didn't run or test inference on a single image. Or you could ask @chainyo, since he seems to have succeeded in detecting with a single image. |
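Why the input size matters on Android can be made concrete with a little arithmetic. The sketch below assumes a 3-channel float32 input (4 bytes per value); the helper names are hypothetical. Under that assumption a 320x320 buffer holds 1228800 bytes, which matches the Java buffer size in the "Cannot copy to a TensorFlowLite tensor" error quoted in the PR description's FAQ, so that error is consistent with the app feeding 320x320 data to a model exported with a larger input.

```python
import math


def input_buffer_bytes(height, width, channels=3, bytes_per_value=4):
    """Bytes needed for one image tensor, assuming float32 values by default."""
    return height * width * channels * bytes_per_value


def square_side_from_bytes(num_bytes, channels=3, bytes_per_value=4):
    """Side length if the buffer holds a square image, else None."""
    pixels, remainder = divmod(num_bytes, channels * bytes_per_value)
    side = math.isqrt(pixels)
    return side if remainder == 0 and side * side == pixels else None


print(input_buffer_bytes(320, 320))     # 1228800
print(input_buffer_bytes(640, 640))     # 4915200
# Only meaningful if the model input really is square float32 RGB.
print(square_side_from_bytes(1769472))  # 384
```

The app's configured input size and the size used at export time need to agree for the buffer copy to succeed.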
@zldrobit Thanks for the response! |
@chainyo Hi, can you show me what changes you made to the code to perform inference on imported phone images/videos? Thanks! |
@zldrobit |
@thaingoc2604 I made that repo just to demonstrate the YOLOv5 TFLite deployment. Maybe you have to consult an Android expert about the video input/output part, as I am not that familiar with Android. |
Thank you |
@Kimsoohyeun Please share some information for us to investigate this problem:
|
@zldrobit Hi. I was using your solution for our app, thanks so much for your work! |
@pcanas The |
Hi @zldrobit. Is the code for building the Android app based on the official Tensorflow Android object detection example in this link? |
@axlecky The yolov5 android repo is partly based on the yolov4 tflite repo. |
@zldrobit python export.py --weights yolov5s.pt --include tflite --int8 --img 320 --data data/coco128.yaml |
The '--data' option is no longer used in TensorFlow/TFLite export, so you can just ignore it. |
Hi @zldrobit, Have you tried training (finetuning) a YOLOv5 model in Keras after this conversion? I noticed there is specific code in the Thanks |
* Add models/tf.py for TensorFlow and TFLite export
* Set auto=False for int8 calibration
* Update requirements.txt for TensorFlow and TFLite export
* Read anchors directly from PyTorch weights
* Add --tf-nms to append NMS in TensorFlow SavedModel and GraphDef export
* Remove check_anchor_order, check_file, set_logging from import
* Reformat code and optimize imports
* Autodownload model and check cfg
* update --source path, img-size to 320, single output
* Adjust representative_dataset
* Put representative dataset in tfl_int8 block
* detect.py TF inference
* weights to string
* cleanup tf.py
* Add --dynamic-batch-size
* Add xywh normalization to reduce calibration error
* Update requirements.txt TensorFlow 2.3.1 -> 2.4.0 to avoid int8 quantization error
* Fix imports Move C3 from models.experimental to models.common
* implement C3() and SiLU()
* Fix reshape dim to support dynamic batching
* Add epsilon argument in tf_BN, which is different between TF and PT
* Set stride to None if not using PyTorch, and do not warmup without PyTorch
* Add list support in check_img_size()
* Add list input support in detect.py
* sys.path.append('./') to run from yolov5/
* Add int8 quantization support for TensorFlow 2.5
* Add get_coco128.sh
* Remove --no-tfl-detect in models/tf.py (Use tf-android-tfl-detect branch for EdgeTPU)
* Update requirements.txt
* Replace torch.load() with attempt_load()
* Add --tf-raw-resize to set half_pixel_centers=False
* Add --agnostic-nms for TF class-agnostic NMS
* Cleanup after merge
* Cleanup2 after merge
* Cleanup3 after merge
* Add tf.py docstring with credit and usage
* pb saved_model and tflite use only one model in detect.py
* Add use cases in docstring of tf.py
* Remove redundant `stride` definition
* Remove keras direct import
* Fix `check_requirements(('tensorflow>=2.4.1',))`
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Auto TFLite uint8 detection This PR automatically determines if TFLite models are uint8 quantized rather than accepting a manual argument. The quantization determination is based on @zldrobit comment ultralytics#1127 (comment) * Cleanup
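For context on what uint8 quantization means for the input tensor, here is a minimal pure-Python sketch of TFLite-style affine quantization, where a real value x maps to q = round(x / scale) + zero_point and back via x = (q - zero_point) * scale. The function names and the scale/zero-point values are hypothetical; with a real model these parameters would come from the interpreter's input details rather than being hard-coded.

```python
def quantize_uint8(values, scale, zero_point):
    """Map real values to uint8 with q = round(x / scale) + zero_point, clamped to [0, 255]."""
    return [min(255, max(0, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map quantized integers back to real values with x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


# Hypothetical parameters: a scale of 1/256 and a zero point of 0.
scale, zero_point = 1 / 256, 0
pixels = [0.0, 0.5, 1.0]  # normalized image values

q = quantize_uint8(pixels, scale, zero_point)
print(q)                                 # [0, 128, 255]
print(dequantize(q, scale, zero_point))  # [0.0, 0.5, 0.99609375]
```

The round trip is lossy at the clamped end of the range, which is one reason calibration data and input scaling matter for int8 models.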
I haven't trained YOLOv5 models in Keras, but I have considered doing so. From v6.1 onward (inclusive), the code in the repo exports TF YOLOv5 models with untrainable parameters.
Apart from TFDetect, exported TF models include an additional normalization of the bboxes' xywh. These xywh values are denormalized at inference. Is this what you're looking for? |
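The normalize-then-denormalize step for xywh can be sketched as below. This is an illustration under assumptions, not the code in models/tf.py: the helper names are hypothetical, and boxes are taken to be (x center, y center, width, height) in pixels, divided by the image width and height so that all values fall in [0, 1].

```python
def normalize_xywh(box, img_w, img_h):
    """Scale a pixel-space (x, y, w, h) box into the [0, 1] range."""
    x, y, w, h = box
    return (x / img_w, y / img_h, w / img_w, h / img_h)


def denormalize_xywh(box, img_w, img_h):
    """Scale a normalized (x, y, w, h) box back to pixels."""
    x, y, w, h = box
    return (x * img_w, y * img_h, w * img_w, h * img_h)


# A box centered in a 192-wide by 320-high input.
box_px = (96, 160, 48, 80)
box_norm = normalize_xywh(box_px, img_w=192, img_h=320)
print(box_norm)                              # (0.5, 0.5, 0.25, 0.25)
print(denormalize_xywh(box_norm, 192, 320))  # (96.0, 160.0, 48.0, 80.0)
```

A plausible motivation, consistent with the "reduce calibration error" commit message, is that normalized coordinates share a similar value range with the class scores, so a single int8 output scale loses less precision.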
@zldrobit I want to use the new instance segmentation model introduced by YOLOv5, how should I modify the code? |
@zldrobit, Shouldn't |
@Aandre99 Although PyTorch's default precision is fp32, the YOLOv5 model is saved in half precision once training completes. In Lines 1002 to 1015 in 915bbf2
|
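The file sizes discussed earlier in the thread (a best.pt of about 15.7 MB and an int8 TFLite file of about 8 MB) fit this explanation, as a back-of-the-envelope calculation shows. The parameter count below is an assumption (roughly 7.5 million, picked to be consistent with those sizes), and the estimate ignores file-format overhead.

```python
def model_size_mb(num_params, bytes_per_param):
    """Approximate weight storage in megabytes, ignoring file-format overhead."""
    return num_params * bytes_per_param / 1e6


# Assumed parameter count, not a measured value.
params = 7_500_000

for name, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(name, model_size_mb(params, nbytes), "MB")
```

With the checkpoint already stored at 2 bytes per weight, halving to 1 byte per weight gives roughly 7.5 MB, in line with the int8 size reported above.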
@zldrobit thanks, I finally understand now! |
Issue with loading TFLite model on Jetson Nano 2GB developer kit (without GPU)
We are currently facing an issue while attempting to load a TFLite model on our Jetson Nano device (running Jetpack 4.6.1 and TensorFlow 2.4.1) using Torch. Previously, we successfully loaded a .pt model using the following code:
model = torch.hub.load('ultralytics/yolov5:v6.0', 'yolov5s')
However, when we tried to load the TFLite version of the model with the following code:
model = torch.hub.load('ultralytics/yolov5:v6.0', 'yolov5s.tflite', force_reload=True)
we encountered the following error:
RuntimeError: Cannot find callable yolov5s.tflite in hubconf
We understand that this error might be related to compatibility issues, and we opted for version v6.0 of the YOLOv5 model for Jetson compatibility. Could you please provide guidance or insights on how to resolve this issue or any alternative approaches to load the TFLite model successfully on the Jetson Nano? Thank you for your assistance. |
@JonSnow4243 TFLite models cannot be loaded by Lines 546 to 573 in 41603da
|
Since this PR has been merged into the master branch (and some code changes), TensorFlow/TFLite models can be exported using
and validated using
After exporting TFLite models, https://github.com/zldrobit/yolov5/tree/tf-android can be used as an Android demo.
For Edge TPU model export, please refer to #3630.
Original export method (obsolete)
This PR is a simplified version of (https://github.com//pull/959), which only adds TensorFlow and TFLite export functionality.
Export TensorFlow models (GraphDef and saved model) and fp16 TFLite models using:
Export int8 quantized TFLite models using:
Run TensorFlow/TFLite model inference using:
For TensorFlow.js
Export *.pb* model with class-agnostic NMS (tf.image.non_max_suppression) using:
Convert *.pb to a tfjs model with:
Edit weights/web_model/model.json to shuffle output node order (see tensorflow/tfjs#3942):
"signature": {"outputs": {"Identity": {"name": "Identity"}, "Identity_3": {"name": "Identity_3"}, "Identity_1": {"name": "Identity_1"}, "Identity_2": {"name": "Identity_2"}}}
-> "signature": {"outputs": {"Identity": {"name": "Identity"}, "Identity_1": {"name": "Identity_1"}, "Identity_2": {"name": "Identity_2"}, "Identity_3": {"name": "Identity_3"}}}.
Deploy the model with tfjs-yolov5-example.
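The manual edit of model.json could also be scripted. The following is a minimal sketch, assuming the file has the "signature" / "outputs" layout quoted above and that output names are "Identity" optionally followed by "_N"; the function name `sort_signature_outputs` is hypothetical. It works on an in-memory dict, so reading and writing weights/web_model/model.json with `json.load` and `json.dump` is left to the caller.

```python
import json


def sort_signature_outputs(model_json):
    """Reorder signature outputs as Identity, Identity_1, Identity_2, ... in place."""
    outputs = model_json["signature"]["outputs"]

    def index(name):
        # "Identity" has no suffix and sorts first; "Identity_3" sorts by 3.
        _, _, suffix = name.partition("_")
        return int(suffix) if suffix else 0

    ordered = {name: outputs[name] for name in sorted(outputs, key=index)}
    model_json["signature"]["outputs"] = ordered
    return model_json


# The shuffled order quoted in the PR description.
doc = {"signature": {"outputs": {
    "Identity": {"name": "Identity"},
    "Identity_3": {"name": "Identity_3"},
    "Identity_1": {"name": "Identity_1"},
    "Identity_2": {"name": "Identity_2"},
}}}

fixed = sort_signature_outputs(doc)
print(json.dumps(fixed["signature"]["outputs"]))
```

Python dicts preserve insertion order, so the order written by `json.dump` matches the sorted order built here.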
EDIT:
--img from 640 to 320.
--no-tfl-detect. It is now only used in the tf-android-tfl-detect / tf-edgetpu branch for Edge TPU model export.
--tf-raw-resize option to map resize ops to EdgeTPU, which accelerates inference (not necessary with Edge TPU compiler v16).
--cfg since the config yaml is stored in the model weights now.
FAQ:
java.lang.IllegalArgumentException: Cannot copy to a TensorFlowLite tensor (serving_default_input_1:0) with 1769472 bytes from a Java Buffer with 1228800 bytes. (see Add TensorFlow and TFLite export #1127 (comment))
Can not open OpenCL library on this device - dlopen failed: library "libOpenCL.so" not found (see Add TensorFlow and TFLite export #1127 (comment))
python detect.py is only for validation purposes. If you're interested in running YOLOv5 TFLite models on Android, please refer to https://github.com/zldrobit/yolov5/tree/tf-android.
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Added TensorFlow (TF) and TensorFlow Lite (TFLite) model support to the YOLOv5 detect.py script.
📊 Key Changes
numpy as np and TensorFlow (import tensorflow as tf) in detect.py.
--tfl-int8 to enable INT8 quantized TFLite model inference.
models/tf.py to convert YOLOv5 models to TensorFlow formats.
requirements.txt to include TensorFlow as an optional dependency.
🎯 Purpose & Impact