Closed
Description
System information
- What is the top-level directory of the model you are using: Skip
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 1.8.0 gpu
- Bazel version (if compiling from source): None
- CUDA/cuDNN version: CUDA 9 cuDNN 7
- GPU model and memory: 1080Ti, 12GB
- Exact command to reproduce: No
Describe the problem
When I set the hard example miner's iou_threshold to 1.0 rather than 0.99 or 0.7 (the SSD paper uses 1.0, no OHEM), training becomes very slow. I dug into the code and found that it still runs NMS over the full set of boxes, which wastes CPU resources.
With iou_threshold = 1.0 there is no need to run NMS at all; returning all boxes is enough. Here is a pull request that fixes this problem: #4874.
Tested on a custom model I use:
Before fix: 4.5 secs/step
After fix: 0.4 secs/step
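To illustrate the reasoning, here is a minimal pure-Python sketch (not the code from the pull request, and not the TensorFlow implementation). It assumes greedy NMS suppresses a box only when its IoU with an already kept box is strictly greater than iou_threshold; since IoU never exceeds 1.0, a threshold of 1.0 suppresses nothing and the result is just the highest-scoring boxes. The names `greedy_nms` and `select_hard_examples` are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (ymin, xmin, ymax, xmax)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold):
    """Reference greedy NMS: O(n * k) IoU computations in the worst case."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_output_size:
            break
        # Assumption: a box is suppressed only if IoU > iou_threshold (strict).
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def select_hard_examples(boxes, scores, max_output_size, iou_threshold):
    """Hypothetical wrapper that skips NMS when it cannot suppress anything."""
    if iou_threshold >= 1.0:
        # IoU is at most 1.0, so no box can be suppressed:
        # sorting by score and truncating gives the same result.
        order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
        return order[:max_output_size]
    return greedy_nms(boxes, scores, max_output_size, iou_threshold)


boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7, 0.6]
fast = select_hard_examples(boxes, scores, 3, 1.0)
slow = greedy_nms(boxes, scores, 3, 1.0)
```

Under that assumption, `fast` and `slow` contain the same indices, but the shortcut never computes a single IoU.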