
Detecting small objects in large images #3884

Closed · mansi-aggarwal-2504 opened this issue Jul 5, 2021 · 10 comments
Labels: question (Further information is requested)

mansi-aggarwal-2504 commented Jul 5, 2021

I have images that are 6000 × 4000 pixels (relatively large), and the labelled objects in them are quite small. I trained the model with the --img-size parameter set to 1280, since anything higher threw RuntimeError: CUDA out of memory.

When I ran detect.py, I received zero detections, even with conf-thres at 0.1.

Any idea how I can detect very small objects in large images?

@mansi-aggarwal-2504 added the question label Jul 5, 2021

tetelevm commented Jul 5, 2021

As far as I know, YOLO has a minimum detectable object size. I would advise you to split the image into 9 overlapping tiles, e.g. (0–40%), (30–70%), (60–100%) in both width and height, and search for objects in each tile with a smaller network.
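A minimal sketch of that 3×3 tiling scheme, assuming PIL; the make_tiles helper and filenames are illustrative, not part of YOLOv5:

```python
from PIL import Image

def make_tiles(path):
    """Yield (box, crop) for 9 overlapping tiles of the image at `path`."""
    img = Image.open(path)
    w, h = img.size
    spans = [(0.0, 0.4), (0.3, 0.7), (0.6, 1.0)]  # overlapping thirds
    for y0, y1 in spans:
        for x0, x1 in spans:
            # PIL crop box is (left, upper, right, lower) in pixels
            box = (int(x0 * w), int(y0 * h), int(x1 * w), int(y1 * h))
            yield box, img.crop(box)

for (left, top, _, _), tile in make_tiles("large_image.jpg"):
    tile.save(f"tile_{left}_{top}.png")
```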

@mansi-aggarwal-2504

I also had that idea, but do you know of any way I could adjust the labels to match the cropped images? I have ground-truth labels (XML and TXT) for the full images.


Melhaya commented Jul 5, 2021

I also cropped the images I am using for training. Reverse the normalization of the labels, adjust the x-center and y-center according to the crop you made, and then normalize them again. It may take a couple of tries to get right, but it works. Just make sure you visualize the new labels on your cropped images as a sanity check. :)
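A minimal sketch of that denormalize → shift → renormalize step, assuming YOLO-format labels (class, x-center, y-center, width, height, all normalized to 0–1); names are illustrative, and clipping of boxes that straddle a crop edge is omitted:

```python
def adjust_label(label, img_w, img_h, left, top, crop_w, crop_h):
    """Map one YOLO label from full-image coordinates to crop coordinates.

    Returns None when the box center falls outside the crop.
    """
    cls, xc, yc, w, h = label
    # 1. Reverse the normalization: back to pixels in the full image.
    xc, yc, w, h = xc * img_w, yc * img_h, w * img_w, h * img_h
    # 2. Shift the center by the crop offset.
    xc, yc = xc - left, yc - top
    if not (0 <= xc <= crop_w and 0 <= yc <= crop_h):
        return None
    # 3. Normalize again, now relative to the crop size.
    return cls, xc / crop_w, yc / crop_h, w / crop_w, h / crop_h
```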


glenn-jocher commented Jul 5, 2021

@tetelevm @mansi-aggarwal-2504 @Melhaya actually, a simpler solution might be to train on native-resolution crops and then run inference at batch-size 1 at native resolution. You can do this using the new Albumentations integration in #3882, e.g. with A.RandomCrop(width=1280, height=1280).
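For reference, a standalone Albumentations sketch of such a crop with YOLO-format boxes; in YOLOv5 itself the transform lives inside the #3882 integration rather than in user code, and the placeholder arrays here are illustrative:

```python
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.RandomCrop(width=1280, height=1280)],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

image = np.zeros((4000, 6000, 3), dtype=np.uint8)  # placeholder for a real image
bboxes = [[0.5, 0.5, 0.01, 0.01]]                  # one small normalized box
class_labels = [0]

# Boxes are remapped to the crop; ones falling outside it are dropped.
out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
cropped_image, cropped_boxes = out["image"], out["bboxes"]
```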

@mansi-aggarwal-2504

Hi @glenn-jocher, thank you for this tip.
Just wanted to clarify: when you said "train on native resolution crops", are you still suggesting I crop the 6000×4000 images into smaller images (say, 6 smaller images) and also run inference on similarly cropped images?

@glenn-jocher

@mansi-aggarwal-2504 I wouldn't crop anything. Creating that pipeline is a large development expense.

@mansi-aggarwal-2504

@glenn-jocher right. So you're saying I can train on the 6000×4000 images directly and run inference at batch-size 1 on images of the same resolution, benefiting from the new Albumentations integration.
Just making sure I understand you correctly.

@glenn-jocher

@mansi-aggarwal-2504 yes, you can train at full size with random crops and then run inference at full size.
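A sketch of full-resolution inference through YOLOv5's PyTorch Hub interface; the weights path is illustrative, and 6016 is assumed here as the nearest stride-32 multiple above 6000:

```python
import torch

# Load the trained weights through the YOLOv5 PyTorch Hub interface.
model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")

# Run single-image (batch-size 1) inference near native resolution.
results = model("large_image.jpg", size=6016)
results.print()
```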

@mansi-aggarwal-2504

Thank you for your help @glenn-jocher!


fcakyon commented Sep 19, 2021

@mansi-aggarwal-2504 @Melhaya @tetelevm you can perform sliced inference with smaller image sizes and lower GPU memory usage on a YOLOv5 model using the SAHI package: https://github.com/obss/sahi
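A sketch based on SAHI's documented API (exact names may differ across SAHI versions; paths and thresholds are illustrative):

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov5",
    model_path="runs/train/exp/weights/best.pt",  # illustrative path
    confidence_threshold=0.25,
    device="cuda:0",
)

# Slice the large image into overlapping 1280px windows, run the model on
# each window, and merge the detections back into full-image coordinates.
result = get_sliced_prediction(
    "large_image.jpg",
    detection_model,
    slice_height=1280,
    slice_width=1280,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
result.export_visuals(export_dir="runs/sahi/")
```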
