For my chapGPT I needed to find one or more caps in the image that the user gives to the bot. The most widely used object detection models don't recognize caps: they have classes like "bottle", "hand" or "chair", but not a bottle cap.
To achieve this, I made my own version of YOLOv5.
You can find how I made it, step by step, here.
General, official documentation references:
- Official Ultralytics YOLOv5 documentation: https://docs.ultralytics.com/yolov5/
- YOLOv5 repo: https://github.com/ultralytics/yolov5
- Train with Custom Data: https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/

- Label my caps. I selected 77 images (see an example in the `datasets` folder) and labeled them with LabelImg, with the output format set to "YOLO".
- Create the folders `/images`, with all the images, and `/labels`, with all the labels.
- Create the files `test.txt`, `train.txt` and `val.txt`, listing the paths of the images used for training (37), testing (24) and validation (16).
- Use/clone the YOLOv5 repo to train/fine-tune the model.
- Create a file `data.yaml` in the [`data` folder of the repo](https://github.com/ultralytics/yolov5/tree/master/data). Check the format in `mycaps.yaml`.
- I opened the repo in my `.devcontainer` container, because it works for me and detects my GPU(s) correctly.
- The `datasets` folder needs to be in the same workspace as the `yolo5` directory, at the same level.

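The three split files from the steps above can be generated with a short script. This is a minimal sketch, not the script used for this project: the function names `split_paths` and `write_split_files`, the fixed seed and the random assignment are assumptions. Only the 37/24/16 sizes come from the text.

```python
import random
from pathlib import Path


def split_paths(paths, n_train=37, n_test=24, n_val=16, seed=0):
    """Shuffle image paths and cut them into train/test/val lists."""
    paths = sorted(str(p) for p in paths)
    if len(paths) != n_train + n_test + n_val:
        raise ValueError("split sizes must add up to the number of images")
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(paths)
    return {
        "train": paths[:n_train],
        "test": paths[n_train:n_train + n_test],
        "val": paths[n_train + n_test:],
    }


def write_split_files(splits, out_dir):
    """Write one image path per line to train.txt, test.txt and val.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, items in splits.items():
        (out_dir / f"{name}.txt").write_text("\n".join(items) + "\n")
```

A possible call would be `write_split_files(split_paths(Path("datasets/images").glob("*.jpg")), "datasets")`, where the `.jpg` extension and the output location are assumptions about the folder layout.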
Train. I executed:
$ python train.py --data caps.yaml --weights yolov5s.pt --img 500
where `caps.yaml` is the file in the repo with the same name. As the base I used the second-smallest model, `yolov5s`. My images are 500x500, but the image size must be a multiple of 32, so it is automatically updated to 512.
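The change from 500 to 512 is a round-up to the next multiple of the stride. A minimal sketch of that arithmetic, assuming a stride of 32 (the helper name `round_up_to_stride` is hypothetical and is not the YOLOv5 function itself):

```python
import math


def round_up_to_stride(img_size: int, stride: int = 32) -> int:
    """Return the smallest multiple of `stride` that is >= img_size."""
    return math.ceil(img_size / stride) * stride


print(round_up_to_stride(500))  # 512
print(round_up_to_stride(512))  # 512, already a multiple of 32
```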
I left the other hyperparameters at their defaults. The run used:
- weights=yolov5s.pt (YOLOv5 small)
- epochs=100
- batch_size=16
- imgsz=500
- optimizer=SGD
- lr0=0.01
- lrf=0.01
- momentum=0.937
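To see how `lr0` and `lrf` relate: `lr0` is the initial learning rate and `lrf` is the fraction of it that remains at the end of training, so the final rate is `lr0 * lrf`. A rough sketch, assuming a linear decay between the two and ignoring warmup (the function name `linear_lr` is made up for illustration; check `train.py` in the repo for the schedule actually applied):

```python
def linear_lr(epoch: int, epochs: int = 100, lr0: float = 0.01, lrf: float = 0.01) -> float:
    """Learning rate at `epoch`, decaying linearly from lr0 to lr0 * lrf."""
    # The factor goes from 1.0 at epoch 0 down to lrf at epoch == epochs.
    factor = (1 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 50, 100):
    print(epoch, linear_lr(epoch))
```

With the values listed above, the rate would start at 0.01 and end near 0.0001.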
It took 6 minutes and 18 seconds in total: 100 epochs completed in 0.105 hours.
In my case, both the best and the last weights were saved under run `exp11`:
- Optimizer stripped from runs/train/exp11/weights/last.pt, 14.4MB
- Optimizer stripped from runs/train/exp11/weights/best.pt, 14.4MB
Precision (P) = 0.9407, recall (R) = 0.8817.
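Precision and recall can be combined into a single F1 score, the harmonic mean of the two. This is a value derived here from the two numbers above, not one reported by the training run, and the function name `f1_score` is illustrative:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(0.9407, 0.8817), 4))  # 0.9102
```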
Check the model in the `best_run` folder.