purpose: correctly output bounding boxes around the objects and predict what the objects are. Here Im trying to reimplement the YOLOv1 from scratch on the PascalVOC dataset available on Kaggle in the following link: https://www.kaggle.com/dataset/734b7bcb7ef13a045cbdd007a3c19874c2586ed0b02b4afc86126e89d00af8d2
I forked the code is made by @Aladdin on this dataset (the actual code with the details are available), and tried to rewrite, implement and retrain it on the provided dataset. This is a PyTorch-based code and works properly. For the 100 examples, it gives 0.93 mean average precision with the loss of 28.58.