mtYOLO: A multi-task model to concurrently obtain the vital characteristics of individuals or animals
This is the official repository for
[2024 IEEE ICME Application/Industry Paper] mtYOLO: A multi-task model to concurrently obtain the vital characteristics of individuals or animals
Kian Eng ONG, Sivaji RETTA, Ramarajulu SRINIVASAN, Shawn TAN, Jun LIU
Information Systems Technology and Design, Singapore University of Technology and Design, Singapore
AnimalEYEQ Private Limited
[Coming Soon] 2024 IEEE International Conference on Multimedia and Expo (ICME) Official Application/Industry Paper
[Coming Soon]
In multi-task learning, a model learns from multiple related tasks at the same time. Such a model is especially useful in practical real-world applications (e.g., autonomous driving, precision livestock farming), as it can perform inference for multiple tasks concurrently. In this work, we present mtYOLO, a single unified multi-task YOLOv8 model that is trained end-to-end and simultaneously produces outputs for all the vital characteristics (e.g., size, keypoints) of a person or animal. Our experiments show that our multi-task YOLOv8 model takes less time to train and performs better than the corresponding single-task models. Although the learning of various tasks can mutually benefit one another during training and improve performance, the tasks may sometimes conflict with one another and degrade model performance. Hence, to further enhance the feature extraction capability of the multi-task model and allow it to learn better features across tasks, we incorporate the Efficient Channel Attention (ECA) mechanism into our unified multi-task model architecture. The ECA mechanism dynamically assigns larger weights to more important information and smaller weights to less relevant information. Our experiments show that ECA improves the model's performance without substantially increasing compute time. Our codes can be found at https://github.com/AnimalEyeQ/mtYOLO.
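As a concrete illustration of the mechanism described above, below is a minimal PyTorch sketch of ECA following Wang et al. (2020); the module and its kernel-size heuristic are written from the paper's description rather than copied from this repository's code.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Minimal sketch of Efficient Channel Attention (Wang et al., 2020).

    Global average pooling summarises each channel, a 1D convolution models
    local cross-channel interaction, and a sigmoid produces per-channel
    weights that rescale the input feature map.
    """

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Kernel size adapts to the channel count (heuristic from the paper).
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, C, 1, 1) channel descriptors.
        y = self.pool(x)
        # Treat channels as a 1D sequence for local cross-channel interaction.
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        # Larger weights for informative channels, smaller for less relevant ones.
        return x * self.sigmoid(y)
```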
- MS-COCO Person Multi-Task
- CattleEyeView dataset
  - Download images from https://github.com/AnimalEyeQ/CattleEyeView
  - Multi-task annotations can be found in `./data/CattleEyeView`
- The dataset configuration files can be found in `./config/dataset/cattleeyeview_multitask.yaml` and `./config/dataset/coco_multitask.yaml` (a rough sketch of such a file follows this list).
- Instructions to modify the configurations can be found in each file.
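For orientation, a multi-task dataset configuration in the usual ultralytics style might look roughly like the sketch below; the paths, class names, and keypoint shape here are placeholders, and the actual keys and values are documented in the files above.

```yaml
# Hypothetical sketch in the usual ultralytics dataset-YAML style;
# see ./config/dataset/cattleeyeview_multitask.yaml for the real keys.
path: ./data/CattleEyeView    # dataset root directory
train: images/train           # training images, relative to `path`
val: images/val               # validation images, relative to `path`

names:
  0: cattle                   # class index to name

kpt_shape: [17, 3]            # placeholder: keypoints per instance, (x, y, visibility)
```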
- Run the following commands to install mtYOLOv8:

  ```bash
  cd ultralytics
  pip install -r requirements.txt
  ```
- The mtYOLOv8 model configuration file, together with instructions to create other configuration files (e.g., pose, segment, without ECA), can be found in `./config/model/yolov8_multitask_cattleeyeview_ECA.yaml`.
- The code and instructions to train, validate or predict can be found in `mtYOLO.ipynb`.
- The trained mtYOLOv8 with ECA models for MS-COCO Person Multi-Task and CattleEyeView can be found in `./model_checkpoint` (an illustrative usage sketch follows this list).
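For orientation, the sketch below shows how these pieces might fit together, assuming the fork keeps the standard ultralytics Python API; `mtYOLO.ipynb` remains the authoritative reference, and the checkpoint filename and image path here are placeholders.

```python
# Hypothetical end-to-end usage following the standard ultralytics
# Python API; mtYOLO.ipynb is the authoritative reference for this fork.
from ultralytics import YOLO

# Build the multi-task model with ECA from its configuration file.
model = YOLO("./config/model/yolov8_multitask_cattleeyeview_ECA.yaml")

# Train end-to-end on the multi-task dataset (argument values are examples).
model.train(data="./config/dataset/cattleeyeview_multitask.yaml",
            epochs=100, imgsz=640)

# Validate on the validation split defined in the dataset configuration.
metrics = model.val()

# Alternatively, load a released checkpoint (placeholder filename) and
# predict all task outputs for an image in one forward pass.
model = YOLO("./model_checkpoint/cattleeyeview_mtyolov8_eca.pt")
results = model.predict("path/to/image.jpg", save=True)
```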
We would like to express our gratitude to:
- ultralytics for the YOLOv8 codes
- @yermandy for the MS-COCO Person Multi-Task dataset and multi-task codes
- Efficient Channel Attention by Wang et al. (2020) and YOLOv8-AM by Chien et al. (2024) for the ECA codes