An annotated dataset for automated detection and counting of estuarine fish in poor visibility conditions
Here we provide an open-access, annotated baited underwater video dataset of poor- and fair-visibility footage for the development of fish detection models and the benchmarking of image pre-processing tools. We provide the annotated training images with their annotations, and a 12-hour testing dataset with groundtruth MaxN abundance for four target species.
This dataset can be used:
- As a computer vision training dataset for monitoring estuarine fish on the eastern coast of Australia.
- As a benchmark dataset to test image pre-processing techniques (e.g. colour correction).
- As a benchmark dataset to test image post-processing techniques (e.g. fish occlusion filters).
- To supplement global fish detection models (e.g. see MegaDetector by Microsoft).
- To increase accessibility of underwater computer vision tools for aquatic monitoring and environmental science (e.g. see lilascience).
We provide access to two datasets: a training dataset and a testing dataset.
The training dataset is a fully annotated dataset that contains images, annotations and labels of various fish species. It includes videos recorded from 2017 to 2021 in Moreton Bay, Australia, across poor-visibility Secchi depths (2-5 m), from a standard baited underwater video rig with GoPro cameras recording at 1080p.
The testing dataset includes several non-annotated videos from the same location, visibility scenarios and period as the training dataset. The testing dataset can be used to evaluate computer vision fish detection models. The groundtruth is a CSV file containing manual maximum abundance (MaxN) counts of each fish species in each video. The maximum number of individuals per video was manually determined by researchers at the Moreton Bay Environmental Education Centre.
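MaxN is the maximum number of individuals of a species visible in any single frame of a video. A minimal sketch of how model output could be reduced to MaxN for comparison against the groundtruth CSV (the detection tuples and their field order are illustrative, not the dataset's actual schema):

```python
from collections import Counter

# Hypothetical per-frame detections as (video, frame, species) tuples;
# in practice these would come from a detector run over each video.
detections = [
    ("v1", 10, "Australasian Snapper"),
    ("v1", 10, "Australasian Snapper"),
    ("v1", 11, "Australasian Snapper"),
    ("v1", 12, "Australasian Snapper"),
    ("v2", 5,  "Smallmouth Scad"),
]

# Count individuals of each species in each frame.
per_frame = Counter((video, species, frame)
                    for video, frame, species in detections)

# MaxN: for each (video, species), the maximum per-frame count.
maxn = {}
for (video, species, frame), count in per_frame.items():
    key = (video, species)
    maxn[key] = max(maxn.get(key, 0), count)

print(maxn)
# e.g. {('v1', 'Australasian Snapper'): 2, ('v2', 'Smallmouth Scad'): 1}
```

The resulting per-video, per-species maxima can then be joined against the groundtruth counts to score a model.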
The training and testing dataset were collected by the Moreton Bay Environmental Education Centre.
The training dataset contains >65,000 segmentation mask annotations of 19 different estuarine fish species from Moreton Bay, Australia. We targeted four species for studies conducted at the Global Wetlands Project; these species therefore have a larger number of annotations. We suggest caution when using annotations of the non-targeted species, as these were variably annotated across the dataset. Please contact Sebastian Lopez-Marcano for more information.
Species | Num Annotations | Targeted species |
---|---|---|
Australasian Snapper | 9,489 | YES |
Bengal Sergeant | 277 | NO |
Black-Banded Trevally | 89 | NO |
Blue Catfish | 2,411 | NO |
Blue Swimmer Crab | 847 | NO |
Eastern Striped Grunter | 14,631 | NO |
Eastern Stripey | 307 | NO |
Echinoderm | 14 | NO |
Fanbelly Leatherjacket | 190 | NO |
Gunthers Wrasse | 603 | NO |
Mackerel spp | 139 | NO |
Moses Snapper | 53 | NO |
Paradise Threadfin Bream | 10,658 | YES |
Pinkbanded Grubfish | 502 | NO |
Pomacentrid spp | 27 | NO |
Remora spp | 41 | NO |
Smallmouth Scad | 7,067 | YES |
Smooth Golden Toadfish | 5,014 | YES |
Yellowfin Bream and Tarwhine | 11,872 | NO |
Dataset | Raw Videos | Raw Images | Version | Num Annotations | Annotations (CSV/JSON) |
---|---|---|---|---|---|
Training dataset | NA | Download 555 MB | 7 | 8,696 | Download 19 MB |
Testing dataset | Download 6.5 GB | NA | 1 | NA | Groundtruth |
Each annotation is an object instance annotation consisting of the following key fields:
- Labels, provided as a common name (e.g. YellowfinBream for Acanthopagrus australis).
- Bounding boxes that enclose the species in each frame, provided in "[x, y, width, height]" format, in pixel units.
- Segmentation masks that outline the species as a polygon, provided as a list of pixel coordinates in the format "[x, y, x, y, ...]".
The corresponding image is identified by an image filename. All image coordinates (bounding boxes and segmentation masks) are measured from the top-left image corner and are 0-indexed.
Annotations are provided in both CSV and COCO JSON formats; COCO JSON is a commonly used data format for integration with object detection frameworks including PyTorch and TensorFlow. For more information on annotation files in COCO JSON and/or CSV formats, go here.
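A minimal sketch of reading the annotation fields described above from COCO JSON with only the standard library. The inline dictionary below mimics the COCO layout; the file name in the comment and the specific values are illustrative assumptions, not taken from the dataset:

```python
import json

# In practice, load the downloaded annotations file, e.g.:
#   with open("annotations.json") as f:
#       coco = json.load(f)
# Here we inline a tiny example in the same COCO structure.
coco = {
    "images": [{"id": 1, "file_name": "frame_0001.jpg"}],
    "categories": [{"id": 1, "name": "YellowfinBream"}],
    "annotations": [{
        "id": 1, "image_id": 1, "category_id": 1,
        "bbox": [120, 80, 60, 40],  # [x, y, width, height], pixels
        "segmentation": [[120, 80, 180, 80, 180, 120, 120, 120]],
    }],
}

# Map category ids to the common-name labels.
cat_names = {c["id"]: c["name"] for c in coco["categories"]}

# Convert COCO's [x, y, width, height] (0-indexed, top-left origin)
# to corner format [x1, y1, x2, y2], which many frameworks expect.
def bbox_to_corners(bbox):
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

ann = coco["annotations"][0]
print(cat_names[ann["category_id"]], bbox_to_corners(ann["bbox"]))
# YellowfinBream [120, 80, 180, 120]
```

The same structure round-trips through `json.dumps`/`json.loads`, so the snippet behaves identically on a real annotations file with these fields.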
Please use 'CITATION.cff' to cite this dataset.
We kindly request that the following text be included in an acknowledgements section at the end of your publications:
"We would like to thank the Moreton Bay Environmental Education Centre for freely supplying us with the fish dataset for our research. The fish dataset was supported by an AI for Earth grant from Microsoft."