How can we best prepare for the fall 2024 hackathon? #142

scottveirs · 2024-04-27T02:12:47Z · Maintainer

pastorep · 2024-04-28T00:34:11Z · Maintainer

Thanks so much for raising this! As I pull together our ML resources (documentation, process, platform, etc.), I'll get a better idea of goals from the ML perspective.

0 replies

scottveirs · 2024-05-03T17:51:06Z · Maintainer Author

Any thoughts at this juncture, Val @veirs or Dave @dbainj1?

For fun, I'll invite the HALLO crowd to chime in as well. Perhaps we can entrain some Canadian participants in the Microsoft hackathon this September?

0 replies

dbainj1 · 2024-05-03T18:54:14Z · Maintainer

Scott, I'd welcome participation by the Canadians.

My current issue is the login process. With Microsoft's new security protocol, it takes me about 40 seconds to log in. Being able to stay logged in for longer would be really helpful. Being logged in permanently on a trusted device would be even better.

Retraining the model would be another good activity. We're up to 600 minutes with confirmed detections. We also have over 4200 minutes with false positives. That should be an adequate sample to improve the model and to start thinking about hydrophone-specific models.

A "heartbeat" monitor would be really helpful. I think a confidence level gets calculated every minute for each hydrophone. Having those values reported out so we can get an idea of whether things are working or not would be valuable. It may also help us figure out how to set up an alarm system for when things aren't working. As the system is now, not receiving notifications could mean the system is working perfectly and not generating any false positives, or that it's not working at all.

On the requests for review, it would be helpful to include the hydrophone site. E.g., a high percentage of the notifications from Point Robinson are false positives, while notifications from Orcasound Lab are more likely to be true positives. That could affect how quickly we try to get to reviewing a tentative detection.

There seems to be a 20-25 minute delay between detection and notification. For now, that's not a problem, but once we link to ship notifications, minimizing that delay will be essential. Automating that link into the system may be timely at the next Hackathon.

Updating our list of reviewers and end users would be another simple but valuable task.

While there's room for improvement, I think the most valuable thing we can be doing is getting more hydrophones in and maintaining the ones we've got. If we make progress in that arena, incorporating the new hydrophones into OrcaHello would be another useful task.

Those are the things that immediately come to mind. As I think of more things, I'll let you know.
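
To make the "heartbeat" idea a bit more concrete, here is a minimal sketch of the alarm check. It assumes that the time of the most recent per-minute confidence report is available for each hydrophone; the `last_report` mapping, the `find_silent_hydrophones` helper, and the 5-minute tolerance are all hypothetical and not part of the current system.

```python
from datetime import datetime, timedelta, timezone


def find_silent_hydrophones(last_report, now, max_silence=timedelta(minutes=5)):
    """Return the hydrophones whose latest confidence report is older than max_silence.

    last_report maps a hydrophone name to the timestamp of its most recent
    confidence value (a hypothetical data structure for illustration).
    """
    return sorted(
        name for name, reported_at in last_report.items()
        if now - reported_at > max_silence
    )


# Hypothetical example: one site reported a minute ago, the other 12 minutes ago.
now = datetime(2024, 5, 3, 19, 0, tzinfo=timezone.utc)
last_report = {
    "Orcasound Lab": now - timedelta(minutes=1),
    "Point Robinson": now - timedelta(minutes=12),
}

silent = find_silent_hydrophones(last_report, now)
print(silent)  # ['Point Robinson']
```

With a check like this running on a schedule, silence from a site would raise an alarm, so a lack of notifications could be read as "no detections" instead of "possibly not working".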

…

--Dave


0 replies

bnestor · 2024-05-03T19:31:35Z · Maintainer

Hi,
There are a few options for assessing the performance of classifiers:

Assessing performance for Future Hackathons

DeepAL Compare dataset (on NRKW) from this work: https://www.nature.com/articles/s41598-022-26429-y (I think the data are only accessible after signing a data use agreement, so it is not quite scalable.)
We have 400 labelled examples from 2023 data at 2 locations within the ONC network. They will be released shortly on Hugging Face as well. They are simply labelled as marine mammal present/absent, so they include SRKW, NRKW, transients, humpbacks, and sea lions. They were labelled by Jasper Kanes at ONC as well as 3 amateur labellers, whose Cohen's kappa values were 0.714, 0.580, and 0.229 when compared to the expert annotation.
Paperswithcode hosts leaderboards for classification on specific datasets. Authors can upload their papers and results so that they are indexed somewhere: https://paperswithcode.com/area/audio
Other managed benchmarks like Kaggle and WILDS have their own managed leaderboards: https://wilds.stanford.edu/leaderboard/
The data could also be included in DCASE task 5. It is an annual challenge/workshop that has bioacoustic event detection datasets across species: https://dcase.community/challenge2023/task-few-shot-bioacoustic-event-detection-results
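
For anyone unfamiliar with the agreement statistic quoted for the ONC labels, here is a minimal sketch of how Cohen's kappa can be computed for presence/absence labels. The `cohens_kappa` helper and the two label lists are made up for illustration; they are not the actual ONC annotations.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of clips where both annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability of agreeing if each annotator labelled at random
    # with their own label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels: 1 = marine mammal present, 0 = absent.
expert = [1, 1, 0, 0, 1, 0, 1, 0]
amateur = [1, 0, 0, 0, 1, 0, 1, 1]
print(cohens_kappa(expert, amateur))  # 0.5
```

In this toy case the annotators agree on 6 of 8 clips (0.75), chance agreement is 0.5, and kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.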

Benchmarking Multiple Models

I have benchmarked the following models for accuracy and computational performance across DeepAL Compare, the 400 ONC labels, and ~2 years of Ocean Observatories Initiative data:
PAMGuard LDA: Only able to run on DeepAL Compare due to computational cost.
PAMGuard/ROCCA: Performs somewhat above random, but takes 10 minutes to make a prediction on 5 minutes of data when you are not selecting the contours manually.
ANIMAL-SPOT: (both trained on NRKW DeepAL only, and fine-tuned on Orcasound/ONC data) This is a ResNet-style spectrogram classifier. It performed well on the short DeepAL Compare segments it was designed for, but did not generalise well to other datasets.
Wav2vec2/HuBERT Fine-tuned: These are state-of-the-art speech recognition models from Facebook AI (now Meta AI). They are trained on 16 kHz human speech, so they do not do too well on orcas.
Whisper: This is OpenAI's model. It performs competitively on all of the test sets and is pretty simple to train with Hugging Face.
Wav2vec2 U Pretrain then Fine-tune: When wav2vec2 is pretrained on 32 kHz cetacean data, it performs competitively. Compared to Whisper, it is nearly twice as fast, but requires 3x more power, 3x more GPU memory, and slightly more RAM.

My current suggested workflow is to run an efficient classifier on all data. This would allow us to discard the roughly 80% of samples that have very low probability. Samples with a higher probability can then be sent to a server for classification with a better-performing model, for confirming detection, species classification, individual classification, call type classification, etc.
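
As a rough sketch of that two-stage workflow: a cheap scorer runs on every clip, and only clips above a threshold are passed to the expensive model. The `cascade` function, the 0.2 threshold, and the toy scorers below are assumptions for illustration; none of them are the benchmarked models.

```python
def cascade(clips, cheap_score, expensive_classify, threshold=0.2):
    """Run a cheap scorer on every clip; only clips scoring at or above
    threshold are passed on to the expensive classifier."""
    results = []
    for clip in clips:
        probability = cheap_score(clip)
        if probability < threshold:
            continue  # very low probability: discard without further work
        results.append((clip, probability, expensive_classify(clip)))
    return results


# Toy stand-ins for the two models (hypothetical file names and scores).
scores = {"a.flac": 0.05, "b.flac": 0.9, "c.flac": 0.1, "d.flac": 0.4, "e.flac": 0.02}


def cheap_score(clip):
    return scores[clip]


def expensive_classify(clip):
    return "orca" if scores[clip] > 0.5 else "uncertain"


kept = cascade(list(scores), cheap_score, expensive_classify)
print(kept)  # [('b.flac', 0.9, 'orca'), ('d.flac', 0.4, 'uncertain')]
```

The threshold would need to be tuned on labelled data so that the first stage discards most negatives while missing as few true detections as possible.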

Future considerations

I've also ventured into species classification. The unsupervised Wav2Vec2 model clearly separates humpbacks from orcas. The transient orca calls completely overlap with a subset of southern resident calls, but the rest of the southern resident calls do not overlap with either transient or humpback calls. With a bit of supervised learning, I anticipate that we will be able to separate them completely. Right now I need some more annotated data before I train something I would depend upon (especially for offshore and northern residents).

The full 1603 hours of vocalisation data I have collected from Orcasound and ONC are on Hugging Face. These data can be streamed from Hugging Face for inference without the need to download the ~600 GB of data locally. I haven't uploaded any negative examples, since there are far too many, but I have noted the files which I have evaluated and deemed not to contain cetaceans. Note: the ONC data are not public yet, as I am still shifting around the labels. Once I get them to a stable place, I will make them public and switch over to version control.

I will upload all my pre-trained and fine-tuned models to Hugging Face once I finish up the tuning and evaluation.

I don't have too many comments on the infrastructure side of things, but I do have a preference for having a near real-time folder of 5-minute long FLAC files I can sync from the cloud.

3 replies

scottveirs Jul 20, 2024
Maintainer Author

@pastorep this may be worth reviewing and possibly turning into some new issues (stretch goals?) for the 2024 hackathon's ML team.

paulcretu Sep 15, 2024
Maintainer

> I do have a preference for having a near real-time folder of 5-minute long FLAC files I can sync from the cloud

This is good to know! When you say FLAC, @bnestor, do you mean the actual lossless source audio, or just the lossy AAC-encoded stream audio transcoded to FLAC for easier handling / ingest into models?

bnestor Sep 15, 2024
Maintainer

@paulcretu Either way works. I am not too picky. It would be nice to be able to rapidly download individual files. ONC has lossless files available in FLAC or WAV in 5-minute increments using an API (token required). OOI has lossless 5-minute files in "mseed" format that can be retrieved with wget.

KSJasperK · 2024-05-03T20:31:28Z


I would really love to see a hackathon tackle the HB/KW problem some day! I'm not sure how much this comes up in Orcasound data, but in ONC data it's a huge issue. Looking for orcas in "orca" detections in our data can feel like looking for a needle in a humpback haystack.

Also a quick add-on to Bret's comment about labels: what Bret will be submitting was labelled as presence/absence, but I couldn't help myself from retaining species information for my own records. So all of those labels I produced exist as species labels too, and I'd be happy to share them if someone might find them useful. I'm also sitting on a mountain of humpback labels from other ONC hydrophones that I've produced while searching through "orca detections" to look for Bigg's. I figured one man's signal is another man's noise and someone might want them!

0 replies
