
Custom Classifiers


We developed mtriage to make it easy to apply pieces of code, specified as analysers, to videos, images, and audio, enabling us to analyse media algorithmically at scale. One common way we use mtriage is to find images that depict useful content using computer vision classifiers and object detection algorithms. Once we've analysed media in this way, we use mtriage-viewer to present the results so that researchers can browse large volumes of videos and images more effectively. You can watch our Battle of Ilovaisk investigation for a full visual explanation of this workflow.

While mtriage provides the infrastructure to orchestrate media analysis at scale, it doesn't predetermine how you analyse the input media. In other words, it allows you to plug in any classifier, or indeed any other function that takes images or video as input. Using mtriage as infrastructure, we are now beginning to build a 'model zoo' of classifiers that are useful in open source research, so that our researchers, as well as other journalists and agencies, can use computer vision to expedite their research and find relevant media more easily.
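
To make the "plug in any function" idea concrete, here is a minimal sketch of the kind of function that can sit at the analysis step: anything that maps a frame (or a whole video) to a set of scores will do. The names `classify_frame` and `analyse_frames` are hypothetical and are not the actual mtriage analyser interface; see the mtriage repository for how analysers are really declared.

```python
from pathlib import Path
from typing import Dict, List

def classify_frame(image_path: Path) -> Dict[str, float]:
    """Hypothetical classifier: map one image to label -> confidence scores.
    Any model (a CNN, an object detector, ...) can sit behind this signature."""
    # ... run your model on image_path here ...
    return {"tear_gas_canister": 0.12, "background": 0.88}

def analyse_frames(frame_dir: Path, label: str, threshold: float = 0.5) -> List[Path]:
    """Apply the classifier to every extracted frame and keep frames where
    `label` scores at or above `threshold`."""
    return [
        frame
        for frame in sorted(frame_dir.glob("*.jpg"))
        if classify_frame(frame).get(label, 0.0) >= threshold
    ]
```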

This is where we are asking for your help: to train a range of classifiers that identify objects or occurrences of interest to human rights investigators. Below is a list of classifiers that would hugely benefit our research. The list is by no means exhaustive, and we welcome additional ideas for objects or contexts of interest. If you would like to help train a classifier, or have a request for a particular kind of classifier, please start a conversation on the 'ml-classifiers' channel on our Discord.

Classifiers in progress

  • Image classification/object detection for tear gas canisters (in general).
    • We are actively looking for volunteers to help label training images online. Please write to @breezykermo on Discord.
    • We have ~2000 synthetic training images (photorealistically rendered using Unreal Engine) of the Triple Chaser gas canister, and we are currently working to render synthetic datasets for other kinds of tear gas canisters to assist in training. A rough training sketch follows this list.
  • Audio detection of gunfire and/or gas canister fire.
    • We have an analyser that extracts the audio from a video; our intention is to use this classifier on that audio to estimate the timestamps at which something is fired in YouTube videos and livestreams.
    • Simple probabilistic detection is a good place to start and would already be useful (see the audio sketch after this list). The next step would be to see whether different types of fire can be distinguished by their audio signatures.
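
As a concrete starting point for the canister classifier, below is a rough transfer-learning sketch in PyTorch. It assumes the synthetic renders have been exported into a folder-per-class layout; the directory name `synthetic_canisters` and the class names are placeholders, and a full object detection pipeline (bounding boxes rather than whole-image labels) would look different.

```python
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms

# Hypothetical layout: synthetic_canisters/{triple_chaser,background}/*.png
DATA_DIR = "synthetic_canisters"

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder(DATA_DIR, transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

# Start from an ImageNet-pretrained backbone and replace the final layer
# with a head for our classes (here: canister vs. background).
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, len(dataset.classes))

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```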
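
For the audio task, the "simple probabilistic detection" first pass might look like the sketch below, which uses librosa's onset detection to surface candidate timestamps in audio that has already been extracted from a video. The file name, threshold, and heuristic are assumptions; distinguishing weapon types by their signatures would need a trained model on top of this.

```python
import librosa
import numpy as np

def candidate_fire_times(audio_path: str, min_strength: float = 0.5):
    """Rough first pass: flag loud, impulsive onsets in extracted audio as
    candidate gunfire / canister-fire timestamps. This is not a trained
    classifier, just a probabilistic pre-filter to narrow down where to look."""
    y, sr = librosa.load(audio_path, sr=None, mono=True)

    # Onset strength envelope: peaks correspond to sudden bursts of energy.
    env = librosa.onset.onset_strength(y=y, sr=sr)
    onsets = librosa.onset.onset_detect(onset_envelope=env, sr=sr, units="time")

    # Keep only onsets whose normalised strength is high, as a crude proxy
    # for "loud impulsive event".
    strengths = env / (env.max() + 1e-9)
    frame_times = librosa.frames_to_time(np.arange(len(env)), sr=sr)
    return [
        t for t in onsets
        if strengths[np.argmin(np.abs(frame_times - t))] >= min_strength
    ]

# e.g. candidate_fire_times("extracted_audio.wav") -> [12.4, 13.1, 87.6, ...]
```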

Other Ideas

Write to @breezykermo on our Discord with your ideas and requests.