deeptimahesh/cricket-vision

cricket-vision

Exploring CV concepts with cricket visual data

1. Audio Extractions

  • To classify the acoustic events in a stream of sound recorded during a cricket shot, the sound needs to be segmented into small windows to which a machine learning model is applied.
  • Traditional acoustic analysis systems often partition the sound into equally sized windows starting from the beginning of the sound recording. However, since cricket events are sparsely distributed over time and usually have a very short duration, this approach may cause the sound of interest to span two different windows, which degrades the sound classification accuracy. To tackle this problem, we implement a peak detection approach that detects the sound of interest and isolates it within a single window.
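The peak-isolation idea above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: it assumes the loudest short frame (by summed squared amplitude) marks the event, and the names `short_time_energy`, `isolate_peak_window`, and the default `frame_len=64` are hypothetical.

```python
def short_time_energy(samples, frame_len):
    """Sum of squared amplitudes over consecutive non-overlapping frames."""
    return [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def isolate_peak_window(samples, window_len, frame_len=64):
    """Return (start, end) sample indices of a window centred on the loudest frame.

    Assumption: the event of interest is the highest-energy frame in the clip.
    """
    energy = short_time_energy(samples, frame_len)
    if not energy:
        raise ValueError("signal is shorter than one frame")
    # Index of the frame with the highest energy (first one on ties).
    peak_frame = max(range(len(energy)), key=energy.__getitem__)
    centre = peak_frame * frame_len + frame_len // 2
    # Clamp so the window stays inside the recording.
    start = max(0, min(centre - window_len // 2, max(0, len(samples) - window_len)))
    end = min(len(samples), start + window_len)
    return start, end


# Synthetic clip: silence with a short burst straddling sample 4000, which is
# exactly where fixed 1000-sample windows would split it in two.
samples = [0.0] * 8000
for i in range(3990, 4010):
    samples[i] = 1.0

start, end = isolate_peak_window(samples, window_len=1000)
print(start, end)  # the burst at 3990..4009 lies fully inside [start, end)
```

With fixed windows of 1000 samples the burst would be cut at the 4000 boundary; centring the window on the energy peak keeps it whole.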

The next step is to determine which window corresponds to which event, or at least which window contains the striker's swing or contact with the ball.

Acoustic Data Steps

  • MFCC feature extraction
  • ML models could be trained, but this is unnecessary for now: only a single event needs to be detected, so no classification is involved.

Need to look up

  1. Study the signal processing involved: frequency, pitch, FFT, etc.

  2. Analyze "Hierarchical Language Modeling for Audio Events Detection in a Sports Game": http://www.cvssp.org.uk/acasva/Publications/ICASSP_2010.pdf

References (Audio)

2. Video Extractions

  • Use an action detection model that learns to predict the temporal bounds of an action in a video directly.
  • Formulate the model as a recurrent neural network-based agent that interacts with a video over time. The agent observes video frames and decides both where to look next and when to emit a prediction.

Another method

  • It is mainly composed of three steps:

      (i) temporal sliding window,
      (ii) clip representation and classification, and
      (iii) post processing.
    
  • Set the window length to 150 frames and the sliding step to 100 frames.

  • Then, for each short temporal window, perform action recognition independently. Finally, combine the recognition results of these short windows to yield the final result for the whole video stream.
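The windowing and post-processing steps above can be sketched as follows. This is a rough illustration under stated assumptions: the per-window labels are taken as given (the classifier itself is out of scope here), merging overlapping positive windows is only one possible way to combine results, and the names `sliding_windows`, `merge_detections`, and the `"shot"` label are hypothetical.

```python
def sliding_windows(num_frames, window_len=150, step=100):
    """Return (start, end) frame ranges covering the video; end is exclusive."""
    if num_frames <= 0:
        return []
    last_start = max(num_frames - window_len, 0)
    windows = [
        (s, min(s + window_len, num_frames))
        for s in range(0, last_start + 1, step)
    ]
    # Add a final window so trailing frames are not dropped.
    if windows[-1][1] < num_frames:
        windows.append((last_start, num_frames))
    return windows


def merge_detections(windows, labels, target):
    """Merge overlapping or touching windows labelled `target` into segments."""
    segments = []
    for (start, end), label in zip(windows, labels):
        if label != target:
            continue
        if segments and start <= segments[-1][1]:
            # Extend the previous segment instead of starting a new one.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


windows = sliding_windows(400)
# Placeholder per-window predictions; a real classifier would produce these.
labels = ["none", "shot", "shot", "none"]
print(windows)
print(merge_detections(windows, labels, target="shot"))
```

With a 150-frame window and a 100-frame step, consecutive windows overlap by 50 frames, so an action near a window boundary is still seen whole by at least one window if it is shorter than the overlap.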

References (Video)

3. Ball Detection

References (Ball Detection)
