Cochl. Sense for Python allows you to add environmental sound event detection ability to your applications. The following tutorial provides step-by-step instructions to set up the Sense SDK on Python. This process includes installing, building, authorizing, and using the SDK. When finished, you have a working sample app to test interactions with Cochl. Sense.
- CPU: For optimal performance, we recommend enough computing power to run the Linux operating system at least such as Raspberry Pi 3+ (ARM Cortex-A Series).
- Memory: A minimum of 10MB for loading machine learning model and input data
- Required sampling rate: 22,050Hz
All users of Sense SDK must obtain authorization credentials to start a project.
- Register or sign in at https://dashboard.cochlear.ai.
- Send an email request for Sense SDK to support@cochlear.ai
- After getting approval from support@cochlear.ai, create a new project: From https://dashboard.cochlear.ai, click “My projects” from the leftside bar, and click “Add new project” to create a new project.
- After the new project is added, select the SDK Type(Android/Python), and click “DOWNLOAD”. (Refresh the page if download button does not appear.)
Install system packages as required by Sense SDK Python. It depends on the target system.
- Ubuntu 18.04 (x86-64)
$ sudo apt-get update
$ sudo apt-get install ffmpeg sox portaudio19-dev virtualenv libssl-dev libcurl4-openssl-dev python3-dev- Mac OS X (x86-64)
Install brew(https://brew.sh/) first.
$ brew update
$ brew install openssl portaudio pyenv python3 wget ffmpeg sox- Raspberry Pi 3 (ARM 32)
$ sudo apt-get update
$ sudo apt-get install ffmpeg sox portaudio19-dev virtualenv libatlas-base-dev libssl-dev libcurl4-openssl-dev python3-pyaudio python3-dev
- NVIDIA Jetson Nano (ARM 64)
$ sudo apt-get update
$ sudo apt-get install ffmpeg sox portaudio19-dev virtualenv python3-dev libffi-dev libssl-dev libcurl4-openssl-devCreate a new virtual environment by choosing a Python interpreter and making a ./venv directory to hold it:
$ virtualenv -p python3 --no-site-packages ./venvActivate the virtual environment using a shell-specific command:
$ source ./venv/bin/activate # sh, bash, ksh, zsh, ...When virtualenv is active, your shell prompt is prefixed with (venv).
Install packages within a virtual environment without affecting the host system setup. Start by upgrading pip:
(venv) $ pip install --upgrade pipTo install Sense SDK Python, download the appropriate Python wheel for your system from the following table, and then install it with the pip install command.
For example, if you're setting up a Jetson Nano (which has Python 3.7), install the Python wheel as follows (after you click to download the .whl file below):
(venv) $ pip install sense_sdk-0.4.2-cp37-cp37m-linux_aarch64.whl- Supported Targets and Package Files
| Ubuntu 18.04 or higher | Mac OS X | |
|---|---|---|
| Python 3.6 | sense_sdk-0.4.2-cp36-cp36m-linux_x86_64.whl | sense_sdk-0.4.2-cp36-cp36m-macosx_10_10_x86_64.whl |
| Python 3.7 | sense_sdk-0.4.2-cp37-cp37m-linux_x86_64.whl | sense_sdk-0.4.2-cp37-cp37m-macosx_10_10_x86_64.whl |
| Python 3.8 | sense_sdk-0.4.2-cp38-cp38m-linux_x86_64.whl | sense_sdk-0.4.2-cp38-cp38-macosx_10_10_x86_64.whl |
| ARM 64 (Jetson Nano, Coral) | ARM 32 (Raspberry Pi 3) | |
|---|---|---|
| Python 3.6 | sense_sdk-0.4.2-cp36-cp36m-linux_aarch64.whl | sense_sdk-0.4.2-cp36-cp36m-linux_armv7l.whl |
| Python 3.7 | sense_sdk-0.4.2-cp37-cp37m-linux_aarch64.whl | sense_sdk-0.4.2-cp37-cp37m-linux_armv7l.whl |
| Python 3.8 | sense_sdk-0.4.2-cp38-cp38-linux_aarch64.whl | sense_sdk-0.4.2-cp38-cp38-linux_armv7l.whl |
Please set your SDK key as the environment variable before executing this example.
$ export SENSE_SDK_KEY=<YOUR SDK KEY>For testing an audio file prediction, run
(venv) $ python examples/simple_file.pyFor testing an audio stream prediction from your microphone, run
(venv) $ python examples/simple_stream.pyEven through Sense SDK Python provides similar API as Sense API Python, they are not the same. It is easy to use because of a simple API.
Import SenseFile into your program:
from cochl.sense_sdk import SenseFileCreate SenseFile object with SDK key and task parameters:
import os
sdk_key = os.environ['SENSE_SDK_KEY']
task = 'emergency'
sense_file = SenseFile(sdk_key, task)Add the audio file name as a parameter of predict() method. Then the prediction result about the audio file will be returned.
- Supported audio file formats:
mp3,wav,ogg,flac,mp4
result = sense_file.predict('some_audio_file.wav')The result format is JSON. You can use conveniently the result using json.loads():
import json
import pprint
result = json.loads(result)
pprint.pprint(result)Note that the below JSON structure is the same as that of Sense API, while its analysis results may slightly differ.
- JSON result format
{
"status" : {
"code" : <Status code>,
"description" : "<Status code description>"
},
"result": {
"task" : "<TASK NAME>",
"frames" : [
{
"tag" : "<CLASS NAME>",
"probability" : <Probability value (float) for 'CLASS NAME'>,
"start_time" : <Prediction start time in audio file>,
"end_time" : <Prediction end time in audio file>,
},
(...)
],
"summary" : [
{
"tag" : "<CLASS NAME>",
"probability" : <Probability mean value (float) for continuous tags>,
"start_time" : <Prediction start time in first tag>,
"end_time" : <Prediction end time in last tag>,
},
(...)
]
}
}The full example code is shown below:
- example_file.py
import json
import os
import pprint
from cochl.sense_sdk import SenseFile
sdkkey = os.environ['SENSE_SDK_KEY']
filename = 'examples/sample_audio/glassbreak.wav'
task = 'emergency'
sense_file = SenseFile(sdkkey, task)
result = sense_file.predict(filename)
result = json.loads(result)
pprint.pprint(result)- result
(venv) $ python example_file.py
INFO: Initialized TensorFlow Lite runtime.
{'result': {'frames': [{'end_time': '1.0',
'probability': '0.9407',
'start_time': '0.0',
'tag': 'Glass_break'},
{'end_time': '1.5',
'probability': '0.9445',
'start_time': '0.5',
'tag': 'Glass_break'}],
'summary': [{'end_time': 1.5,
'probability': 0.9426,
'start_time': 0.0,
'tag': 'Glass_break'}],
'task': 'emergency'},
'status': {'code': 200, 'description': 'OK'}}Check whether your microphone is working or not before doing audio stream prediction.
Import SenseStreamer into your program:
from cochl.sense_sdk import SenseStreamerSenseStreamer object records the audio via the microphone and returns the result of prediction about the audio data each half-second.
We recommend using with statement of the SenseStreamer object. SenseStreamer object supports record() method to record real-time audio and predict() method to predict the audio stream data.
- example_stream.py
import json
import pprint
from cochl.sense_sdk import SenseStreamer
sdkkey = os.environ['SENSE_SDK_KEY']
task = 'human-interaction'
with SenseStreamer(sdkkey, task) as stream:
audio_generator = stream.generator()
for stream_data in stream.record(audio_generator):
result = stream.predict(stream_data)
result = json.loads(result)
pprint.pprint(result)- result
(venv) $ python example_stream.py
INFO: Initialized TensorFlow Lite runtime.
{'result': {'frames': [{'end_time': '1.0',
'probability': '0.9023',
'start_time': '0.0',
'tag': None}],
'summary': [],
'task': 'human-interaction'},
'status': {'code': 200, 'description': 'OK'}}
{'result': {'frames': [{'end_time': '1.5',
'probability': '0.8562',
'start_time': '0.5',
'tag': 'Whistling'}],
'summary': [],
'task': 'human-interaction'},
'status': {'code': 200, 'description': 'OK'}}
{'result': {'frames': [{'end_time': '2.0',
'probability': '0.8946',
'start_time': '1.0',
'tag': 'Whistling'}],
'summary': [],
'task': 'human-interaction'},
(......)The input device can be a parameter of generator() method:
audio_generator = stream.generator(input_device='USB Audio')- NOTE: you can check the input audio device name using
arecord -lcommand.
cochl.sense_sdk
Sense
SenseFile
Audio file prediction model class.
__init__(self, sdkkey, task)Creates an SenseFile object.
Args:
sdkkey: Your SDk key to authenticate SDKtask: Task means what kinds of service you use.- Current supported tasks = ["emergency", "human-interaction"]
predict(self, file_name)Returns the result of the audio file prediction (JSON format).
Args:
file_name: Audio file name to predict
cochl.sense_sdk
Sense
SenseStreamer
Audio stream prediction model class.
__init__(self, sdkkey, task)Creates an SenseStreamer object.
Args:
sdkkey: Your SDk key to authenticate SDKtask: Task means what kinds of service you use.- Current supported tasks = ["emergency", "human-interaction"]
generator(self, input_device=None)Returns the recorded audio data generator.
Args:
input_device: Recording device like a microphone
record(self, generator)Returns the recorded audio data list.
Args:
generator: Audio data generator of SenseStreamer
predict(self, stream_data)Returns the result of the audio stream prediction (JSON format).
Args:
stream_data: Audio stream data to predict
stop(self)Stop the recording audio stream

