
Use ffmpeg to send input to opensmile to get features? #35

Open
aniketzz opened this issue Dec 9, 2021 · 9 comments

Comments

@aniketzz

aniketzz commented Dec 9, 2021

I want to use FFmpeg to send input to openSMILE and generate features from eGeMAPS, prosody, or MFCC.
I am able to modify the config files to get live input, but now I want to take the input from a video source, extract the audio via FFmpeg, and send it to openSMILE.

@chausner-audeering
Contributor

There is a cFFmpegSource component but it only supports input from a file. If you want to use FFmpeg for live audio recording, you will need to do the recording outside of openSMILE and pass the data via SMILEapi and cExternalAudioSource to openSMILE. For more information, see https://audeering.github.io/opensmile/reference.html#smileapi-c-api-and-wrappers.

@aniketzz
Author

aniketzz commented Dec 9, 2021

Can you please elaborate? I am having some trouble understanding where and what to change.
For example, when I looked at SMILEapi, I did not understand where the input comes from.
How do I call cExternalAudioSource? To use the local device microphone, I am using the following in my config:

[waveIn:cPortaudioSource]
writer.dmLevel=wave
monoMixdown = 0
 ; -1 is the default device, set listDevices=1 to see a device list
device = -1
listDevices = 0
sampleRate = 16000
 ; if your soundcard only supports stereo (2-channel) recording, 
 ; use channels=2 and set monoMixdown=1
channels = 1
nBits = 16
audioBuffersize_sec = 0.050000
buffersize_sec=2.0

@chausner-audeering
Contributor

Documentation on SMILEapi is unfortunately rather sparse. Basically, it boils down to:

  • Replacing cPortaudioSource in the config with cExternalAudioSource
  • Using SMILEapi to load and run the config file
  • Passing audio data via SMILEapi to the cExternalAudioSource component
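Sketched out, the first step might look like the section below. This is a minimal illustration, not code from this thread: it assumes 16 kHz, 16-bit mono input and keeps the section name waveIn from the earlier config; the available options should be checked against the component help (SMILExtract -H cExternalAudioSource).

```ini
; sketch: external audio input replacing cPortaudioSource
[waveIn:cExternalAudioSource]
writer.dmLevel = wave
sampleRate = 16000
channels = 1
nBits = 16
```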

SMILEapi is a C API for maximum compatibility with other languages. openSMILE includes a Python wrapper which is recommended if you are working in Python.

You might also want to take a look at the implementation of https://github.com/audeering/opensmile-python which under the hood uses SMILEapi via the Python wrapper.

@aniketzz
Author

aniketzz commented Dec 10, 2021

Is there any way to get the data per frameTime in real time for prosody, MFCC, and eGeMAPS in openSMILE?
I am able to configure the API to generate the features for prosody, MFCC, and eGeMAPS.
The current input is a file. How do I get the features in real time using the API? Currently, it generates the data as a series in one go.

Also, what will be the way to use FFmpeg with the API? I see that I have to pass the data (audio file) generated by FFmpeg, or can I stream data via FFmpeg and pass it as it arrives?

@chausner-audeering
Contributor

When using SMILEapi in combination with cExternalSink, you will get the features in real time as soon as they are generated.
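As an illustration (a sketch, not from this thread), a sink section along these lines would expose the features to a SMILEapi data callback instead of writing them to a file; the level name func is an assumption and must match the output level of your feature config:

```ini
; sketch: make features available to SMILEapi instead of a file sink
[externalSink:cExternalSink]
reader.dmLevel = func
```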

Also, What will be the way to use ffmpeg with the api?

You can stream audio in real-time from FFmpeg to openSMILE. You'll need to set up the audio recording with FFmpeg, and then pass each individual buffer of audio received from FFmpeg to openSMILE via the SMILEapi function smile_extaudiosource_write_data.
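A rough sketch of that pipeline in Python. The ffmpeg arguments, the UDP source URL, and the component name waveIn are assumptions for illustration; smile_extaudiosource_write_data is the SMILEapi call mentioned above, shown here only in a comment since loading the library is environment-specific.

```python
import io
import subprocess

def ffmpeg_command(source):
    # Decode any input (file, udp://..., rtsp://...) to raw 16 kHz mono
    # 16-bit little-endian PCM on stdout, matching a cExternalAudioSource
    # configured with sampleRate=16000, channels=1, nBits=16.
    return ["ffmpeg", "-i", source,
            "-f", "s16le", "-ar", "16000", "-ac", "1", "-"]

def read_pcm_chunks(stream, chunk_bytes=3200):
    # Yield fixed-size buffers from the pipe.
    # 3200 bytes = 1600 samples = 100 ms of 16 kHz 16-bit mono audio.
    while True:
        buf = stream.read(chunk_bytes)
        if not buf:
            break
        yield buf

if __name__ == "__main__":
    proc = subprocess.Popen(ffmpeg_command("udp://0.0.0.0:8000"),
                            stdout=subprocess.PIPE)
    for buf in read_pcm_chunks(proc.stdout):
        # Hypothetical hand-off to openSMILE via SMILEapi, e.g.:
        # smile_extaudiosource_write_data(smileobj, b"waveIn", buf, len(buf))
        pass
```

The chunk size is a latency/throughput trade-off: smaller buffers give lower feature latency at the cost of more calls into SMILEapi.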

@aniketzz
Author

What will be the way to use FFmpeg with the Python API?
How do I get the features in real time using the Python API?
I have changed the config to:

[waveIn:cFFmpegSource]
writer.dmLevel = wave
blocksize_sec = 1.0
filename = \cm[inputfile(I){test.wav}:name of input file]
monoMixdown = 1
outFieldName = pcm

However, it takes input from a file, but I want to take input from a port.
For example, I'll be sending an audio file through port 8000 and I want to pass this input to the openSMILE Python API.

@chausner-audeering
Contributor

cFFmpegSource only supports input from files. If you need to receive an audio stream via the network and you want to decode it using FFmpeg, I suggest asking in the FFmpeg forums or on Stack Overflow for help. I can help you with passing the audio to openSMILE via the SMILEapi interface.

To get started with SMILEapi, see the API definition and comments in https://github.com/audeering/opensmile/blob/master/progsrc/smileapi/python/opensmile/SMILEapi.py. See also the help in the openSMILE documentation on components cExternalAudioSource and cExternalSink.

@aniketzz
Author

We have the ffmpeg command ready to decode the audio coming in on the UDP port, but how do we integrate the command with the openSMILE Python API?

@aniketzz
Author


Can anyone help me with the above query?
