@davidffa davidffa commented Jun 2, 2022

Combines the audio received from all users on the voice channel into an mp3 file.

Uses my koe audio receive implementation (davidffa/koe#2)

WARNING: If NAS (native audio sending) is enabled, audio receiving works at the same time as audio sending only when Epoll is in use.

Record payload struct:

{
  op: 'record',
  guildId: 'id',
  id: 'some random id you want',
  selfAudio: true | false, // optional, default = false; whether to record self audio
  users: ['user id', ...], // optional; array of user ids to record (if not passed, all users are recorded)
  bitrate: 64000, // optional, default = 64000
  channels: 2, // optional, 1 or 2 (int), default = 2
  format: 'MP3' // 'MP3' | 'PCM'; the output audio file format (PCM and MP3 are currently available), default is MP3
}
  • The id is used to identify the recorded audio file when downloading it.
  • The mp3 output file sample rate is 48 kHz.
  • To finish recording, send a record payload containing only the guildId.
  • When Lavalink finishes processing the audio, it emits a recordFinished event, so you know when you can download the audio file.
    • RecordFinished event struct: { op: 'recordFinished', guildId: <guildid>, id: <the id of the recording> }
  • The mp3 encoding is done in native code, using the C library libmp3lame, so it currently works only on darwin-aarch64, linux-x86-64, linux-aarch64 and win-x86-64.
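
As a rough client-side illustration of the payloads described above, the following sketch builds the start and stop messages and recognises the matching recordFinished event. The function names are hypothetical, and the stop payload is assumed to keep the `op` field alongside `guildId`; sending the text over the WebSocket is left to whatever client library is in use.

```python
import json
from typing import Optional, Sequence


def start_record_payload(
    guild_id: str,
    record_id: str,
    *,
    self_audio: bool = False,
    users: Optional[Sequence[str]] = None,
    bitrate: int = 64000,
    channels: int = 2,
    fmt: str = "MP3",
) -> str:
    """Build the JSON text that starts a recording (field names from the struct above)."""
    if channels not in (1, 2):
        raise ValueError("channels must be 1 or 2")
    if fmt not in ("MP3", "PCM"):
        raise ValueError("format must be 'MP3' or 'PCM'")
    payload = {
        "op": "record",
        "guildId": guild_id,
        "id": record_id,
        "selfAudio": self_audio,
        "bitrate": bitrate,
        "channels": channels,
        "format": fmt,
    }
    # Omitting "users" means every user in the channel is recorded.
    if users is not None:
        payload["users"] = list(users)
    return json.dumps(payload)


def stop_record_payload(guild_id: str) -> str:
    """Build the JSON text that ends a recording.

    Assumption: "only with the guildId" still includes the op field.
    """
    return json.dumps({"op": "record", "guildId": guild_id})


def is_record_finished(raw: str, guild_id: str, record_id: str) -> bool:
    """Return True if raw is the recordFinished event for this guild and recording id."""
    msg = json.loads(raw)
    return (
        msg.get("op") == "recordFinished"
        and msg.get("guildId") == guild_id
        and msg.get("id") == record_id
    )
```

A client would send `start_record_payload(...)`, later send `stop_record_payload(...)`, and wait until `is_record_finished(...)` returns True for an incoming message before requesting the file.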

Added events:

To receive these events, you have to add 'Speaking-Events': 'true' to the WebSocket headers.

speakingStart and speakingStop only work while recording audio.

  • SpeakingStart (emitted when a user starts speaking in the voice channel)
{
  op: 'speakingEvent',
  event: 'start',
  guildId: 'guild id',
  userId: 'user id'
}
  • SpeakingStop (emitted when a user stops speaking in the voice channel (100ms threshold))
{
  op: 'speakingEvent',
  event: 'stop',
  guildId: 'guild id',
  userId: 'user id'
}
  • Disconnected (emitted when a user leaves the voice channel)
{
  op: 'speakingEvent',
  event: 'disconnected',
  guildId: 'guild id',
  userId: 'user id'
}
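
One way a client might consume the three speakingEvent payloads above is to keep a per-guild set of users who are currently speaking. This is only a sketch: the `SpeakingTracker` class and its method names are hypothetical, and the only header shown is the one named in this description.

```python
import json
from collections import defaultdict
from typing import DefaultDict, Set

# Header named in the text above; any other connection headers are left out here.
SPEAKING_HEADERS = {"Speaking-Events": "true"}


class SpeakingTracker:
    """Tracks which users are currently speaking, per guild."""

    def __init__(self) -> None:
        self._speaking: DefaultDict[str, Set[str]] = defaultdict(set)

    def handle(self, raw: str) -> bool:
        """Apply one WebSocket message; return True if it was a known speakingEvent."""
        msg = json.loads(raw)
        if msg.get("op") != "speakingEvent":
            return False
        guild, user, event = msg["guildId"], msg["userId"], msg["event"]
        if event == "start":
            self._speaking[guild].add(user)
        elif event in ("stop", "disconnected"):
            # A user who left the channel is no longer speaking either.
            self._speaking[guild].discard(user)
        else:
            return False
        return True

    def speaking(self, guild_id: str) -> Set[str]:
        """Return a copy of the set of user ids currently speaking in the guild."""
        return set(self._speaking.get(guild_id, set()))
```

Treating `disconnected` like `stop` is a design choice of this sketch, so that a user who leaves mid-sentence does not stay marked as speaking.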

REST Endpoints:

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /records/:guildId | Returns a list with the ids of all recordings from the guild. |
| GET | /records/:guildId/:id | Downloads the mp3 audio file. |
| DELETE | /records/:guildId | Deletes all records from the guild. |
| DELETE | /records/:guildId/:id | Deletes one specific audio file. |
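
The four endpoints above differ only in the HTTP method and in whether a recording id is present, so a single helper can build all of them. The sketch below only constructs the request objects. The helper name, the base URL, and the use of an `Authorization` header carrying the server password are assumptions, not something this description states.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def records_request(
    base_url: str,
    password: str,
    guild_id: str,
    record_id: Optional[str] = None,
    *,
    delete: bool = False,
) -> Request:
    """Build a request for /records/:guildId or /records/:guildId/:id."""
    path = f"/records/{quote(guild_id, safe='')}"
    if record_id is not None:
        # The id is the one chosen in the record payload.
        path += f"/{quote(record_id, safe='')}"
    return Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": password},  # assumed auth scheme
        method="DELETE" if delete else "GET",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; for the download endpoint, the response body can then be written to a file in binary mode.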

TODO:

  • Add record op
  • Decode the opus frames provided by koe, using the lavaplayer libopus bindings (OpusDecoder)
  • Mix the pcm samples received from all users in the voice channel
  • Reduce heap memory allocations (were caused by foreach lol)
  • Encode the pcm frames in mp3 (using native C library libmp3lame)
  • Create REST endpoints to download and delete the recorded files
  • Handle AudioReceiver struct cleaning on voice channel disconnects
  • Mix the bot's audio with the other users
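
For readers unfamiliar with the mixing step listed above, a common general approach is to add the decoded samples of every speaker position by position and clamp the sum to the sample range. The sketch below illustrates that idea for 16-bit signed PCM in native byte order; it is not taken from this PR's implementation, and the function name is hypothetical.

```python
import array
from typing import Iterable, List, Optional

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(frames: Iterable[bytes]) -> bytes:
    """Mix equally sized 16-bit PCM frames by summing samples and clamping."""
    mixed: Optional[List[int]] = None
    for frame in frames:
        samples = array.array("h")
        samples.frombytes(frame)
        if mixed is None:
            mixed = [0] * len(samples)
        elif len(samples) != len(mixed):
            raise ValueError("all frames must contain the same number of samples")
        for i, sample in enumerate(samples):
            mixed[i] += sample
    if mixed is None:
        return b""
    # Clamp so that loud overlapping speakers saturate instead of wrapping around.
    clamped = array.array("h", (max(INT16_MIN, min(INT16_MAX, s)) for s in mixed))
    return clamped.tobytes()
```

Summing into plain Python integers avoids overflow before the clamp; a production mixer would more likely reuse preallocated buffers to keep allocations low.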

@davidffa davidffa marked this pull request as ready for review June 22, 2022 19:24
5antos commented Jun 26, 2022

It would be nice to have an option to filter the audio of certain users, since some people may want to record only themselves or the bot, without receiving audio from the other users connected to the voice channel.

@davidffa davidffa merged commit 03cbe36 into dev Jul 8, 2022
@davidffa davidffa deleted the feat/voice-receive branch July 8, 2022 10:38