Stars
AudioLDM training, finetuning, evaluation and inference.
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
Official PyTorch implementation of BigVGAN (ICLR 2023)
Deep neural networks for voice conversion (voice style transfer) in TensorFlow
Singing Voice Conversion via diffusion model
An audio plugin for multi-channel A/B comparison of several input signals.
C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3
Model for MDX23 music separation contest
Generative models for conditional audio generation
iOS app project showing examples related to video recording and editing.
High-performance and flexible video editing and effects framework, based on AVFoundation and Metal.
Komodio is a Video Client for Kodi written in SwiftUI for macOS Sonoma, tvOS 17, iPadOS 17 and visionOS
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Code release for our CVPR 2023 paper "Detecting Everything in the Open World: Towards Universal Object Detection".
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Code and theory of a look-ahead compressor / limiter.
A high-throughput and memory-efficient inference and serving engine for LLMs
A Python implementation of the Speech Intelligibility Index
Digital audio processors such as compressor/limiter, gate/expander, flanger, multi-tap delay, and others.
Raspberry Pi guitar pedal using neural networks to emulate real amps and effects.
Direct design of biquad filter cascades with deep learning by sampling random polynomials.
Latency measurement tool
PFFDTD is an open-source FDTD simulator for 3D room acoustics