Arseniy Gorin gorinars

🛠️

ML Researcher: Audio and Speech

37 followers · 30 following

Lists (1)

🔮 Future ideas

1 repository

Stars

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 350 21 Updated Jan 15, 2025

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

649 38 Updated Aug 3, 2024

OpenMOSS / SpeechGPT-2.0-preview

Python 191 13 Updated Jan 27, 2025

multimodal-art-projection / AutoKaggle

Python 186 13 Updated Dec 4, 2024

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 107 6 Updated Feb 5, 2025

multimodal-art-projection / YuE

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 2,950 276 Updated Feb 5, 2025

Stability-AI / stable-codec

A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.

Python 317 18 Updated Jan 14, 2025

hubertsiuzdak / snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 476 26 Updated Nov 19, 2024

westlake-baichuan-mllm / bc-omni

Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊

260 7 Updated Jan 27, 2025

BUTSpeechFIT / DiCoW

Python 19 1 Updated Jan 10, 2025

OpenBMB / UltraEval-Audio

An easy-to-use, fast, and easily integrable tool for evaluating audio LLMs

Python 28 Updated Jan 24, 2025

espeak-ng / espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

C 4,579 948 Updated Jan 31, 2025

thewh1teagle / kokoro-onnx

TTS with kokoro and onnx runtime

Python 1,432 127 Updated Feb 5, 2025

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

257 16 Updated Nov 28, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 39,841 4,470 Updated Jan 18, 2025

pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 86,517 23,286 Updated Feb 6, 2025

vocodedev / vocode-core

🤖 Build voice-based LLM agents. Modular + open source.

Python 3,125 520 Updated Nov 15, 2024

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

Python 3,396 228 Updated Jan 31, 2025

Azure-Samples / aoai-realtime-audio-sdk

Azure OpenAI code resources for using gpt-4o-realtime capabilities.

TypeScript 748 146 Updated Jan 22, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

13,740 885 Updated Jan 28, 2025

VITA-MLLM / VITA

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,024 149 Updated Jan 21, 2025

openai / openai-realtime-console

React app for inspecting, building and debugging with the Realtime API

JavaScript 2,847 1,035 Updated Feb 1, 2025

chonkie-ai / autotiktokenizer

🧰 The AutoTokenizer that TikToken always needed -- Load any tokenizer with TikToken now! ✨

Python 35 3 Updated Jan 3, 2025

VITA-MLLM / Freeze-Omni

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 265 16 Updated Jan 2, 2025

openai / openai-realtime-api-beta

Node.js + JavaScript reference client for the Realtime API (beta)

JavaScript 851 243 Updated Nov 7, 2024

pipecat-ai / pipecat

Open Source framework for voice and multimodal conversational AI

Python 4,544 504 Updated Feb 5, 2025

kyutai-labs / moshi

Python 7,348 582 Updated Feb 5, 2025

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

872 57 Updated Feb 5, 2025

stanfordnlp / dspy

DSPy: The framework for programming—not prompting—language models

Python 21,666 1,639 Updated Feb 5, 2025

michaelhodel / arc-dsl

Domain Specific Language for the Abstraction and Reasoning Corpus

Python 233 48 Updated Oct 11, 2024
