manmay-nakhashi • working • Bengaluru (UTC +05:30)


MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,796 246 Updated Aug 1, 2024

A generative speech model for daily dialogue.

Python 37,759 4,083 Updated Jul 6, 2025

🤢 LipSick: Fast, High Quality, Low Resource Lipsync Tool 🤮

Python 216 32 Updated Jul 16, 2024
Python 2,531 304 Updated May 19, 2024
HTML 44 1 Updated Jun 11, 2024

PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)

Python 291 23 Updated May 4, 2024

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 7,129 1,055 Updated Aug 5, 2024

Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.

Python 420 49 Updated Nov 26, 2024

Grok open release

Python 50,496 8,361 Updated Aug 30, 2024

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 251 25 Updated Jun 10, 2024

End-to-end platform for building voice-first multimodal agents

Python 422 114 Updated Oct 28, 2024

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.

Python 1,051 141 Updated Aug 24, 2025

Foundational model for human-like, expressive TTS

Python 4,159 694 Updated Jul 30, 2024

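The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors it may introduce.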

Python 557 54 Updated Sep 11, 2023

Unofficial implementation of Megatts2

Python 286 38 Updated Mar 23, 2024

A ggml (C++) re-implementation of tortoise-tts

C++ 187 16 Updated Aug 20, 2024

AI powered speech denoising and enhancement

Python 1,959 231 Updated Dec 3, 2024

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).

Python 8,618 754 Updated Sep 7, 2025

vits2 backbone with multilingual-bert

Python 8,557 1,227 Updated Sep 6, 2025

DLAS - A configuration-driven trainer for generative models

Python 139 170 Updated Oct 11, 2022
Python 273 19 Updated Jun 8, 2024

Reading list for research topics in Sound AI

189 8 Updated Aug 8, 2024
Python 62 3 Updated Jul 25, 2024

Unsupervised Video Summarization via Successor Embeddings

Jupyter Notebook 4 1 Updated Jan 22, 2024

Unofficial implementation of miipher

Python 131 19 Updated Apr 19, 2024
Python 60 6 Updated Nov 4, 2023

The Open Source Code of UniAudio

Python 576 36 Updated Jul 22, 2024

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Jupyter Notebook 366 39 Updated Jul 12, 2024

Easy-to-Use Speech MOS predictors

Python 311 18 Updated Oct 24, 2023