ankurbhatia24
Starred repositories

This repo contains annotated research papers that I found really good and useful

2,724 267 Updated Feb 26, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,121 414 Updated Feb 9, 2025

RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Python 5 Updated Jun 27, 2022

Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.

JavaScript 488 148 Updated Mar 3, 2025

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,274 1,319 Updated Dec 6, 2023

Elucidating the Design Space of Diffusion-Based Generative Models (EDM)

Python 1,581 159 Updated Mar 16, 2024

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

Forth 101 19 Updated Mar 25, 2023

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 6,948 1,024 Updated Aug 5, 2024

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

C++ 19,974 1,046 Updated Mar 26, 2025

PyTorch implementation of VQGAN (Taming Transformers for High-Resolution Image Synthesis) (https://arxiv.org/pdf/2012.09841.pdf)

Python 506 82 Updated Jul 17, 2024

A library for efficient similarity search and clustering of dense vectors.

C++ 33,913 3,808 Updated Mar 24, 2025

End-to-End Speech Processing Toolkit

Python 8,919 2,236 Updated Mar 21, 2025

A real-time video processing app written in C++ using OpenGL and FFmpeg

C++ 252 59 Updated Aug 27, 2023

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Jupyter Notebook 955 82 Updated Nov 4, 2024

Fast and memory-efficient exact attention

Python 16,528 1,565 Updated Mar 25, 2025

FastAPI project Template generator to make your life easier 🧬 🚀

Python 342 21 Updated Mar 24, 2025

Perceptual video quality assessment based on multi-method fusion.

Python 4,840 773 Updated Mar 13, 2025

A Very Low-Bitrate Codec for Speech Compression

C++ 3,854 358 Updated Aug 20, 2024

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,632 322 Updated Jan 4, 2024

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,426 728 Updated Mar 19, 2025

Tools for handling speech data in machine learning projects.

Python 993 231 Updated Mar 19, 2025

FastAPI Best Practices and Conventions we used at our startup

11,043 814 Updated Sep 3, 2024

Experimental LDM uses of Paella's architecture

Python 34 4 Updated Jan 26, 2023

PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)

Python 1,281 287 Updated Sep 7, 2023

Python 4 1 Updated Oct 5, 2022

Erasing Concepts from Diffusion Models

Jupyter Notebook 586 37 Updated Dec 23, 2024

Video-P2P: Video Editing with Cross-attention Control

Python 404 26 Updated Jul 20, 2024

PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Python 429 35 Updated Sep 3, 2023

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,897 4,054 Updated Jul 17, 2024