View undeadyequ's full-sized avatar

A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!

Python 755 80 Updated Jan 16, 2025

Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

Python 5,733 945 Updated Nov 21, 2024

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 381 42 Updated Sep 13, 2024

PyTorch Implementation of "Monotonic Chunkwise Attention" (ICLR 2018)

Python 80 20 Updated Apr 2, 2018

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,751 601 Updated May 31, 2024

Unofficial implementation of Megatts2

Python 274 35 Updated Mar 23, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,499 1,464 Updated Jan 25, 2025
Jupyter Notebook 7,975 563 Updated Jun 16, 2024

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4,036 364 Updated Dec 18, 2024

Public repo for HF blog posts

Jupyter Notebook 2,545 795 Updated Jan 25, 2025

PyTorch implementation of "Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation" [The Visual Computer]

Python 21 Updated Jan 7, 2025

This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.

3,535 313 Updated Jan 25, 2024

Source code for "On the Relationship between Self-Attention and Convolutional Layers"

Python 1,095 127 Updated Jan 10, 2023

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 13,319 1,069 Updated Jan 18, 2025

A family of diffusion models for text-to-audio generation.

Python 1,136 94 Updated Dec 31, 2024

Multimodal AI Story Teller, built with Stable Diffusion, GPT, and neural text-to-speech

Python 512 64 Updated Aug 29, 2023

Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Jupyter Notebook 127 23 Updated Jan 6, 2025

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)

Python 116 17 Updated Jan 24, 2023

Interface for Controllable Expressive Talking Machine

Python 38 8 Updated Jan 17, 2024

📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays)

HTML 283 27 Updated Jan 2, 2024

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 27,096 3,413 Updated Jul 23, 2024

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Python 530 105 Updated May 1, 2023
Python 74 8 Updated May 19, 2022

Displays text in sync with audio being played. Works with VTT files.

JavaScript 42 7 Updated Mar 12, 2018

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Python 366 71 Updated Dec 8, 2022

Implementation of the model used in the paper Protest Activity Detection and Perceived Violence Estimation from Social Media Images (ACM Multimedia 2017)

Jupyter Notebook 181 46 Updated Mar 21, 2024

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…

Python 45,983 7,960 Updated Jan 22, 2025