JiJiJiang
  • Tencent Meeting, Tencent
  • Shenzhen, China


Awesome speech/audio LLMs, representation learning, and codec models

795 48 Updated Dec 21, 2024
Python 9 2 Updated Jul 16, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 32,662 4,971 Updated Dec 27, 2024

Official repository for Mamba-based Segmentation Model for Speaker Diarization

Python 27 3 Updated Oct 10, 2024

wenet_LLM_from_ASLP

Python 4 Updated Nov 26, 2024

The official Meta Llama 3 GitHub site

Python 27,652 3,157 Updated Aug 12, 2024

Inference code for Llama models

Python 56,975 9,632 Updated Aug 18, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,688 185 Updated Nov 14, 2024

Real-time face swap and one-click video deepfake with only a single image

Python 41,895 6,152 Updated Dec 26, 2024

✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM

Python 1,079 64 Updated Dec 27, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 16,802 1,664 Updated Dec 19, 2024

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 11,010 697 Updated Dec 17, 2024

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation

Python 844 207 Updated Mar 10, 2024
Python 7,051 550 Updated Dec 20, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,227 290 Updated Nov 5, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,341 92 Updated Aug 13, 2024

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 11,642 11,392 Updated Dec 26, 2024

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python 127 6 Updated Nov 14, 2024

Target Speaker Extraction Toolkit

Python 137 15 Updated Nov 6, 2024

Fine-tuning for specific downstream tasks based on the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more

Python 2,693 301 Updated Dec 12, 2023

We Speech Transcript based on LLM, in 300 lines of code.

Python 132 12 Updated Dec 17, 2024

Official Repository For VoxBlink2

Python 55 4 Updated Aug 13, 2024

A book about Text-to-Speech (TTS) in Chinese.

TeX 589 80 Updated Apr 19, 2022

Transformer: PyTorch Implementation of "Attention Is All You Need"

Python 3,193 455 Updated Aug 6, 2024

Faster Whisper transcription with CTranslate2

Python 13,145 1,102 Updated Dec 23, 2024

A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"

Shell 50 2 Updated Sep 19, 2024

Speech, Language, Audio, Music Processing with Large Language Model

Python 622 56 Updated Dec 27, 2024

Different implementations of "Weighted Prediction Error" for speech dereverberation

Python 495 164 Updated Sep 10, 2024

Praat: Doing Phonetics By Computer

C 1,544 243 Updated Dec 20, 2024
Python 120 26 Updated Jul 21, 2021