whatissimondoing

Simon Lee whatissimondoing

4 followers · 0 following

Fudan University
Shanghai, China

Stars

112 results for source starred repositories

OpenMOSS / SpeechGPT-2.0-preview

Python 167 11 Updated Jan 27, 2025

DS4SD / docling

Get your documents ready for gen AI

Python 19,586 1,049 Updated Jan 31, 2025

CAS-SIAT-XinHai / CPsyCoun

[ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling

Jupyter Notebook 87 14 Updated Sep 25, 2024

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,780 187 Updated Nov 14, 2024

wntg / LLaMA-Omni

Reproduction of the LLaMA-Omni training code

Python 41 6 Updated Jan 23, 2025

gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,106 270 Updated Nov 5, 2024

luka-group / mDPO

[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.

Python 60 1 Updated Nov 10, 2024

Labbeti / aac-metrics

Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.

Python 40 3 Updated Jan 20, 2025

ZhangXInFD / SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 525 45 Updated Jun 9, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 822 66 Updated Aug 27, 2024

hzwer / WritingAIPaper

Writing AI Conference Papers: A Handbook for Beginners

1,849 66 Updated Dec 23, 2024

EMOsuperb / EMO-SUPERB-submission

EMO-SUPERB submission

Python 42 2 Updated Sep 4, 2024

HIT-SCIR-SC / QiaoBan

Python 190 20 Updated Jan 31, 2024

SmartFlowAI / EmoLLM

Mental health large language model, LLM, The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2, LLama3.1

Python 1,074 144 Updated Jan 16, 2025

argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,185 160 Updated Jan 30, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 8,300 811 Updated Feb 2, 2025

nilaoda / BBDown

Bilibili Downloader. A command-line Bilibili downloader.

C# 10,444 1,313 Updated Jan 21, 2025

thuhcsi / SECap

Python 149 13 Updated Jul 9, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 34,040 3,688 Updated Jan 25, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,405 640 Updated Jan 23, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 18,761 1,419 Updated Jan 26, 2025

microsoft / MInference

[NeurIPS'24 Spotlight, ICLR'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference latency by up to 10x for pre-filling on an …

Python 894 43 Updated Jan 31, 2025

infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 30,699 2,874 Updated Feb 1, 2025

xhluca / bm25s

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

Python 992 49 Updated Jan 16, 2025

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 721 55 Updated Dec 23, 2024

emo-box / EmoBox

[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Python 190 8 Updated Jun 17, 2024

ShishirPatil / gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python 11,730 1,037 Updated Feb 2, 2025

Hannibal046 / xRAG

[Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token

Jupyter Notebook 112 9 Updated Jul 4, 2024

chenfei-wu / TaskMatrix

Python 34,540 3,307 Updated Jan 6, 2024

Zejun-Yang / AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,794 598 Updated Jul 2, 2024
