see2023

11 followers · 0 following

Stars

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 16,414 1,771 Updated Sep 12, 2025

ricky0123 / vad

Voice activity detector (VAD) for the browser with a simple API

TypeScript 1,596 223 Updated Sep 5, 2025

see2023 / SeeZen

A Chrome extension for focus and productivity with a Pomodoro timer and website blocking

TypeScript 1 Updated Mar 22, 2025

see2023 / VoiceMind

Real-time voice assistant with multi-speaker recognition & tactical suggestions. Local AI processing for privacy-sensitive scenarios (debates/meetings/negotiations).

Dart 1 Updated Mar 5, 2025

microsoft / TRELLIS

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).

Python 10,576 932 Updated Aug 5, 2025

An intelligent search assistant based on multimodal large models, enabling smart information retrieval and knowledge integration on the Xiaohongshu platform.

Python 22 4 Updated Nov 6, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 6,615 603 Updated Aug 15, 2025

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

Python 4,194 334 Updated Sep 2, 2025

MrForExample / ComfyUI-3D-Pack

An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)

Python 3,369 343 Updated Sep 16, 2025

ShenhanQian / GaussianAvatars

[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"

Python 876 126 Updated Jun 17, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 8,289 938 Updated Sep 16, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 24,582 1,703 Updated Sep 1, 2025

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 50,921 5,587 Updated Sep 10, 2025

see2023 / iSee-server

Multimodal Real-time Audio-Video Chatting Intelligent Assistant

Python 7 3 Updated Nov 15, 2024

facebookresearch / habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python 2,562 588 Updated Aug 19, 2025

AtsushiSakai / PythonRobotics

Python sample codes and textbook for robotics algorithms.

Python 25,860 6,878 Updated Sep 15, 2025

facebookresearch / audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,843 278 Updated Sep 15, 2024

MarkFzp / mobile-aloha

Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Jupyter Notebook 4,224 715 Updated Jun 22, 2024

see2023 / Bert-VITS2-ext

Facial expression and animation testing based on Bert-VITS2.

Python 536 59 Updated Aug 6, 2025

graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python 18,517 2,603 Updated Oct 30, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,567 1,232 Updated Sep 15, 2025

FoundationAgents / MetaGPT

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 58,428 7,061 Updated Jun 30, 2025

OpenMotionLab / MotionGPT

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

Python 1,761 126 Updated Jul 1, 2025

priorMDM / priorMDM

The official implementation of the paper "Human Motion Diffusion as a Generative Prior"

Python 489 26 Updated Jan 25, 2025

zai-org / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM

Python 15,699 1,831 Updated Jun 27, 2024

kyegomez / tree-of-thoughts

Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models, which the repository claims elevates model reasoning by at least 70%

Python 4,537 374 Updated Jul 29, 2025

MineDojo / Voyager

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,346 605 Updated Apr 3, 2024

facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

Python 8,791 825 Updated Sep 10, 2025

zai-org / VisualGLM-6B

Chinese and English multimodal conversational language model

Python 4,164 424 Updated Aug 23, 2024

langflow-ai / langflow

Langflow is a powerful tool for building and deploying AI-powered agents and workflows.

Python 118,687 7,602 Updated Sep 16, 2025
