Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 16,414 1,771 Updated Sep 12, 2025

Voice activity detector (VAD) for the browser with a simple API

TypeScript 1,596 223 Updated Sep 5, 2025

A Chrome extension for focus and productivity with a Pomodoro timer and website blocking.

TypeScript 1 Updated Mar 22, 2025

Real-time voice assistant with multi-speaker recognition & tactical suggestions. Local AI processing for privacy-sensitive scenarios (debates/meetings/negotiations).

Dart 1 Updated Mar 5, 2025

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).

Python 10,576 932 Updated Aug 5, 2025

An intelligent search assistant based on multimodal large models, enabling smart information retrieval and knowledge integration on the Xiaohongshu platform.

Python 22 4 Updated Nov 6, 2024

Multilingual Voice Understanding Model

Python 6,615 603 Updated Aug 15, 2025

A fast multimodal LLM for real-time voice

Python 4,194 334 Updated Sep 2, 2025

An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)

Python 3,369 343 Updated Sep 16, 2025

[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"

Python 876 126 Updated Jun 17, 2025

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 8,289 938 Updated Sep 16, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 24,582 1,703 Updated Sep 1, 2025

One minute of voice data can be enough to train a good TTS model (few-shot voice cloning).

Python 50,921 5,587 Updated Sep 10, 2025

Multimodal Real-time Audio-Video Chatting Intelligent Assistant

Python 7 3 Updated Nov 15, 2024

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python 2,562 588 Updated Aug 19, 2025

Python sample codes and textbook for robotics algorithms.

Python 25,860 6,878 Updated Sep 15, 2025

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,843 278 Updated Sep 15, 2024

Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Jupyter Notebook 4,224 715 Updated Jun 22, 2024

Expression and animation testing based on Bert-VITS2.

Python 536 59 Updated Aug 6, 2025

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python 18,517 2,603 Updated Oct 30, 2024

vits2 backbone with multilingual-bert

Python 8,567 1,232 Updated Sep 15, 2025

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 58,428 7,061 Updated Jun 30, 2025

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

Python 1,761 126 Updated Jul 1, 2025

The official implementation of the paper "Human Motion Diffusion as a Generative Prior"

Python 489 26 Updated Jan 25, 2025

ChatGLM2-6B: An Open Bilingual Chat LLM

Python 15,699 1,831 Updated Jun 27, 2024

Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models, which the project claims elevates model reasoning by at least 70%.

Python 4,537 374 Updated Jul 29, 2025

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,346 605 Updated Apr 3, 2024

ImageBind: One Embedding Space to Bind Them All

Python 8,791 825 Updated Sep 10, 2025

Chinese and English multimodal conversational language model

Python 4,164 424 Updated Aug 23, 2024

Langflow is a powerful tool for building and deploying AI-powered agents and workflows.

Python 118,687 7,602 Updated Sep 16, 2025