Stars
Fully open reproduction of DeepSeek-R1
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
HunyuanVideo: A Systematic Framework For Large Video Generation Model
WebGL Three.js Cesium.js Examples And Demo. Please star.
🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.
Extract hardcoded (burned-in) subtitles from videos using the Tesseract OCR engine with Python.
Code for examining the use of the Wave-U-Net pre-trained model for separation of music instruments from a song, for the task of speech enhancement – separating speech and noise from a noisy speech …
adopted from asteroid; will add music separation, speech extraction and wavesplit.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
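Latest 2024 WeChat watermark-removal mini program source code (WeChat mini program frontend, Python Django backend): watermark removal for Douyin, Kuaishou, Weishi, Toutiao, Huoshan, and Xiaohongshu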
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation
Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-making Framework
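Generate high-definition short videos with one click using AI large language models.
Uses autojs6 to send scheduled messages and emoji to specified Douyin friends, keeping the chat streak ("spark") alive from one side; no root required, supports unlocking the device, and automatically closes some pop-ups
AutoJs, AutoX: Douyin account-nurturing assistant (auto like, follow, comment), Kuaishou friend interaction assistant (auto greeting), Douyin scripts, Kuaishou scripts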
Android Automation and Remote Running Server Building Based on Autojs
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Builds an agent on the InternLM chat 7B base model that can call MMYOLO tools to complete visual tasks on images
Android real-time display control software