Lists (30)
3D-Synthesis
AIGC-Video
Audio
Data
Depth
Detection+Segmentation
Distill
Embodiment
Face
Fake/Defake
GAN
Hardware/Accelerate
Image Quality
Image/video quality assessment
KG
KnowledgeGraph
LLM
LLM-Agent
LLM-Code
Multi-Modal
Pretraining-CV
ReinforceLearning
StableDiffusion
TimeSeries
Tracking
Translation
Video
VLM
Machine learning tools and platforms
Important information collection
Non-machine-learning tools and platforms
Starred repositories
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
osanseviero / geminiCoder
Forked from Nutlope/llamacoder
Create apps with Gemini
Learning Flow Fields in Attention for Controllable Person Image Generation
A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Official code, datasets and checkpoints for "Timer: Generative Pre-trained Transformers Are Large Time Series Models" (ICML 2024)
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker
A generative world for general-purpose robotics & embodied AI learning.
A collection of 🤗 Transformers.js demos and example applications
SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports…
Inference engine powering open source models on OpenRouter
Python tool for converting files and office documents to Markdown.
The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.
Open and efficient video watermarking
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Cyberduck is a libre FTP, SFTP, WebDAV, Amazon S3, Backblaze B2, Microsoft Azure & OneDrive and OpenStack Swift file transfer client for Mac and Windows.
TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…
The leading open-source AI copilot for JetBrains. Connect to any model in any environment, and customize your coding experience in any way you like.
A very quick project that transforms research papers into engaging three-person discussions, offering an intuitive and thought-provoking listening experience. Perfect for podcast enthusiasts seekin…
Memory-Guided Diffusion for Expressive Talking Video Generation
A web-based tool for visualizing and exploring artifacts from Microsoft's GraphRAG.
Official repository of "TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models".
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
proxychains ng (new generation) - a preloader which hooks calls to sockets in dynamically linked programs and redirects it through one or more socks/http proxies. continuation of the unmaintained p…