Stars
为GPT/GLM等LLM大语言模型提供实用化交互接口,特别优化论文阅读/润色/写作体验,模块化设计,支持自定义快捷按钮&函数插件,支持Python和C++等项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持chatglm3等本地模型。接入通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, m…
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
real time face swap and one-click video deepfake with only a single image
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Making large AI models cheaper, faster and more accessible
提取微信聊天记录,将其导出成HTML、Word、Excel文档永久保存,对聊天记录进行分析生成年度聊天报告,用聊天数据训练专属于个人的AI聊天助手
A generative speech model for daily dialogue.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
Image inpainting tool powered by SOTA AI Model. Remove any unwanted object, defect, people from your pictures or erase and replace(powered by stable diffusion) any thing on your pictures.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
PDF scientific paper translation with preserved formats - 基于 AI 完整保留排版的 PDF 文档全文双语翻译,支持 Google/DeepL/Ollama/OpenAI 等服务,提供 CLI/GUI/Docker/Zotero
OCR, layout analysis, reading order, table recognition in 90+ languages
ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photos tools. 一个轻量级的AI证件照制作算法。
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
Janus-Series: Unified Multimodal Understanding and Generation Models
Question and Answer based on Anything.
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…
This project aim to reproduce Sora (Open AI T2V model), we wish the open source community contribute to this project.
An open source implementation of CLIP.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team | Netflix级字幕切割、翻译、对齐、甚至加上配音,一键全自动视频搬运AI字幕组