Stars
An Interpretable Deep Learning Approach for Morphological Script Type Analysis (IWCP 2024)
Janus-Series: Unified Multimodal Understanding and Generation Models
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
LVBench: An Extreme Long Video Understanding Benchmark
Run AI workflows with TypeScript & Vercel AI SDK
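🎬 卡卡字幕助手 | VideoCaptioner - An LLM-based intelligent subtitle assistant for one-click, high-quality subtitled video production with no GPU required! Covers the full workflow: subtitle generation, sentence segmentation, correction, and subtitle translation. Makes subtitling simple and efficient!
Cross-platform display info for macOS, Windows, and Linux, similar to Electron's Display object.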
ZXing-C++ WebAssembly as an ES/CJS module with types. Read or write barcodes in various JS runtimes: Web, Node.js, Bun, and Deno.
⚡ Transfer files over wifi from your computer to your mobile device by scanning a QR code without leaving the terminal.
F-star / pixijs
Forked from pixijs/pixijs
The HTML5 Creation Engine: Create beautiful digital content with the fastest, most flexible 2D WebGL renderer.
World's first AI meeting copilot
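A Flask-based web demo system for Chinese automatic speech recognition, including speech recognition, speech synthesis, and voiceprint-based speaker identification.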
SmartEraser, built with a new removal paradigm called Masked-Region Guidance. This paradigm retains the masked region in the input, using it as guidance for the removal process.
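猫步简历 - A free, open-source resume builder that supports exporting ultra-high-definition PDFs, images, source-level JSON data, and more, along with AI resume generation, AI polishing, and AI language translation. It offers a large collection of online templates, freely switchable themes, and highly customizable resume modules. With 猫步简历, you can create a distinctive, polished, professional resume for job applications.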
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bounding boxes into reading order.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
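Text correction toolkit (Text Correct, CSC) supporting Chinese spelling correction and punctuation correction (CSC, Chinese Spelling Correct / Check; Punct). CSC supports data from many domains (including classical Chinese); the models are trained on large-scale, multi-domain modern and contemporary corpora and generalize well.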
[SOICT 2024] LLM-Powered Video Search: A Comprehensive Multimedia Retrieval System
This repository contains the code for building Video RAG using open-source models.
This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
An open-source AI content search engine designed specifically for content creators. Supports extraction of text, images, and short videos. Allows full local deployment (web app, RAG server, LLM ser…