Starred repositories
Python tool for converting files and office documents to Markdown.
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
GenEval: An object-focused framework for evaluating text-to-image alignment
Scalable data preprocessing and curation toolkit for LLMs
OpenMMLab Detection Toolbox and Benchmark
Officially recommended ChatTTS resource collection project, compiling related resources and frequently asked questions from across the web
A generative speech model for daily dialogue.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
Comfortably monitor your Internet traffic 🕵️‍♂️
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
An easy-to-use, fast, and easily integrable tool for evaluating audio LLMs
Data and Code for Program of Thoughts (TMLR 2023)
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
MINT-1T: A one trillion token multimodal interleaved dataset.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
A fast, secure, and portable multichain light client for Ethereum
A feature-rich command-line audio/video downloader
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Towards Large Multimodal Models as Visual Foundation Agents
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)