Stars
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
A suite of image and video neural tokenizers
A PyTorch native library for large model training
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
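For a sense of the library's shape, here is a minimal sketch of trafilatura's Python API; the target URL is illustrative only:

```python
from trafilatura import fetch_url, extract

# Fetch raw HTML (the URL is a placeholder for illustration)
downloaded = fetch_url("https://example.com/article")
if downloaded:
    # extract() strips boilerplate and returns the main text;
    # other output formats (e.g. "json", "xml") are also supported
    text = extract(downloaded, output_format="txt")
    print(text)
```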
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time, end-to-end speech input and streaming audio output for conversation.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Vector (and Scalar) Quantization, in PyTorch
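A minimal usage sketch, following the pattern from the library's README (the tensor shapes are illustrative):

```python
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(
    dim=256,            # feature dimension of the inputs
    codebook_size=512,  # number of codebook entries
    decay=0.8,          # EMA decay for codebook updates
    commitment_weight=1.0,
)

x = torch.randn(1, 1024, 256)            # (batch, sequence, dim)
quantized, indices, commit_loss = vq(x)  # quantized outputs, code indices, commitment loss
```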
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
A series of math-specific large language models based on the Qwen2 series.
Ongoing research on training transformer models at scale
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
A scalable generative AI framework built for researchers and developers working on large language models, multimodal models, and speech AI (automatic speech recognition and text-to-speech)
A feature-rich command-line audio/video downloader
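Besides the CLI, yt-dlp can be embedded as a Python library; a minimal sketch (the URL and options are illustrative):

```python
from yt_dlp import YoutubeDL

opts = {
    "format": "bestvideo+bestaudio/best",  # prefer best video+audio, fall back to best single stream
    "outtmpl": "%(title)s.%(ext)s",        # output filename template
}
with YoutubeDL(opts) as ydl:
    ydl.download(["https://example.com/watch?v=placeholder"])  # placeholder URL
```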
GLM-4 series: Open Multilingual Multimodal Chat LMs (open-source multilingual multimodal dialogue models)
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
OpenAI-compatible API for the TensorRT-LLM Triton backend
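Because the gateway speaks the OpenAI wire format, a standard OpenAI client can point at it; a sketch assuming a local deployment (host, port, and model name are assumptions):

```python
from openai import OpenAI

# base_url and model name are assumptions for a local Triton deployment
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="my-trtllm-model",  # hypothetical model name registered with Triton
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```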
AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.
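A quantization sketch following the pattern in AutoAWQ's documentation (the model path and output directory are illustrative):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-7B-Instruct-v0.2"  # example model; any supported causal LM works
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)  # run 4-bit AWQ calibration + quantization
model.save_quantized("mistral-awq")                   # write the quantized checkpoint
tokenizer.save_pretrained("mistral-awq")
```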