- Hong Kong, China
Stars
A small utility to modify the dynamic linker and RPATH of ELF executables
🎁 5,400,000+ Unsplash images made available for research and machine learning
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]
A plugin for Mac WeChat
A collection of diffusion model papers categorized by subarea
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation"
The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.
[Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enlarged hidden dimension to build super frontier vision languag…
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Writing AI Conference Papers: A Handbook for Beginners
Open-MAGVIT2: Democratizing Autoregressive Visual Generation
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Mundana is a free Jekyll theme, Medium styled.
High-Resolution Image Synthesis with Latent Diffusion Models
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Easy demo for finetuning a pre-trained Stable Diffusion XL with LoRA using the collected fashion dataset from scratch.
[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"