Starred repositories
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Clean, accessible reproduction of DeepSeek R1-Zero
Official repo and evaluation implementation of VSI-Bench
Up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
Improving 3D Large Language Model via Robust Instruction Tuning
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
A Vision-Language Model for Spatial Affordance Prediction in Robotics
Official Task Suite Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"
A Survey of Embodied Learning for Object-Centric Robotic Manipulation
Official repo of ICLR'25 - LLaRA: Large Language and Robotics Assistant
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
Implementation of Autoregressive Diffusion in Pytorch
STAR: Scale-wise Text-to-image generation via Auto-Regressive representations
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Utilities intended for use with Llama models.
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).