Stars
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
You like pytorch? You like micrograd? You love tinygrad! ❤️
Paper reading notes on Deep Learning and Machine Learning
[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
We write your reusable computer vision tools. 💜
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
A curated list of foundation models for vision and language tasks
Awesome papers & datasets specifically focused on long-term videos.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
An open-source framework for training large multimodal models.
🎢 Creating and sharing simulation environments for embodied and synthetic data research
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Visual tracking library based on PyTorch.
An ongoing paper list on new trends in 3D vision with deep learning
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797 .
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
PointTrack (ECCV2020 ORAL): Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".