Stars
[ICLR 2025 (Oral 📢)] Our OpenYOLO3D model achieves state-of-the-art performance in Open Vocabulary 3D Instance Segmentation on the ScanNet200 and Replica datasets with up to ∼16x speedup compared to the …
[CVPR 2023] BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A curated list of publications and resources on open-vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).
[ECCV'24] OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
[ICLR 2025, Oral] EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Official repository for "AM-RADIO: Reduce All Domains Into One"
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Let your Claude be able to think
[CVPR 2023] VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
PyTorch implementation of an efficient variant of SVRG that relies on mini-batching
Official implementation of the paper "LangSplat: 3D Language Gaussian Splatting" [CVPR2024 Highlight]
[ICCV 2023] SGAligner: 3D Scene Alignment with Scene Graphs
EPFL Course - Optimization for Machine Learning - CS-439
OpenMMLab Pose Estimation Toolbox and Benchmark.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
PointNet and PointNet++ implemented in PyTorch (pure Python) for ModelNet, ShapeNet and S3DIS.