- Courant Institute of Mathematical Sciences, New York University
- NYC
- sainingxie.com
- @sainingxie
Stars
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Official PyTorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
A viewer for JSON files exported from Slack workspaces.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Ongoing research on training Gaussian splatting at scale with a distributed system
NASA/IBM HLS Foundation Model for downstream applications on Mars imagery
(ECCV 2024) Code for V-IRL: Grounding Virtual Intelligence in Real Life
[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
An Instruction-tuned Audio-Visual Language Model for Hate Content Detection
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
A modern, highly customizable, responsive Jekyll template for course websites.
✨✨Latest Advances on Multimodal Large Language Models
Zoomable, animated scatterplots in the browser that scale to over a billion points
Painter & SegGPT Series: Vision Foundation Models from BAAI
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
The simplest, fastest repository for training/finetuning medium-sized GPTs.