Adobe Research
- Canberra, Australia
- http://www.yiconghong.me/
- @YicongHong
Stars
Asynchronous Blob Tracker for Event Cameras. 2024 IEEE Transactions on Robotics (TRO).
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
A Triton Kernel for incorporating Bi-Directionality in Mamba2
A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model" (ECCV 2024)
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
Hackable and optimized Transformers building blocks, supporting a composable construction.
Implementation of MagViT2 Tokenizer in PyTorch
Official repository of "BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment"
Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
An open-source impl. of Large Reconstruction Models
Generative Models by Stability AI
[NeurIPS 2023] Scalable 3D Captioning with Pretrained Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[AAAI 2024] Official implementation of NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Code and Data for Paper: PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
[ICCV 2023 Oral]: Scaling Data Generation in Vision-and-Language Navigation
A latent text-to-image diffusion model
[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)