
ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder

Jungho Kim*, Changwon Kang*, Dongyoung Lee*, Sehwan Choi, Jun Won Choi†
*: Equal Contribution, †: Corresponding Author

South Korea

AAAI 2025


🔔 News

  • [2025/07]: We released the full code and checkpoints of ProtoOcc, covering nuScenes (single-frame and multi-frame) and SemanticKITTI.
  • [2024/12]: ProtoOcc was accepted to AAAI 2025. 🔥
  • [2024/08]: ProtoOcc achieves state-of-the-art results on Occ3D-nuScenes: 45.02% mIoU (multi-frame), and 39.56% mIoU at 12.83 FPS (single-frame)!

📽️ Demo

demo

💡 Method

[Figure: ProtoOcc architecture]

Overall structure of ProtoOcc. (a) The Dual Branch Encoder captures fine-grained 3D structures in the voxel domain and models large receptive fields in the BEV domain. (b) The Prototype Query Decoder generates Scene-Aware Queries from prototypes and achieves fast inference without iterative query decoding. (c) The ProtoOcc framework integrates the Dual Branch Encoder and the Prototype Query Decoder for 3D occupancy prediction.
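
To give a rough intuition for part (b), the sketch below shows generic prototype-based classification in plain Python: one prototype per class is formed by averaging the features assigned to that class, and every feature is then labeled in a single pass by its best-matching prototype. This is not the ProtoOcc decoder. The function names, the toy 2D features, and the use of hard labels to build prototypes are all assumptions made for illustration.

```python
def class_prototypes(features, labels, num_classes):
    """Average the feature vectors assigned to each class (one prototype per class)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d, value in enumerate(feat):
            sums[label][d] += value
    # Classes with no assigned features keep an all-zero prototype.
    return [
        [s / counts[k] if counts[k] else 0.0 for s in sums[k]]
        for k in range(num_classes)
    ]


def classify(features, prototypes):
    """Label each feature by the prototype with the largest dot product (single pass)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return [
        max(range(len(prototypes)), key=lambda k: dot(feat, prototypes[k]))
        for feat in features
    ]


# Toy example (hypothetical data): four 2D features, two classes.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
prototypes = class_prototypes(features, labels, num_classes=2)
predictions = classify(features, prototypes)
```

The point of the sketch is only that matching against a fixed set of prototypes needs one pass over the features, in contrast to decoders that refine queries over several iterations.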

⚡ Main Result

[Figure: main results]

nuScenes Result

| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU | Google Drive | Hugging Face |
|---|---|---|---|---|---|---|---|
| ProtoOcc_1key | 1 Frame | R50 | 256x704 | BEVDepth | 39.56 | link | link |
| ProtoOcc_longterm | 8 Frames | R50 | 256x704 | BEVStereo | 45.02 | link | link |

Semantic-KITTI Result

| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU | Google Drive | Hugging Face |
|---|---|---|---|---|---|---|---|
| ProtoOcc_semanticKITTI | 1 Frame | R50 | 384x1280 | BEVDepth | 13.89 | link | link |
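
The mIoU column reports mean intersection-over-union as a percentage. As a reminder of how that metric is defined, here is a minimal plain-Python sketch over flat lists of per-voxel class labels. The function name and the `ignore_index` parameter are assumptions for illustration. The official benchmark evaluation has further details (for example, which voxels and classes are counted) that this sketch does not reproduce.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that appear in the predictions or the targets."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip voxels without a valid ground-truth label
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical labels): five voxels, three classes.
pred = [0, 0, 1, 1, 2]
target = [0, 1, 1, 1, 2]
score_percent = 100.0 * mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 72.2%.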

📚 Training & Evaluation

Training

We trained all models using four RTX 3090 (24GB) GPUs.

CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)

./tools/dist_train.sh projects/configs/ProtoOcc/${CONFIG}.py 4 --work-dir ./work_dirs/${CONFIG}
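
If you want to queue several configs, the command above can be assembled programmatically. The following is a small sketch, not part of the repository: the helper name `train_command` is hypothetical, and it simply mirrors the command line shown above for the three listed configs.

```python
import shlex

# Config names listed in this README.
CONFIGS = ("ProtoOcc_1key", "ProtoOcc_longterm", "ProtoOcc_semanticKITTI")


def train_command(config, num_gpus=4):
    """Build the dist_train.sh argument list for one config (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config}")
    return [
        "./tools/dist_train.sh",
        f"projects/configs/ProtoOcc/{config}.py",
        str(num_gpus),
        "--work-dir",
        f"./work_dirs/{config}",
    ]


# Print the commands; pass the list to subprocess.run(...) to actually launch one.
for name in CONFIGS:
    print(shlex.join(train_command(name)))
```

Returning an argument list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting issues.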

Evaluation

To evaluate with the pretrained weights, download them from Google Drive or Hugging Face using the links in the result tables above.
To measure inference speed, uncomment the line # fp16 = dict(loss_scale='dynamic') in the config file.
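
For reference, the edit to the config file amounts to the following. This is an illustrative excerpt, assuming the config follows the usual MMDetection-style convention in which a top-level `fp16` entry turns on mixed-precision execution. The exact position of the line within each config file is not shown here.

```python
# Illustrative excerpt of a config file under projects/configs/ProtoOcc/
# (the surrounding contents are omitted; only this line changes).

# As shipped, the line is commented out:
# fp16 = dict(loss_scale='dynamic')

# For speed measurement, uncomment it:
fp16 = dict(loss_scale='dynamic')
```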

CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)

bash tools/dist_test.sh ./projects/configs/ProtoOcc/${CONFIG}.py ./work_dirs/${CONFIG}/${CONFIG}.pth 1 --eval bbox

🙏 Acknowledgement

This project builds upon several outstanding open-source projects. We gratefully acknowledge the following key contributions.

📃 Bibtex

If you find this work useful for your research or projects, please consider citing the following BibTeX entry.

@inproceedings{kim2025protoocc,
  title={Protoocc: Accurate, efficient 3d occupancy prediction using dual branch encoder-prototype query decoder},
  author={Kim, Jungho and Kang, Changwon and Lee, Dongyoung and Choi, Sehwan and Choi, Jun Won},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={4},
  pages={4284--4292},
  year={2025}
}
