ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder
Jungho Kim*, Changwon Kang*, Dongyoung Lee*, Sehwan Choi, Jun Won Choi†
*: Equal Contribution, †: Corresponding Author
- [2025/07]: We released the full code and checkpoints of ProtoOcc for nuScenes (single-frame and multi-frame) and SemanticKITTI.
- [2024/12]: ProtoOcc was accepted to AAAI 2025. 🔥
- [2024/08]: ProtoOcc achieves state-of-the-art results on Occ3D-nuScenes with 45.02% mIoU (multi-frame) and 39.56% mIoU at 12.83 FPS (single-frame)!
Overall structure of ProtoOcc. (a) The Dual Branch Encoder captures fine-grained 3D structures in the voxel domain and models large receptive fields in the BEV domain. (b) The Prototype Query Decoder generates Scene-Aware Queries from prototypes and achieves fast inference without iterative query decoding. (c) The ProtoOcc framework integrates the Dual Branch Encoder and the Prototype Query Decoder for 3D occupancy prediction.

Results on Occ3D-nuScenes:

| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU | Hugging Face | |
|---|---|---|---|---|---|---|---|
| ProtoOcc_1key | 1 Frame | R50 | 256x704 | BEVDepth | 39.56 | link | link |
| ProtoOcc_longterm | 8 Frames | R50 | 256x704 | BEVStereo | 45.02 | link | link |

Results on SemanticKITTI:

| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU | Hugging Face | |
|---|---|---|---|---|---|---|---|
| ProtoOcc_semanticKITTI | 1 Frame | R50 | 384x1280 | BEVDepth | 13.89 | link | link |
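
For readers unfamiliar with the mIoU column, the sketch below shows how a mean intersection-over-union score can be computed from per-voxel class labels. It is a simplified illustration, not the evaluation code used for the numbers above: the function name `mean_iou`, its parameters, and the toy labels are assumptions, and benchmark-specific details such as visibility masks or the exact class set are not reproduced.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU (in percent) over classes that occur in pred or target.

    pred, target: flat sequences of integer class labels, one per voxel.
    ignore_label: optional target label whose voxels are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # voxel excluded from evaluation
        if p == t:
            # Correct voxel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        return float("nan")
    return 100.0 * sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy example with three classes and six voxels.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2 -> mean 0.5
    print(f"mIoU: {mean_iou(pred, target, num_classes=3):.2f}%")
```

Classes that appear in neither the prediction nor the ground truth are left out of the average here; other conventions exist, so scores from this sketch are not directly comparable to the table.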
We trained all models using four RTX 3090 (24GB) GPUs.
CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)
./tools/dist_train.sh projects/configs/ProtoOcc/${CONFIG}.py 4 --work-dir ./work_dirs/${CONFIG}

If you want to get the pretrained weights, download them from Google Drive or Hugging Face.
To measure inference speed, uncomment `# fp16 = dict(loss_scale='dynamic')` in the config file.
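
As a rough illustration of how a frames-per-second figure is typically obtained, the sketch below times repeated calls after a warm-up phase. It is a generic harness, not the benchmark script of this repository: `measure_fps`, `dummy_inference`, and the iteration counts are hypothetical names and values chosen for the example.

```python
import time


def measure_fps(run_once, num_warmup=5, num_iters=20):
    """Return the average number of `run_once` calls per second.

    Warm-up calls are executed first and excluded from the timing,
    since the first iterations are often slower than steady state.
    """
    if num_iters <= 0:
        raise ValueError("num_iters must be positive")
    for _ in range(num_warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(num_iters):
        run_once()
    # Guard against a zero interval for extremely fast callables.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return num_iters / elapsed


def dummy_inference():
    # Hypothetical stand-in for one forward pass of a model.
    time.sleep(0.01)


if __name__ == "__main__":
    print(f"{measure_fps(dummy_inference):.2f} FPS")
```

When timing a real GPU model, the callable would also need to wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before returning, otherwise the measured time only covers kernel launches.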
CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)
bash tools/dist_test.sh ./projects/configs/${CONFIG}.py ./work_dirs/${CONFIG}/${CONFIG}.pth 1 --eval bbox

This project builds upon several outstanding open-source projects. We gratefully acknowledge the following key contributions.
If you find this work useful for your research or projects, please consider citing the following BibTeX entry.
@inproceedings{kim2025protoocc,
title={ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder},
author={Kim, Jungho and Kang, Changwon and Lee, Dongyoung and Choi, Sehwan and Choi, Jun Won},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={4},
pages={4284--4292},
year={2025}
}


