🌏 WorldBench Team
- This work presents WorldLens, a unified benchmark encompassing evaluations on $^1$Generation, $^2$Reconstruction, $^3$Action-Following, $^4$Downstream Task, and $^5$Human Preference, across a total of 24 dimensions spanning visual realism, geometric consistency, functional reliability, and perceptual alignment.
- We observe no single model dominates across all axes, highlighting the need for balanced progress toward physically and behaviorally realistic world modeling.
- For additional visual examples, kindly refer to our 🌏 Project Page.
If you find this work helpful for your research, please kindly consider citing our papers:
@article{worldlens,
title = {{WorldLens}: Full-Spectrum Evaluations of Driving World Models in Real World},
author = {Ao Liang and Lingdong Kong and Tianyi Yan and Hongsi Liu and Wesley Yang and Ziqi Huang and Wei Yin and Jialong Zuo and Yixuan Hu and Dekai Zhu and Dongyue Lu and Youquan Liu and Guangfeng Jiang and Linfeng Li and Xiangtai Li and Long Zhuo and Lai Xing Ng and Benoit R. Cottereau and Changxin Gao and Liang Pan and Wei Tsang Ooi and Ziwei Liu},
journal = {arXiv preprint arXiv:2512.10958},
year = {2025}
}

@article{survey_3d_4d_world_models,
title = {{3D} and {4D} World Modeling: A Survey},
author = {Lingdong Kong and Wesley Yang and Jianbiao Mei and Youquan Liu and Ao Liang and Dekai Zhu and Dongyue Lu and Wei Yin and Xiaotao Hu and Mingkai Jia and Junyuan Deng and Kaiwen Zhang and Yang Wu and Tianyi Yan and Shenyuan Gao and Song Wang and Linfeng Li and Liang Pan and Yong Liu and Jianke Zhu and Wei Tsang Ooi and Steven C. H. Hoi and Ziwei Liu},
journal = {arXiv preprint arXiv:2509.07996},
year = {2025}
}

- [12/2025] - The official ⚖️ WorldLens Leaderboard is online at HuggingFace Spaces. We invite researchers and practitioners to submit their models for evaluation on the leaderboard, enabling consistent comparison and supporting progress in world model research.
- [12/2025] - A collection of 3D and 4D world models is available at 🤗 awesome-3d-4d-world-models.
- [12/2025] - The Project Page is online. 🚀
- WorldLens Benchmark
- WorldLens Leaderboard
- Installation
- Data Preparation
- Getting Started
- WorldLens-26K
- WorldLens-Agent
- TODO List
- License
- Acknowledgements
- Generative world models must go beyond visual realism to achieve geometric consistency, physical plausibility, and functional reliability. WorldLens is a unified benchmark that evaluates these capabilities across five complementary aspects, from low-level appearance fidelity to high-level behavioral realism.
- Each aspect is decomposed into fine-grained, interpretable dimensions, forming a comprehensive framework that bridges human perception, physical reasoning, and downstream utility.
For additional details and visual examples, kindly refer to our 📚 Paper and 🌏 Project Page.
An interactive ⚖️ WorldLens Leaderboard is online at 🤗 HuggingFace Spaces. We invite researchers and practitioners to submit their models for evaluation on the leaderboard, enabling consistent comparison and supporting progress in world model research.
Benchmarked Models
- MagicDrive, ICLR 2023.
- Panacea, CVPR 2024.
- DreamForge, arXiv 2024.
- DriveDreamer-2, AAAI 2025.
- DrivingSphere, CVPR 2025.
- OpenDWM, CVPR 2025.
- MagicDrive-V2, ICCV 2025.
- DiST-4D, ICCV 2025.
- RLGF, NeurIPS 2025.
- X-Scene, NeurIPS 2025.
- . . .
The WorldLens evaluation toolkit is developed and tested under Python 3.9 + CUDA 11.8. We recommend using Conda to manage the environment.
- Create Environment:
conda create -n worldbench python=3.9.20
conda activate worldbench
- Install PyTorch:
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 \
--index-url https://download.pytorch.org/whl/cu118
- Install MMCV (with CUDA):
cd worldbench/third_party/mmcv-1.6.0
MMCV_WITH_OPS=1 pip install -e .
Note: We modified the C++ standard to C++17 for better compatibility. You may adjust it in worldbench/third_party/mmcv-1.6.0/setup.py based on your system.
- Install MMSegmentation:
pip install https://github.com/open-mmlab/mmsegmentation/archive/refs/tags/v0.30.0.zip
- Install MMDetection:
pip install mmdet==2.28.2
- Install BEVFusion-based MMDet3D:
git clone --recursive https://github.com/worldbench/dev-evalkit.git
cd worldbench/third_party/bevfusion
python setup.py develop
Additional Notes:
- C++ standard was updated to C++17.
- We modified the sparse convolution import logic at worldbench/third_party/bevfusion/mmdet3d/ops/spconv/conv.py.
- Install MMDetection3D (v1.0.0rc6):
cd worldbench/third_party/mmdetection3d-1.0.0rc6
pip install -v -e .
Required dependency versions:
numpy == 1.23.5
numba == 0.53.0
- Pretrained Models
WorldLens relies on several pretrained models (e.g., CLIP, segmentation, depth networks). Please download them from HuggingFace and place them under:
./pretrained_models/
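Once the installation steps are done, the pinned versions can be checked before running any evaluation. The following is a minimal sketch, not part of the toolkit: `REQUIRED_PINS` and `check_pins` are illustrative names, and the distribution names (`torch`, `mmdet`, and so on) are assumed to match how the packages register themselves with `pip`. The source-installed packages (MMCV, MMDetection3D) are left out because their registered names are not stated here.

```python
from importlib import metadata

# Pins copied from the installation steps; the dict itself is illustrative.
REQUIRED_PINS = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "mmdet": "2.28.2",
    "numpy": "1.23.5",
    "numba": "0.53.0",
}


def check_pins(pins):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Drop local build tags such as "+cu118" before comparing.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(REQUIRED_PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every listed pin matched. The check only compares version strings; it does not confirm that CUDA ops were compiled.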
Here we take nuScenes as an example.
Required Files:
- nuScenes official dataset
- 12 Hz interpolated annotations from ECCV 2024 Workshop – CODA Track 2
- Tracking & temporal .pkl files from HuggingFace – WorldLens Data Preparation
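After the files are arranged as in the final directory structure of this section, a quick check can catch missing pieces before a long evaluation run. This is a hedged sketch rather than a toolkit utility: `EXPECTED` and `missing_entries` are hypothetical names, the entries are simply copied from that directory structure, and whether every entry is needed for every metric is not something this section states.

```python
from pathlib import Path

# Relative paths copied from the directory structure in this section.
EXPECTED = [
    "nuscenes/can_bus",
    "nuscenes/lidarseg",
    "nuscenes/maps",
    "nuscenes/occ3d",
    "nuscenes/samples",
    "nuscenes/sweeps",
    "nuscenes/v1.0-mini",
    "nuscenes/v1.0-trainval",
    "nuscenes_map_aux_12Hz_interp/val_200x200_12Hz_interp.h5",
    "nuscenes_mmdet3d-12Hz/nuscenes_interp_12Hz_dbinfos_train.pkl",
    "nuscenes_mmdet3d-12Hz/nuscenes_interp_12Hz_infos_track2_eval.pkl",
    "nuscenes_mmdet3d-12Hz/nuscenes_interp_12Hz_infos_train.pkl",
    "nuscenes_mmdet3d-12Hz/nuscenes_interp_12Hz_infos_val.pkl",
    "nuscenes_mmdet3d-12Hz_description/nuscenes_interp_12Hz_updated_description_train.pkl",
    "nuscenes_mmdet3d-12Hz_description/nuscenes_interp_12Hz_updated_description_val.pkl",
    "nuscenes_mmdet3d_2/nuscenes_infos_temporal_val_3keyframes.pkl",
    "nuscenes_track/ada_track_infos_train.pkl",
    "nuscenes_track/ada_track_infos_val.pkl",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("All expected entries found under data/")
```

The check tests for existence only; it does not validate file contents or sizes.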
Final Directory Structure
data
├── nuscenes
│ ├── can_bus
│ ├── lidarseg
│ ├── maps
│ ├── occ3d
│ ├── samples
│ ├── sweeps
│ ├── v1.0-mini
│ └── v1.0-trainval
├── nuscenes_map_aux_12Hz_interp
│ └── val_200x200_12Hz_interp.h5
├── nuscenes_mmdet3d-12Hz
│ ├── nuscenes_interp_12Hz_dbinfos_train.pkl
│ ├── nuscenes_interp_12Hz_infos_track2_eval.pkl
│ ├── nuscenes_interp_12Hz_infos_train.pkl
│ └── nuscenes_interp_12Hz_infos_val.pkl
├── nuscenes_mmdet3d-12Hz_description
│ ├── nuscenes_interp_12Hz_updated_description_train.pkl
│ └── nuscenes_interp_12Hz_updated_description_val.pkl
├── nuscenes_mmdet3d_2
│ └── nuscenes_infos_temporal_val_3keyframes.pkl
└── nuscenes_track
    ├── ada_track_infos_train.pkl
    └── ada_track_infos_val.pkl
- Configure Metrics:
All evaluation metrics are defined in a unified YAML format under tools/configs/.
Example: Temporal (Depth) Consistency:
temporal_consistency:
  - name: temporal_consistency
    method_name: ${method_name}
    need_preprocessing: true
    repeat_times: 1
    local_save_path: pretrained_models/clip/ViT-B-32.pt
- Run Evaluation:
bash tools/scripts/evaluate.sh $TASK $METHOD_NAME
- Example: evaluating MagicDrive (video-based world model)
bash tools/scripts/evaluate.sh videogen magicdrive
- Prepare Generated Results: Download model outputs from HuggingFace and move them to:
./generated_results
├── dist4d
├── dreamforge
├── drivedreamer2
├── gt
├── magicdrive
├── opendwm
└── xscene
    └── video_submission
- Visualization Tools
- Multi-view Panorama Viewer (Cross-view Consistency):
python tools/showcase/video_multi_view_app.py
- Method-to-Method Comparison:
python tools/showcase/video_method_compare_app.py
- GIF-based Comparison:
python tools/showcase/gif_method_compare_app.py
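To evaluate several methods in one pass, the `evaluate.sh` invocation from the Run Evaluation step can be wrapped in a small driver. The sketch below is not shipped with the toolkit and rests on three assumptions: each folder name under `./generated_results` is a valid `$METHOD_NAME` (only `magicdrive` is confirmed by the example), `gt` holds reference data and is therefore skipped, and all methods use the `videogen` task. `build_eval_commands` and `run_all` are hypothetical names.

```python
import subprocess
from pathlib import Path


def build_eval_commands(results_root, task="videogen", skip=("gt",)):
    """Build one evaluate.sh command per method folder under results_root."""
    root = Path(results_root)
    methods = sorted(
        p.name for p in root.iterdir() if p.is_dir() and p.name not in skip
    )
    return [["bash", "tools/scripts/evaluate.sh", task, m] for m in methods]


def run_all(results_root, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in build_eval_commands(results_root):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing evaluation.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    results = Path("./generated_results")
    if results.is_dir():
        run_all(results, dry_run=True)
```

The dry run is the default so the command list can be reviewed first; pass `dry_run=False` from the repository root to launch the evaluations.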
To be updated.
To be updated.
- Initial release. 🚀
- Release the WorldLens-26K dataset.
- Support additional datasets (Waymo, Argoverse, and more)
- Add agent-based automatic evaluators
- . . .
This work is released under the Apache License, Version 2.0, while some specific implementations in this codebase might be under other licenses. If you are using our code for commercial purposes, kindly check LICENSE.md carefully.
To be added.
| 😎 Awesome | Projects |
|---|---|
| | 3D and 4D World Modeling: A Survey [GitHub Repo] - [Project Page] - [Paper] |
| | VBench: Comprehensive Benchmark Suite for Video Generative Models [GitHub Repo] - [Project Page] - [Paper] |
| | VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models [GitHub Repo] - [Project Page] - [Paper] |
| | LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences [GitHub Repo] - [Project Page] - [Paper] |
| | 3EED: Ground Everything Everywhere in 3D [GitHub Repo] - [Project Page] - [Paper] |
| | Are VLMs Ready for Autonomous Driving? A Study from Reliability, Data & Metric Perspectives [GitHub Repo] - [Project Page] - [Paper] |
| | Perspective-Invariant 3D Object Detection [GitHub Repo] - [Project Page] - [Paper] |
| | DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes [GitHub Repo] - [Project Page] - [Paper] |










