Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation (ICCV 2025)
🚨 This repository will contain download links to our evaluation code and trained models for our work "Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation", ICCV 2025
by Luca Bartolomei<sup>1,2</sup>, Enrico Mannocci<sup>2</sup>, Fabio Tosi<sup>2</sup>, Matteo Poggi<sup>1,2</sup>, and Stefano Mattoccia<sup>1,2</sup>
Advanced Research Center on Electronic Systems (ARCES)<sup>1</sup>, Department of Computer Science and Engineering (DISI)<sup>2</sup>
University of Bologna
Proposed Cross-Modal Distillation Strategy. During training, a VFM teacher processes RGB input frames to generate proxy depth labels, which supervise an event-based student model. The student takes aligned event stacks as input and predicts the final depth map.
Note: 🚧 This repository is currently under development. We are actively adding and refining features and documentation. We apologize for any inconvenience caused by incomplete or missing elements and appreciate your patience as we work towards completion.
Monocular depth perception from cameras is crucial for applications such as autonomous navigation and robotics. While conventional cameras have enabled impressive results, they struggle in highly dynamic scenes and challenging lighting conditions due to limitations like motion blur and low dynamic range. Event cameras, with their high temporal resolution and dynamic range, address these issues but provide sparse information and lack large annotated datasets, making depth estimation difficult.
This project introduces a novel approach to monocular depth estimation with event cameras by leveraging Vision Foundation Models (VFMs) trained on images. The method uses cross-modal distillation to transfer knowledge from image-based VFMs to event-based networks, utilizing spatially aligned data from devices like the DAVIS Camera. Additionally, the project adapts VFMs for event-based depth estimation, proposing both a direct adaptation and a new recurrent architecture. Experiments on synthetic and real datasets demonstrate competitive or state-of-the-art results without requiring expensive depth annotations.
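To make the paradigm above concrete, here is a minimal PyTorch-style sketch of a single distillation step. It is not the official training code: the `teacher`/`student` modules, the tensor shapes, and the affine-invariant (median/MAD-normalized) L1 loss are assumptions for illustration, reflecting the idea that proxy depth predicted by a frozen RGB VFM supervises an event-based student.

```python
import torch
import torch.nn.functional as F

def normalize_depth(d, eps=1e-6):
    """Map a depth map to zero median and unit mean absolute deviation,
    so teacher and student are compared up to an affine (scale/shift) ambiguity."""
    b = d.shape[0]
    d = d.reshape(b, -1)
    t = d.median(dim=1, keepdim=True).values
    s = (d - t).abs().mean(dim=1, keepdim=True).clamp_min(eps)
    return (d - t) / s

def distillation_step(teacher, student, rgb_frame, event_stack, optimizer):
    """One cross-modal distillation step (illustrative): the frozen RGB teacher
    produces proxy depth, which supervises the event-based student."""
    with torch.no_grad():
        proxy = teacher(rgb_frame)       # (B, 1, H, W) proxy depth from aligned RGB
    pred = student(event_stack)          # (B, 1, H, W) depth predicted from event stacks
    loss = F.l1_loss(normalize_depth(pred), normalize_depth(proxy))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```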
Contributions:

- A novel cross-modal distillation paradigm that leverages the robust proxy labels obtained from image-based VFMs for monocular depth estimation.
- An adaptation strategy to cast existing image-based VFMs into the event domain effortlessly.
- A novel recurrent architecture based on an adapted image-based VFM.
- Adapting VFMs to the event domain yields state-of-the-art performance, and our distillation paradigm is competitive against supervision from depth sensors.
🖋️ If you find this code useful in your research, please cite:
```bibtex
@InProceedings{Bartolomei_2025_ICCV,
    author    = {Bartolomei, Luca and Mannocci, Enrico and Tosi, Fabio and Poggi, Matteo and Mattoccia, Stefano},
    title     = {Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
}
```

Here, you will be able to download the weights of VFMs for the event domain.
You can download our pretrained models here.
The Test section contains scripts to evaluate depth estimation on the MVSEC and DSEC datasets. Please refer to that section for detailed instructions on setup and execution.
Warning:

- With the latest updates in PyTorch, slight variations in the quantitative results compared to the numbers reported in the paper may occur.
- Dependencies: Ensure that you have installed all the necessary dependencies. The list of dependencies can be found in the `./requirements.txt` file.
- Set script variables: Each script needs the path to the virtual environment (if any) and to the dataset. Please set those variables before running the script.
- Set config variables: Each JSON config file has a `datapath` key: update it according to your environment (see the example below).
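For reference, a minimal illustrative config could look like the snippet below. Only the `datapath` key is mentioned in this README; its value is a placeholder and any other keys depend on your setup and on the specific experiment.

```json
{
  "datapath": "/path/to/your/dataset/root"
}
```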
We used two datasets for evaluation: MVSEC and DSEC.
Download the processed version of MVSEC here. Thanks to the authors of E2DEPTH for the amazing work.
Unzip the archives, arranging them as shown in the data structure below:
```
MVSEC
├── test
│   ├── mvsec_dataset_day2
└── train
    ├── mvsec_outdoor_day1
    ├── mvsec_outdoor_night1
    ├── mvsec_outdoor_night2
    └── mvsec_outdoor_night3
```
Download Images, Events, Disparities, and Calibration Files from the official website.
Unzip the archives, then you will get a data structure as follows:
```
DSEC
└── train
    ├── interlaken_00_c
    ├── ...
    └── zurich_city_11_c
```
To reproduce the tables in our paper, use this snippet:

```bash
bash scripts/test.sh
```

You should change the variables inside the script before launching it.
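As a rough, hypothetical example (the actual variable names are defined inside `scripts/test.sh` and may differ), the edits usually amount to something like:

```bash
# Hypothetical excerpt: variable names and paths are placeholders;
# check scripts/test.sh for the actual ones expected by the script.
VENV=/path/to/your/venv        # virtual environment to activate (if any)
DATASET=/path/to/MVSEC         # root folder of the downloaded dataset
```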
For questions, please send an email to luca.bartolomei5@unibo.it
We would like to extend our sincere appreciation to the authors of the following projects for making their code available, which we have utilized in our work: