Depth estimation is the task of measuring the distance of each pixel relative to the camera. This repo hosts a C++ and Python implementation of the Depth-Anything Monocular Depth Estimation model, leveraging the TensorRT API for efficient real-time inference.
The inference time includes the pre-processing and post-processing stages:
| Device | Model | Model Input (WxH) | Image Resolution (WxH) | Inference Time (ms) |
|---|---|---|---|---|
| RTX4090 | Depth-Anything-S | 518x518 | 1280x720 | 3 |
| RTX4090 | Depth-Anything-B | 518x518 | 1280x720 | 6 |
| RTX4090 | Depth-Anything-L | 518x518 | 1280x720 | 12 |
Note that the inference was conducted using FP16 precision, with a warm-up period of 10 frames, and the reported time corresponds to the last inference.
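The measurement procedure described in the note (10 warm-up frames, then report the last inference) can be sketched roughly as follows. This is a minimal illustration, not the benchmark code used for the table: `time_last_inference`, `latency_to_fps`, and the `infer` callable are hypothetical names, and `infer` is assumed to cover pre-processing, engine execution, and post-processing for one frame.

```python
import time
from typing import Callable, Tuple


def time_last_inference(infer: Callable[[], object], warmup: int = 10) -> Tuple[float, float]:
    """Run `warmup` untimed calls, then time one more call.

    Returns (latency in milliseconds, equivalent frames per second).
    `infer` is a hypothetical stand-in for one full frame of work:
    pre-processing, engine execution, and post-processing.
    """
    for _ in range(warmup):
        infer()  # warm-up: results and timings are discarded

    start = time.perf_counter()
    infer()  # the single measured inference
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    fps = 1000.0 / elapsed_ms if elapsed_ms > 0 else float("inf")
    return elapsed_ms, fps


def latency_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms
```

Read this way, the latencies in the table correspond to roughly 333, 167, and 83 frames per second for the S, B, and L models. When timing real GPU work, the callable would need to wait for the device to finish before returning, since CUDA work is launched asynchronously; otherwise the measured interval may cover only the launch.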
Linux:
# infer image
./depth-anything-tensorrt depth_anything_vitb14.engine test.jpg
# infer folder(images)
./depth-anything-tensorrt depth_anything_vitb14.engine data
# infer video
./depth-anything-tensorrt depth_anything_vitb14.engine test.mp4 # the video path

Windows:
# infer image
./depth-anything-tensorrt.exe depth_anything_vitb14.engine test.jpg
# infer folder(images)
./depth-anything-tensorrt.exe depth_anything_vitb14.engine data
# infer video
./depth-anything-tensorrt.exe depth_anything_vitb14.engine test.mp4 # the video path

Refer to our docs/INSTALL.md for detailed installation instructions.
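The usage commands above pass an image, a folder, or a video as the same second argument, which suggests the input type is decided from the path itself. The sketch below shows one way such a dispatch could work. It is an assumption for illustration only: `classify_input` and the extension sets are hypothetical and are not taken from this repo's source.

```python
from pathlib import Path

# Assumed extension lists; the real tool may accept a different set.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}


def classify_input(path: str) -> str:
    """Return "folder", "image", or "video" for a command-line input path."""
    p = Path(path)
    if p.is_dir():
        return "folder"  # a directory is treated as a batch of images

    ext = p.suffix.lower()  # case-insensitive match on the file extension
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"unsupported input: {path}")
```

With these assumptions, `classify_input("test.jpg")` yields `"image"` and `classify_input("test.mp4")` yields `"video"`, while an existing directory such as `data` yields `"folder"`.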
This project is based on the following projects:
- Depth-Anything - Unleashing the Power of Large-Scale Unlabeled Data.
- TensorRT - TensorRT samples and API documentation.
