We use vanishing points and camera intrinsics to extract the dominant Manhattan frame in an image and test the orthogonality constraints expected from real-world architecture. The method achieved an F1-score of 0.75 on a hand-made benchmark of mixed real and generated images. It also showed signs of robustness to resampling, a common weakness of classical learning-based detectors, although more work is needed to explore this further.
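For intuition only, the core orthogonality test can be sketched as follows: each vanishing point is back-projected through the intrinsics into a 3D direction, and a genuine Manhattan frame requires the three recovered directions to be mutually orthogonal. The function name, error measure, and numbers below are illustrative assumptions, not the code used in the notebooks.

```python
import numpy as np

def manhattan_orthogonality_error(vps, K):
    """Back-project three vanishing points (pixel coordinates) through the
    intrinsics K and return the largest pairwise |cos| between the recovered
    3D directions; 0 would correspond to a perfect Manhattan frame."""
    K_inv = np.linalg.inv(K)
    dirs = []
    for u, v in vps:
        d = K_inv @ np.array([u, v, 1.0])
        dirs.append(d / np.linalg.norm(d))
    return max(abs(np.dot(dirs[i], dirs[j]))
               for i in range(3) for j in range(i + 1, 3))

# Toy example with made-up intrinsics and vanishing points.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
vps = [(2000.0, 250.0), (-1500.0, 230.0), (310.0, -4000.0)]
print(manhattan_orthogonality_error(vps, K))
```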
Tested with:
- Ubuntu 24.04
- Python 3.12
- CUDA 12.6
Clone the repository and set up the environment:

```bash
git clone --recurse-submodules git@github.com:Tetchki/cs413-project.git
cd cs413-project
```
1. **Create a Python 3.12 virtual environment** (see the example commands after this list).

2. **Install pip-tools**

   This project uses `pip-tools` to manage dependencies in a reproducible way.

   ```bash
   pip install pip-tools
   ```

3. **Compile `requirements.txt` from `requirements.in`**

   Resolves all dependencies from `requirements.in` into a fully pinned `requirements.txt`. Note: this may take a while.

   ```bash
   pip-compile --verbose requirements.in
   ```

4. **Install dependencies**

   Install all dependencies and their exact versions as specified in `requirements.txt`. Note: this may take a while.

   ```bash
   pip install -r requirements.txt
   ```

5. **Install the DeepLSD submodule**

   Some code in this project requires pre-trained weights for DeepLSD. Download them with:

   ```bash
   python3 download_deeplsd_weights.py
   ```
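Step 1 above does not list a command; one common way to create and activate a Python 3.12 virtual environment (assuming `python3.12` is on your PATH and using the standard `venv` module) is:

```bash
python3.12 -m venv .venv
source .venv/bin/activate
```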
If you prefer to download the weights manually instead of using `python3 download_deeplsd_weights.py`, follow these steps:
1. **Ensure the DeepLSD submodule is initialized**

   Confirm that the folder `ext/DeepLSD` exists. If it doesn't, run:

   ```bash
   git submodule update --init --recursive
   ```

2. **Create the weights folder**

   Inside the `ext/DeepLSD` directory, create a subfolder named `weights`. The folder structure should look like this:

   ```
   ext/
   └── DeepLSD/
       └── weights/
           └── ...
   ```

3. **Download the required model files**

4. **Move the downloaded files into the `weights/` folder**

   Place both `.tar` files into `ext/DeepLSD/weights/`.
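As a quick sanity check, the weights folder should end up containing the two archives. The filenames shown below, `deeplsd_md.tar` and `deeplsd_wireframe.tar`, are the checkpoint names used by the upstream DeepLSD repository; adjust if yours differ.

```bash
ls ext/DeepLSD/weights/
# deeplsd_md.tar  deeplsd_wireframe.tar
```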
The demo is provided in the Jupyter notebook `playground.ipynb`. It runs on two example images located in the `data/` folder.

To run the demo:

- Open `playground.ipynb` in Jupyter.
- Run the cells sequentially to execute the full pipeline on the sample images.

You can pass `verbose=True` to the main pipeline function if you want to see intermediate results and detailed information about each processing step.

Make sure all dependencies are installed and the `data/` folder contains the required images before running the notebook.
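If a Jupyter server is not already running, you can open the demo notebook from the project root (assuming Jupyter is installed in the active environment):

```bash
jupyter notebook playground.ipynb
```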
- The `data/` directory contains the datasets used for finetuning and evaluation. You can download the full data folder from here.
- The `data/synthbuster/keep` folder contains the 100 hand-picked building examples from the Synthbuster dataset that fit our scene geometry assumptions. If you want, you can download the full Synthbuster dataset from here and put its contents inside the `data/synthbuster/` folder.
- Use the `extract_dataset.ipynb` notebook to hand-pick building examples from the Synthbuster dataset that fit our scene geometry assumptions.
- Use `intrinsics_experiments.ipynb` to compare the performance of the GeoCalib and Perspective Fields methods for camera intrinsics estimation.
- `pipeline.ipynb` is the main notebook to run the full geometry-based detection pipeline and generate benchmark results.
- `playground.ipynb` provides a minimal demo to test the system on one real and one generated image for quick validation.
- This project focuses on geometric priors such as vanishing points, line segment distributions, and camera intrinsics to identify generated images.
- The DeepLSD and GeoCalib submodules are required for full functionality.
Example results: a real input image and a generated input image, with the corresponding outputs showing a correct Manhattan frame for the real image and an incorrect Manhattan frame for the generated one.