Hi, congrats on this awesome work!
I noticed that the paper uses the word "including" when listing the training datasets, which suggests the list may not be exhaustive. Could you confirm whether it is? In particular, was the model trained on NYUv2? I ask because VGGT gets very good monocular depth estimation scores on NYUv2 in my pipeline, and I would like to rule out training-set overlap.