Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model

Jing He¹, Haodong Li¹,², Mingzhi Sheng¹, Ying-Cong Chen¹,³ ✉

¹HKUST(GZ) ²UC San Diego ³HKUST
Both authors contributed equally. ✉ Corresponding author.

[Teaser figure]

We present Lotus-2, a two-stage deterministic framework for monocular geometric dense prediction. Our method leverages a pre-trained generative model as a deterministic world prior to achieve new state-of-the-art accuracy while requiring very little training data (only 0.66% of the samples used by MoGe-2). This figure demonstrates Lotus-2's robust zero-shot generalization with sharp geometric details, especially in challenging cases such as oil paintings and transparent objects.

🚀🚀🚀 Please also check out the Project Page and GitHub Repo of our prior work: Lotus! 🚀🚀🚀

📢 News

  • 2025-12-01: Paper released!
  • 2025-11-28: The inference code and HuggingFace demo (Depth & Normal) are available!

🛠️ Setup

This installation was tested on: Ubuntu 20.04 LTS, Python 3.10, CUDA 12.3, NVIDIA A800-SXM4-80GB.

  1. Be sure you have a GPU with at least 40GB memory.
  2. Clone the repository (requires git):
    git clone https://github.com/EnVision-Research/Lotus-2.git
    cd Lotus-2
    
  3. Install dependencies (requires conda):
    conda create -n lotus2 python=3.10 -y
    conda activate lotus2
    pip install -r requirements.txt 
    
  4. Be sure you have access to black-forest-labs/FLUX.1-dev.
  5. Log in to your Hugging Face account (to switch accounts, run hf auth logout first):
    hf auth login
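
Before installing, it can help to confirm that the machine meets the memory requirement from step 1. The following is a minimal sketch, not part of the Lotus-2 repository: the helper name has_enough_vram and the 40000 MiB threshold are assumptions chosen for illustration, and the check relies on nvidia-smi being on the PATH.

```shell
# Assumed threshold standing in for "40GB"; drivers report total memory in MiB
# and the exact figure varies by card, so adjust as needed.
MIN_VRAM_MIB=40000

# Hypothetical helper: reads one memory.total value (MiB) per line on stdin
# and succeeds if at least one GPU meets the threshold given as $1.
has_enough_vram() {
  min_mib="$1"
  while read -r mib; do
    # Skip blank or non-numeric lines.
    case "$mib" in
      ''|*[!0-9]*) continue ;;
    esac
    if [ "$mib" -ge "$min_mib" ]; then
      return 0
    fi
  done
  return 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
  if nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits \
      | has_enough_vram "$MIN_VRAM_MIB"; then
    echo "GPU memory check passed"
  else
    echo "No GPU with at least ${MIN_VRAM_MIB} MiB found" >&2
  fi
else
  echo "nvidia-smi not found; skipping GPU memory check" >&2
fi
```

The parsing is kept in its own function so it can be exercised with sample values on a machine without a GPU.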
    

🤗 Gradio Demo

  1. Online demo: Depth & Normal
  2. Local demo:
  • For depth estimation, run:
    python app.py depth
    
  • For normal estimation, run:
    python app.py normal
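
If you launch the local demo from your own scripts, a small wrapper can reject a mistyped task name before Python starts. This is a sketch under assumptions: valid_task and run_demo are hypothetical helper names, and the only task names accepted are the two shown above (depth and normal).

```shell
# Hypothetical helper: succeeds only for the two task names listed in this README.
valid_task() {
  case "$1" in
    depth|normal) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: validates the task, then runs the demo command from the README.
run_demo() {
  task="$1"
  if ! valid_task "$task"; then
    echo "usage: run_demo depth|normal" >&2
    return 2
  fi
  python app.py "$task"
}
```

After sourcing the snippet from the repository root, `run_demo depth` behaves like `python app.py depth`.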
    

🕹️ Inference

  1. Place your images in a directory, for example, under ./assets/in-the-wild_example (where we have already prepared several examples).
  2. Run the inference command:
    sh infer.sh
    
  • Note: The inference code automatically downloads the required model weights. You can also download them manually using the HuggingFace CLI:
    hf download jingheya/Lotus-2 --local-dir <path/to/your/local/directory>
    
    Use the following arguments to specify the paths: --core_predictor_model_path, --lcm_model_path, and --detail_sharpener_model_path.
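
Before running infer.sh, a quick pre-flight check can confirm that the input directory actually contains images. The sketch below is illustrative only: count_images is a hypothetical helper, the INPUT_DIR variable is not read by the repository's scripts, and the extension list (png, jpg, jpeg) is an assumption rather than something taken from the Lotus-2 code.

```shell
# Example directory from this README; override INPUT_DIR to check your own folder.
INPUT_DIR="${INPUT_DIR:-./assets/in-the-wild_example}"

# Hypothetical helper: prints the number of image files directly inside $1.
# The extension list is an assumption; extend it if your images use other formats.
count_images() {
  find "$1" -maxdepth 1 -type f \
    \( -iname '*.png' -o -iname '*.jpg' -o -iname '*.jpeg' \) | wc -l | tr -d ' '
}

if [ -d "$INPUT_DIR" ]; then
  echo "Found $(count_images "$INPUT_DIR") image(s) in $INPUT_DIR"
else
  echo "Input directory not found: $INPUT_DIR" >&2
fi
```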

🚀 Evaluation

  1. Prepare benchmark datasets:
  • For depth estimation, please download the Marigold evaluation datasets via:
    cd datasets/eval/depth/
    
    wget -r -np -nH --cut-dirs=4 -R "index.html*" -P . https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset/
    
  • For normal estimation, manually download the DSINE evaluation dataset archive (dsine_eval.zip) into datasets/eval/normal/ and unzip it there.
  2. Run the evaluation command (modify the TASK_NAME in eval.sh to switch tasks):
    sh eval.sh
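
Since the benchmark data is fetched separately, it may be worth checking that both dataset directories are populated before starting a run. This is a rough sketch: dataset_ready is a hypothetical helper, and it only tests that a directory exists and is non-empty, which does not prove the download is complete or correctly unpacked.

```shell
# Hypothetical helper: succeeds if $1 is a directory containing at least one entry.
dataset_ready() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Directory names are the ones given in this README; run from the repository root.
for dir in datasets/eval/depth datasets/eval/normal; do
  if dataset_ready "$dir"; then
    echo "ok: $dir"
  else
    echo "missing or empty: $dir" >&2
  fi
done
```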
    

🎓 Citation

If you find our work useful in your research, please consider citing our paper:

@article{he2025lotus,
  title={Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model},
  author={He, Jing and Li, Haodong and Sheng, Mingzhi and Chen, Ying-Cong},
  journal={arXiv preprint arXiv:2512.01030},
  year={2025}
}

About

Official implementation of Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
