This is the official repository for the paper:
Nav-R1: Reasoning and Navigation in Embodied Scenes
Qingxiang Liu*, Ting Huang*, Zeyu Zhang*†, and Hao Tang#
*Equal contribution. †Project lead. #Corresponding author.
teaser_video_compressed.mp4
If you find our code or paper helpful, please consider starring ⭐ us and citing:
@article{liu2025navr1,
title={Nav-R1: Reasoning and Navigation in Embodied Scenes},
author={Liu, Qingxiang and Huang, Ting and Zhang, Zeyu and Tang, Hao},
journal={arXiv preprint arXiv:2509.10884},
year={2025}
}
Nav-R1 is an embodied foundation model that integrates dialogue, reasoning, planning, and navigation capabilities to enable intelligent interaction and task execution in 3D environments.
Embodied navigation requires agents to integrate perception, reasoning, and action for robust interaction in complex 3D environments. Existing approaches often suffer from incoherent and unstable reasoning traces that hinder generalization across diverse environments, and from the difficulty of balancing long-horizon semantic reasoning with low-latency control for real-time navigation. To address these challenges, we propose Nav-R1, an embodied foundation model that unifies reasoning in embodied environments. We first construct Nav-CoT-110K, a large-scale dataset of step-by-step Chains-of-Thought (CoT) for embodied tasks, which enables cold-start initialization with structured reasoning. Building on this foundation, we design a GRPO-based reinforcement learning framework with three complementary rewards (format, understanding, and navigation) to improve structural adherence, semantic grounding, and path fidelity. Furthermore, we introduce a Fast-in-Slow reasoning paradigm that decouples deliberate semantic reasoning from low-latency reactive control for efficient yet coherent navigation. Extensive evaluations on embodied AI benchmarks demonstrate that Nav-R1 consistently outperforms strong baselines, with over 8% average improvement in reasoning and navigation performance. Real-world deployment on a mobile robot further validates its robustness under limited onboard resources.
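The RL stage described above scores each rollout with three reward terms: format, understanding, and navigation. The snippet below is a minimal sketch of how such a composite reward could be assembled; the tag layout, the toy reward definitions, and the weights are illustrative assumptions, not the code released in this repository.

```python
# Illustrative sketch (not the released training code): combining the three
# GRPO reward terms into a single scalar per rollout.
# The tag format, toy reward definitions, and weights are simplifying assumptions.
import re
from typing import List, Tuple

def format_reward(output: str) -> float:
    """1.0 if the response follows an assumed <think>...</think><answer>...</answer> layout."""
    return 1.0 if re.search(r"<think>.*?</think>\s*<answer>.*?</answer>", output, re.S) else 0.0

def understanding_reward(answer: str, target: str) -> float:
    """Toy semantic-grounding proxy: token overlap between the predicted answer and the target."""
    a, t = set(answer.lower().split()), set(target.lower().split())
    return len(a & t) / max(len(t), 1)

def navigation_reward(path: List[Tuple[float, float]],
                      ref: List[Tuple[float, float]],
                      success_dist: float = 3.0) -> float:
    """Toy path-fidelity proxy: success if the final position lands near the reference goal."""
    dx, dy = path[-1][0] - ref[-1][0], path[-1][1] - ref[-1][1]
    return 1.0 if (dx * dx + dy * dy) ** 0.5 <= success_dist else 0.0

def total_reward(output: str, answer: str, target: str,
                 path: List[Tuple[float, float]], ref: List[Tuple[float, float]],
                 weights: Tuple[float, float, float] = (0.2, 0.4, 0.4)) -> float:
    """Weighted sum of the format, understanding, and navigation rewards."""
    return (weights[0] * format_reward(output)
            + weights[1] * understanding_reward(answer, target)
            + weights[2] * navigation_reward(path, ref))
```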
2025/09/19: 🎉 Our paper has been shared by Embodied Intelligent Mind.
2025/09/18: 📌 Our paper has been promoted by AIxiv.
- Release Nav-CoT-110K dataset. (see Nav-CoT-110K)
- Upload our paper to arXiv and build project pages.
- Upload the code.
Note
If you’d like to learn more about our paper, be sure to check out this YouTube video by @AIResearchRoundup.
conda create -n navr1 python=3.10 -y
conda activate navr1
pip install -r requirements.txt
# Install Habitat-Lab and Habitat-Sim per the official instructions for your OS/CUDA version.
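Before preparing data, a quick way to confirm that the simulator stack is importable (module names assume a standard habitat-sim / habitat-lab install):

```python
# Sanity check: confirm habitat-sim and habitat-lab import cleanly.
# Module names assume a standard install of both packages.
import habitat_sim
import habitat

print("habitat-sim:", getattr(habitat_sim, "__version__", "unknown"))
print("habitat-lab:", getattr(habitat, "__version__", "unknown"))
```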
- You can download Nav-CoT-110K from Hugging Face and set `dataset.path` to a folder containing `train.jsonl`, `val.jsonl`, and `test.jsonl` with the fields `instruction`, `history_images`, `action_space`, and `target` (see the snippet below for a quick check).
- Or build your own synthetic dataset from R2R/RxR; see `build_data/README.md` for concise steps and commands (annotation, images, and CoT generation).
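For reference, a single record from `train.jsonl` can be inspected with a few lines of Python; the field names follow the list above, and the path is a placeholder for your `dataset.path` folder:

```python
# Peek at the first Nav-CoT-110K record to confirm the expected fields.
# "data/nav_cot_110k/train.jsonl" is a placeholder; point it at your dataset.path folder.
import json

with open("data/nav_cot_110k/train.jsonl", "r", encoding="utf-8") as f:
    record = json.loads(f.readline())

for field in ("instruction", "history_images", "action_space", "target"):
    print(field, "->", type(record.get(field)).__name__)
```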
Update `navr1/configs/default.yaml` or pass `--config` to the scripts.
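As a small, hedged sketch of double-checking the two config keys this README relies on before launching training (requires PyYAML; only the key names `dataset.path` and `simulator.habitat_config` come from this README, the rest is illustrative):

```python
# Load the YAML config and confirm the two keys this README relies on are set.
# Key names follow this README; the nesting assumed here is illustrative.
import yaml

with open("navr1/configs/default.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

print("dataset.path             ->", cfg.get("dataset", {}).get("path"))
print("simulator.habitat_config ->", cfg.get("simulator", {}).get("habitat_config"))
```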
# SFT training (4 GPUs)
python train.py --config navr1/configs/default.yaml --mode sft --workdir runs/navr1 --num_gpus 4 --use_ddp

# Multi-GPU RL training with SFT checkpoint (8 GPUs)
python train.py --config navr1/configs/default.yaml --mode rl --resume runs/navr1/sft_checkpoint.pt --workdir runs/navr1_rl --num_gpus 8 --use_ddp

# Embodied Tasks Training with RL checkpoint
python train.py --config navr1/configs/default.yaml --mode embodied --embodied_task dialogue --resume runs/navr1_rl/rl_checkpoint.pt --workdir runs/navr1_dialogue --num_gpus 4 --use_ddp
python train.py --config navr1/configs/default.yaml --mode embodied --embodied_task reasoning --resume runs/navr1_rl/rl_checkpoint.pt --workdir runs/navr1_reasoning --num_gpus 4 --use_ddp
python train.py --config navr1/configs/default.yaml --mode embodied --embodied_task planning --resume runs/navr1_rl/rl_checkpoint.pt --workdir runs/navr1_planning --num_gpus 4 --use_ddp
python train.py --config navr1/configs/default.yaml --mode embodied --embodied_task vln --resume runs/navr1_rl/rl_checkpoint.pt --workdir runs/navr1_vln_embodied --num_gpus 4 --use_ddp
python train.py --config navr1/configs/default.yaml --mode embodied --embodied_task objectnav --resume runs/navr1_rl/rl_checkpoint.pt --workdir runs/navr1_objectnav_embodied --num_gpus 4 --use_ddp

# Run complete 3-stage pipeline (SFT → RL → Embodied Tasks)
python train_pipeline.py --config navr1/configs/default.yaml --workdir runs/navr1_pipeline
# Run specific stages
python train_pipeline.py --config navr1/configs/default.yaml --stage sft --workdir runs/navr1_pipeline
python train_pipeline.py --config navr1/configs/default.yaml --stage rl --workdir runs/navr1_pipeline
python train_pipeline.py --config navr1/configs/default.yaml --stage embodied_finetune --embodied_task dialogue --workdir runs/navr1_pipeline

# Evaluate on validation set
python evaluate.py --config navr1/configs/default.yaml --split val --episodes 50
# Evaluate on test set
python evaluate.py --config navr1/configs/default.yaml --split test --episodes 100
# Evaluate with specific checkpoint
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1/checkpoint.pt --split val --episodes 50

# Evaluate VLN model
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_vln/checkpoint.pt --task_type vln --split val --episodes 50
# Evaluate ObjectNav model
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_objectnav/checkpoint.pt --task_type objectnav --split val --episodes 50
# Evaluate 3D scene understanding models
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_scanrefer/checkpoint.pt --task_type scanrefer --split val --episodes 50
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_scanqa/checkpoint.pt --task_type scanqa --split val --episodes 50
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_nr3d/checkpoint.pt --task_type nr3d --split val --episodes 50
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_scene30k/checkpoint.pt --task_type scene30k --split val --episodes 50
# Evaluate embodied task models
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_dialogue/checkpoint.pt --task_type dialogue --split val --episodes 50
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_reasoning/checkpoint.pt --task_type reasoning --split val --episodes 50
python evaluate.py --config navr1/configs/default.yaml --checkpoint runs/navr1_planning/checkpoint.pt --task_type planning --split val --episodes 50

- The Habitat-Lab simulator is the sole supported simulation backend. Please ensure habitat-lab and habitat-sim are correctly installed, and that `simulator.habitat_config` points to your task YAML (such as VLN R2R or ObjectNav HM3D); see the sketch after these notes.
- For 3D scene understanding tasks, ensure you have the required datasets (ScanRefer, ScanQA, Nr3D, Scene-30K) downloaded and properly configured.
- RL training requires significant computational resources. Consider using multiple GPUs and adjusting batch sizes accordingly.
- The model supports both CPU and GPU training, but GPU is strongly recommended for reasonable training times.
- Multi-GPU training is highly recommended for large-scale training and can provide 3-8x speedup depending on the number of GPUs.
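A hedged sketch of verifying that `simulator.habitat_config` resolves to a loadable task definition (assuming a recent habitat-lab release where `habitat.get_config` accepts a benchmark config path; the ObjectNav HM3D path below is only an example, adjust to your setup):

```python
# Check that the task YAML referenced by simulator.habitat_config actually loads.
# Assumes a recent habitat-lab where habitat.get_config accepts a benchmark config
# path; "benchmark/nav/objectnav/objectnav_hm3d.yaml" is only an example.
import habitat

cfg = habitat.get_config("benchmark/nav/objectnav/objectnav_hm3d.yaml")
print("Loaded task config with top-level keys:", list(cfg.keys()))
```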
Start from the beginning, walk to the side table on your right and pause there. Then go straight towards the front-left and stop at the wall.
real_world_VLN-1.mp4
Search for a chair.
simulator_ObjectNav-1.mp4
We thank the authors of 3D-R1, DeepSeek-Math, and Habitat-Lab for their open-source code.

