This project trains Large Multimodal Models (LMMs) to detect and respond appropriately to unsolvable robotic tasks. Combining synthetic data generation with fine-tuning, we develop a model that identifies when a requested task is beyond a robot's capabilities and explains clearly why the task cannot be performed.
- Synthetic data generation for unsolvable robotic tasks
- Fine-tuned LLaVA model for task feasibility detection
- Clear explanation generation for unsolvable tasks
- Support for both Stable Diffusion (SD) and Habitat-generated scenarios
Download our synthetic dataset from: Box
- sd_images: the dataset generated with Stable Diffusion (SD)
  - generated_tasks.jsonl: robot responses explaining why each task cannot be performed
  - images: the images generated by SD
- habitat_images: the dataset generated with the Habitat Simulator
  - generated_tasks.jsonl: robot responses explaining why each task cannot be performed
  - images: the images rendered by the Habitat Simulator
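Each dataset split pairs an image directory with a `generated_tasks.jsonl` file, one JSON record per line. A minimal loading sketch (the field names `image`, `task`, and `response` are assumptions for illustration, not the dataset's documented schema):

```python
import json
from pathlib import Path

def load_tasks(jsonl_path):
    """Read one JSON record per line from a generated_tasks.jsonl file."""
    records = []
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records

# Example: write a tiny file in the assumed schema and read it back.
sample = {"image": "images/0001.png",
          "task": "Pick up the cup on the table.",
          "response": "The task is unsolvable: there is no cup in the scene."}
path = Path("generated_tasks.jsonl")
path.write_text(json.dumps(sample) + "\n", encoding="utf-8")
tasks = load_tasks(path)
print(tasks[0]["response"])
```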
Download our fine-tuned model weights from: Huggingface
Base model: LLaVA-1.5-7B
- Clone the repository:
```bash
git clone https://github.com/linyueqian/ME555_Final_Project
cd ME555_Final_Project
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
To run inference with our fine-tuned model:
```bash
python inference.py
```
This will load the fine-tuned model, process the input, and generate the model's response.
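Under the hood, LLaVA-1.5 consumes a single-turn conversation prompt containing an `<image>` token. A hedged sketch of how a task-feasibility prompt might be assembled (the instruction wording and function name are illustrative assumptions, not the project's exact prompt):

```python
def build_feasibility_prompt(task: str) -> str:
    """Assemble a LLaVA-1.5 style prompt asking whether a task is solvable.

    The <image> token is replaced by image features at inference time.
    The instruction text below is illustrative, not the project's exact prompt.
    """
    instruction = (
        "Decide whether the robot can perform the requested task in this scene. "
        "If it cannot, explain clearly why the task is unsolvable."
    )
    return f"USER: <image>\n{instruction}\nTask: {task}\nASSISTANT:"

prompt = build_feasibility_prompt("Water the plant on the windowsill.")
print(prompt)
```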
To generate synthetic data:
```bash
python task_generation/generate.py
python image_generation/generate.py
```
These scripts generate the synthetic task text and images, respectively, and save them to their output files.
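The generation step ultimately serializes task/response records as JSON Lines. A minimal sketch of that output step (the output path and field names are assumptions based on the dataset layout described above):

```python
import json
from pathlib import Path

def save_records(records, out_path):
    """Append task/response records to a JSON Lines file, one object per line."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("a", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Illustrative record in the assumed schema.
records = [
    {"image": "images/0001.png",
     "task": "Open the locked safe without a key.",
     "response": "Unsolvable: the safe is locked and no key is visible."},
]
save_records(records, "sd_images/generated_tasks.jsonl")
print(Path("sd_images/generated_tasks.jsonl").read_text(encoding="utf-8").strip())
```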
If you find this project useful, please cite our project:
@software{lin_yueqian_2024_ME555_Final_Project,
  author  = {Lin, Yueqian and Yang, Yixuan},
  title   = {{ME555_Final_Project}},
  year    = {2024},
  license = {Apache-2.0},
  url     = {https://github.com/linyueqian/ME555_Final_Project}
}