Interactive household task plan generator on iTHOR simulation environment

xHugo21/ai2thor-task-planner

Task execution on the iTHOR simulator using automated planning and neural networks.

📃 Description

The main objective is to develop a program that allows the user to execute any action inside an iTHOR environment.

There are two ways of running the program:

  1. Using the metadata provided by the simulator. The metadata lists the objects present in a scene and their positions, which is enough to generate a PDDL problem and obtain an optimized plan. The plan is then translated back into executable actions, which are triggered in order.

  2. Using the OGAMUS algorithm. OGAMUS, developed by Leonardo Lamanna, Luciano Serafini, Alessandro Saetti, Alfonso Gerevini and Paolo Traverso, scans an iTHOR scene using pretrained neural network models and stores the data it gathers in PDDL problem files. In this project the algorithm has been modified so that it can run within a specific environment and so that actions can be chained. It is also possible to pass a PDDL problem as an argument and have the resulting actions translated for execution.
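As a rough illustration of the first mode, the sketch below builds a PDDL problem from a list of object records. It is not the code used in this repository: the record keys (`objectId`, `objectType`, `pickupable`) are assumed to mirror the simulator's object metadata, and the domain name `ithor` and the predicates `pickupable` and `holding` are hypothetical placeholders.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def pddl_name(object_id: str) -> str:
    """Turn an identifier such as 'Apple|-00.47|+01.15|+00.48' into a PDDL-safe name.

    Every non-alphanumeric character becomes a separator, so coordinate
    signs are dropped (acceptable for a sketch, not collision-proof).
    """
    cleaned = "".join(ch if ch.isalnum() else "_" for ch in object_id.lower())
    return "_".join(part for part in cleaned.split("_") if part)


def build_problem(
    objects: Sequence[Mapping],
    target_id: str,
    problem: str = "pickup_task",  # hypothetical problem name
    domain: str = "ithor",         # hypothetical domain name
) -> str:
    """Return the text of a PDDL problem whose goal is to hold `target_id`."""
    names = {o["objectId"]: pddl_name(o["objectId"]) for o in objects}
    lines = [f"(define (problem {problem})", f"  (:domain {domain})", "  (:objects"]
    for o in objects:
        # The lowercased object type is used as the PDDL type (assumption).
        lines.append(f"    {names[o['objectId']]} - {o['objectType'].lower()}")
    lines.append("  )")
    lines.append("  (:init")
    for o in objects:
        if o.get("pickupable"):
            lines.append(f"    (pickupable {names[o['objectId']]})")
    lines.append("  )")
    lines.append(f"  (:goal (holding {names[target_id]}))")
    lines.append(")")
    return "\n".join(lines)
```

Calling `build_problem(objects, target_id)` with records taken from a scene would yield a problem file that a planner can consume, provided a matching domain file defines the same types and predicates.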

🚀 Local Setup with uv

This project uses uv for dependency management.

1. Install uv

Follow the installation instructions for your platform.

2. Install dependencies

Choose the version that matches your hardware:

For CPU:

uv sync --extra cpu

For GPU (NVIDIA):

uv sync --extra gpu

3. Metric-FF Planner

The planner is included in the repository as a pre-compiled binary (ff). Ensure it has execution permissions:

chmod +x ff
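The same check can be done from Python before calling the planner. The following is a minimal sketch, not the project's own wrapper: it assumes the binary accepts the usual FF-style `-o <domain> -f <problem>` arguments, and the function names and file paths are hypothetical.

```python
from __future__ import annotations

import stat
import subprocess
from pathlib import Path


def ensure_executable(binary: Path) -> None:
    """Add the execute bits to `binary`, roughly what `chmod +x` does."""
    mode = binary.stat().st_mode
    binary.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def planner_command(binary: Path, domain: str, problem: str) -> list[str]:
    """Build the argument list for the planner.

    The binary path is resolved to an absolute path so that a relative
    name such as 'ff' is not looked up on PATH by mistake.
    """
    return [str(Path(binary).resolve()), "-o", str(domain), "-f", str(problem)]


def run_planner(binary: Path, domain: str, problem: str, timeout: float = 60.0) -> str:
    """Run the planner and return its standard output as text."""
    ensure_executable(binary)
    result = subprocess.run(
        planner_command(binary, domain, problem),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # a planner may exit non-zero when no plan is found
    )
    return result.stdout
```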

4. Run the program

uv run python main.py

🐳 Running with Docker

1. Allow X11 connections

To enable GUI visualization from the container:

xhost +local:docker

2. Build and Run

CPU version:

docker build -t ai2thor-task-planner:cpu -f Dockerfile .
docker run -it --rm \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    ai2thor-task-planner:cpu

GPU version (requires NVIDIA Docker):

docker build -t ai2thor-task-planner:gpu -f Dockerfile.gpu .
docker run -it --rm --gpus all \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    ai2thor-task-planner:gpu

3. Persisting Output Files

To save generated images, PDDL files, and results on your host machine, mount the corresponding volumes:

docker run -it --rm \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    -v $(pwd)/images:/app/images \
    -v $(pwd)/pddl/outputs:/app/pddl/outputs \
    -v $(pwd)/pddl/problems:/app/pddl/problems \
    -v $(pwd)/results:/app/results \
    ai2thor-task-planner:cpu
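When a host directory given to `-v` does not exist, Docker typically creates it itself, often owned by root. Creating the directories beforehand avoids that. The helper below is a small sketch under that assumption; the function names are hypothetical, and the directory list simply mirrors the mounts shown above.

```python
from __future__ import annotations

from pathlib import Path

# Host directories mounted into /app in the command above.
OUTPUT_DIRS = ("images", "pddl/outputs", "pddl/problems", "results")


def prepare_output_dirs(base: Path) -> list[Path]:
    """Create the output directories under `base` if they are missing."""
    created = []
    for rel in OUTPUT_DIRS:
        path = base / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def volume_args(base: Path) -> list[str]:
    """Return the `-v host:container` arguments for `docker run`."""
    args: list[str] = []
    for rel in OUTPUT_DIRS:
        args += ["-v", f"{(base / rel).resolve()}:/app/{rel}"]
    return args
```

Running `prepare_output_dirs(Path.cwd())` from the repository root before `docker run` keeps the generated files owned by the current user.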

👀 Results visualization

The iTHOR simulator launches a visualization window every time an environment is generated. However, it is hard to tell from that window alone whether everything executed correctly, so the program records the following data for each executed action:

  • scene.png: An overhead shot of the scene that shows the layout of the room. It is saved to /images/scene.png

Overhead shot of the scene FloorPlan1

  • problemX_Y: An image of each step executed. X represents the action and Y the step.

The agent positions itself in front of the objective: iter0_1

The agent picks up the objective: iter0_2

  • CLI data: When an action finishes, the status of the last action and of the objective is displayed.

  • PDDL problem files in /pddl/problems/

  • Plans generated in /pddl/outputs/
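To inspect a generated plan programmatically, its steps can be extracted with a short parser. This sketch assumes the plan text follows the raw Metric-FF console layout (`step    0: ACTION ARG ...` followed by indented `N: ACTION ARG ...` lines); the files written to /pddl/outputs/ may be formatted differently, and the action names in the usage example are made up.

```python
from __future__ import annotations

import re

# Matches "step    0: PICKUP A" as well as the indented "        1: PUT-DOWN A".
STEP_RE = re.compile(r"^\s*(?:step\s+)?(\d+):\s+(\S.*?)\s*$")


def parse_plan(planner_output: str) -> list[tuple[str, list[str]]]:
    """Extract (action, arguments) pairs, in order, from planner output."""
    steps = []
    for line in planner_output.splitlines():
        match = STEP_RE.match(line)
        if match:
            name, *args = match.group(2).lower().split()
            steps.append((name, args))
    return steps
```

For example, output containing `step    0: GOTO AGENT APPLE_1` followed by `1: PICKUP AGENT APPLE_1` parses to two steps, `("goto", ["agent", "apple_1"])` and `("pickup", ["agent", "apple_1"])`. Mapping those steps to simulator actions is left out of the sketch.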

✏️ References
