@ImYangC7
Collaborator

This PR introduces a dynamic prompt template system that enables VLM evaluation prompts to adapt to different visual themes (skins). Instead of using hardcoded visual descriptions, prompts now dynamically load descriptions from each skin's description.json file.

Motivation

Previously, prompt templates contained fixed visual descriptions (e.g., "red wall", "blue player") that didn't match the actual skin being evaluated. This caused a mismatch between what the VLM sees in the image/video and what the prompt describes, potentially affecting evaluation accuracy.

Changes

Core System (evaluation/vlm_eval/prompts/)

  • __init__.py: Added load_skin_description() and get_dynamic_prompt() functions

    • Loads visual_description from skin's description.json
    • Replaces template placeholders with skin-specific descriptions
    • Raises explicit errors instead of falling back to defaults
    • Added game name alias support (e.g., 3dmaze → maze3d)
  • *_prompt.py: Converted all prompt templates to use placeholders

    • maze_prompt.py: {wall}, {floor}, {player}, {goal}
    • sokoban_prompt.py: {wall}, {floor}, {player}, {box}, {target}
    • trapfield_prompt.py: {floor}, {trap}, {player}, {goal}
    • pathfinder_prompt.py: {wall}, {floor}, {player}, {goal}
    • maze3d_prompt.py: {start_cube}, {goal_cube}, {default_cube}, {ball}
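The loading and substitution flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function signatures, the `GAME_ALIASES` and `REQUIRED_KEYS` tables, and the error messages are assumptions. Only the function names, the `visual_description` key, the placeholder names, and the "raise instead of falling back" behavior come from the description.

```python
import json
from pathlib import Path

# Hypothetical alias table; the PR mentions "3dmaze" as an alias of "maze3d".
GAME_ALIASES = {"3dmaze": "maze3d"}

# Placeholder names per game, copied from the template list above.
REQUIRED_KEYS = {
    "maze": ("wall", "floor", "player", "goal"),
    "sokoban": ("wall", "floor", "player", "box", "target"),
    "trapfield": ("floor", "trap", "player", "goal"),
    "pathfinder": ("wall", "floor", "player", "goal"),
    "maze3d": ("start_cube", "goal_cube", "default_cube", "ball"),
}


def load_skin_description(skin_dir):
    """Return the visual_description mapping from <skin_dir>/description.json."""
    path = Path(skin_dir) / "description.json"
    if not path.is_file():
        # No silent default: a missing file is an explicit error.
        raise FileNotFoundError(f"description.json not found in {skin_dir}")
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    if "visual_description" not in data:
        raise KeyError(f"'visual_description' missing in {path}")
    return data["visual_description"]


def get_dynamic_prompt(template, game, skin_dir):
    """Fill the template placeholders with the skin's descriptions (assumed signature)."""
    game = GAME_ALIASES.get(game, game)  # resolve aliases such as "3dmaze"
    if game not in REQUIRED_KEYS:
        raise ValueError(f"unknown game: {game}")
    description = load_skin_description(skin_dir)
    missing = [key for key in REQUIRED_KEYS[game] if key not in description]
    if missing:
        raise KeyError(f"description.json is missing keys: {missing}")
    return template.format(**{key: description[key] for key in REQUIRED_KEYS[game]})
```

Note that `str.format` treats every brace as a placeholder, so a template containing literal braces would need them doubled (`{{` and `}}`) under this approach.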

Executors (evaluation/vlm_eval/executors/)

  • Updated all executors to use get_dynamic_prompt() instead of static templates

3D Maze Enhancements (games/maze3d/)

  • color_handler.py (new): Manages color configuration loading from skin directories
  • adapter.py: Now requires assets_folder, removed default color fallback
  • main.py: Strict color validation, removed all default value fallbacks
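As a rough sketch of what strict color loading without fallbacks could look like: the function name `load_colors`, the key names in `colors.json`, and the RGB list format below are all assumptions for illustration. The PR description only states that colors are loaded from the skin directory, that `assets_folder` is required, and that defaults were removed.

```python
import json
from pathlib import Path

# Assumed key names, mirroring the maze3d prompt placeholders.
REQUIRED_COLOR_KEYS = ("start_cube", "goal_cube", "default_cube", "ball")


def load_colors(assets_folder):
    """Load colors.json from the skin folder, failing loudly on any problem."""
    if assets_folder is None:
        raise ValueError("assets_folder is required")
    path = Path(assets_folder) / "colors.json"
    if not path.is_file():
        raise FileNotFoundError(f"colors.json not found in {assets_folder}")
    colors = json.loads(path.read_text(encoding="utf-8"))

    missing = [key for key in REQUIRED_COLOR_KEYS if key not in colors]
    if missing:
        raise KeyError(f"colors.json is missing keys: {missing}")

    for key in REQUIRED_COLOR_KEYS:
        value = colors[key]
        # Assumed format: a list of three integers in the range 0..255.
        is_rgb = (
            isinstance(value, list)
            and len(value) == 3
            and all(isinstance(c, int) and 0 <= c <= 255 for c in value)
        )
        if not is_rgb:
            raise ValueError(f"invalid color for '{key}': {value!r}")

    return {key: tuple(colors[key]) for key in REQUIRED_COLOR_KEYS}
```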

Skin Description Files (skins/)

Added description.json for all existing skins:

Game        Skins
maze        1, 2, 3, 5
sokoban     1, 2, 3, 4, 5
trapfield   1, 2, 3, 4
pathfinder  1, 2, 3, 4
maze3d      1, 2, 3, 4

Added colors.json for maze3d skins (rendering colors).

Utilities

  • scripts/analyze_skins_description.py: Tool to analyze and generate description.json files from skin textures

Breaking Changes

  • Skin folders now require description.json: a missing file or a missing required key raises an error
  • 3D Maze requires colors.json: No default colors fallback
  • assets_folder is now required for VLM evaluation

File Structure

skins/<game_type>/<skin_id>/
├── description.json    # Required: visual descriptions for prompts
├── colors.json         # Required for maze3d: rendering colors
└── *.png              # Texture files

Example description.json:

{
  "visual_description": {
    "wall": "gray stone wall",
    "floor": "wooden floor",
    "player": "red character",
    "goal": "green flag"
  }
}
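To illustrate how such a file would feed a template, here is a small sketch that substitutes the example values into a prompt. The template wording is made up for illustration; only the JSON content and the placeholder names come from this PR.

```python
import json

# The example description.json from above, inlined as a string.
description = json.loads("""
{
  "visual_description": {
    "wall": "gray stone wall",
    "floor": "wooden floor",
    "player": "red character",
    "goal": "green flag"
  }
}
""")

# Hypothetical template text using the maze placeholders.
template = "The player is the {player}. Reach the {goal} without crossing any {wall}."

# Unused keys (here "floor") are simply ignored by str.format.
prompt = template.format(**description["visual_description"])
print(prompt)
# prints: The player is the red character. Reach the green flag without crossing any gray stone wall.
```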
