Add dynamic prompt template system for skin-aware VLM evaluation #9
This PR introduces a dynamic prompt template system that lets VLM evaluation prompts adapt to different visual themes (skins). Instead of using hardcoded visual descriptions, prompts now load descriptions dynamically from each skin's `description.json` file.

Motivation
Previously, prompt templates contained fixed visual descriptions (e.g., "red wall", "blue player") that didn't match the actual skin being evaluated. This caused a mismatch between what the VLM sees in the image/video and what the prompt describes, potentially affecting evaluation accuracy.
Changes
Core System (`evaluation/vlm_eval/prompts/`)
- `__init__.py`: Added `load_skin_description()` and `get_dynamic_prompt()` functions
  - Reads `visual_description` from the skin's `description.json`
  - Name mapping (`3dmaze` → `maze3d`)
- `*_prompt.py`: Converted all prompt templates to use placeholders
  - `maze_prompt.py`: `{wall}`, `{floor}`, `{player}`, `{goal}`
  - `sokoban_prompt.py`: `{wall}`, `{floor}`, `{player}`, `{box}`, `{target}`
  - `trapfield_prompt.py`: `{floor}`, `{trap}`, `{player}`, `{goal}`
  - `pathfinder_prompt.py`: `{wall}`, `{floor}`, `{player}`, `{goal}`
  - `maze3d_prompt.py`: `{start_cube}`, `{goal_cube}`, `{default_cube}`, `{ball}`

Executors (`evaluation/vlm_eval/executors/`)
- Use `get_dynamic_prompt()` instead of static templates

3D Maze Enhancements (`games/maze3d/`)
- `color_handler.py` (new): Manages loading of color configuration from skin directories
- `adapter.py`: Now requires `assets_folder`; removed the default color fallback
- `main.py`: Strict color validation; removed all default value fallbacks

Skin Description Files (`skins/`)
Added `description.json` for all existing skins.
Added `colors.json` for maze3d skins (rendering colors).

Utilities
- `scripts/analyze_skins_description.py`: Tool to analyze skin textures and generate `description.json` files from them

Breaking Changes
- `description.json`: A missing file or missing required keys will raise an error
- `colors.json`: No default colors fallback
- `assets_folder` is now required for VLM evaluation

File Structure
Example `description.json`:

    {
      "visual_description": {
        "wall": "gray stone wall",
        "floor": "wooden floor",
        "player": "red character",
        "goal": "green flag"
      }
    }
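
To illustrate how a placeholder template and a `description.json` like the one above might fit together, here is a minimal sketch. It reuses the function names from this PR, but the signatures, the template text, and the `REQUIRED_KEYS` tuple are assumptions made for illustration. The actual code in `evaluation/vlm_eval/prompts/__init__.py` may differ.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical template text; the real templates live in *_prompt.py.
MAZE_TEMPLATE = (
    "The maze has {wall} cells and {floor} cells. "
    "Move the {player} to the {goal}."
)
# Assumed list of keys the maze template needs.
REQUIRED_KEYS = ("wall", "floor", "player", "goal")


def load_skin_description(skin_dir):
    """Return the visual_description mapping from <skin_dir>/description.json.

    Raises instead of falling back to defaults, mirroring the strict
    behaviour listed under Breaking Changes.
    """
    path = Path(skin_dir) / "description.json"
    if not path.is_file():
        raise FileNotFoundError(f"Missing skin description: {path}")
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    if "visual_description" not in data:
        raise KeyError(f"'visual_description' missing in {path}")
    return data["visual_description"]


def get_dynamic_prompt(template, skin_dir, required_keys):
    """Fill the template placeholders with the skin's descriptions."""
    description = load_skin_description(skin_dir)
    missing = [key for key in required_keys if key not in description]
    if missing:
        raise KeyError(f"description.json is missing required keys: {missing}")
    return template.format(**description)


# Usage: write the example description.json to a temporary skin folder.
with tempfile.TemporaryDirectory() as skin_dir:
    example = {
        "visual_description": {
            "wall": "gray stone wall",
            "floor": "wooden floor",
            "player": "red character",
            "goal": "green flag",
        }
    }
    (Path(skin_dir) / "description.json").write_text(
        json.dumps(example), encoding="utf-8"
    )
    prompt = get_dynamic_prompt(MAZE_TEMPLATE, skin_dir, REQUIRED_KEYS)
    print(prompt)
```

Checking the required keys before calling `str.format` gives an error that lists every missing key at once, rather than the bare `KeyError` that `format` would raise on the first missing placeholder.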