The GRPO DSL supports only CLICK, TYPE, WAIT, and DONE, while the schema (episode.py) defines 24 action types. RL training needs at least scroll, key, and noop.
_format_action_as_text silently converts unknown action types to DONE, so steps that use an unsupported action lose their gradient signal. Extend the parser, formatter, and prompt to cover the full action vocabulary.
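
A minimal sketch of one possible shape for the fix, not the actual implementation. The `Action` dataclass, the `ACTION_ARGS` table, the argument names, and the `NAME(arg=value)` text syntax are all assumptions made for illustration; the real DSL syntax and the schema in episode.py may differ. The point it shows is that the formatter and parser share one table of supported actions and raise on anything outside it instead of falling back to DONE.

```python
import ast
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Action:
    """Hypothetical stand-in for the action type defined in episode.py."""
    type: str
    args: Dict[str, Any] = field(default_factory=dict)


# Assumed DSL vocabulary: action name -> ordered argument names.
# Only the seven actions named in this issue are listed; the remaining
# schema types would be added to this table the same way.
ACTION_ARGS: Dict[str, Tuple[str, ...]] = {
    "CLICK": ("x", "y"),
    "TYPE": ("text",),
    "WAIT": (),
    "DONE": (),
    "SCROLL": ("direction", "amount"),
    "KEY": ("key",),
    "NOOP": (),
}


def format_action_as_text(action: Action) -> str:
    """Render an action as DSL text; raise instead of emitting DONE."""
    name = action.type.upper()
    if name not in ACTION_ARGS:
        raise ValueError(f"unsupported action type: {action.type!r}")
    # A missing argument raises KeyError rather than being silently dropped.
    parts = [f"{arg}={action.args[arg]!r}" for arg in ACTION_ARGS[name]]
    return f"{name}({', '.join(parts)})"


def parse_action(text: str) -> Action:
    """Parse DSL text back into an Action, rejecting unknown names."""
    try:
        call = ast.parse(text.strip(), mode="eval").body
    except SyntaxError as exc:
        raise ValueError(f"malformed action: {text!r}") from exc
    if (
        not isinstance(call, ast.Call)
        or not isinstance(call.func, ast.Name)
        or call.args  # positional arguments are not part of this sketch
    ):
        raise ValueError(f"malformed action: {text!r}")
    name = call.func.id.upper()
    if name not in ACTION_ARGS:
        raise ValueError(f"unsupported action type: {name!r}")
    # literal_eval accepts only literals, so no model output is executed.
    args = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    if tuple(args) != ACTION_ARGS[name]:
        raise ValueError(f"wrong arguments for {name}: {sorted(map(str, args))}")
    return Action(name, args)
```

With a shared table, the prompt's action list could also be generated from `ACTION_ARGS`, so the three pieces cannot drift apart. Whether an unsupported action should raise, be logged, or be skipped during training is a design choice this sketch does not settle.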