Support for Multi-Modal Generation #417
Replies: 1 comment
An example would be the ARC-AGI benchmarks, where evolutionary program synthesis is already the top-performing approach. The ARC Prize 2025 technical report identifies evolutionary program synthesis as the defining theme of last year's winning results. For ARC-AGI-1/2, you'd evolve Python programs that transform colored grids, with the evaluator checking pixel-perfect matches and optionally using a VLM to provide visual feedback on patterns.

Even more interesting is ARC-AGI-3, launching March 25, 2026. It moves from static grids to interactive game-like environments where agents must explore, learn, and adapt through visual perception. You could use OpenEvolve to evolve agent strategies that process game frames (images) and select actions, with the evaluator scoring based on game completion efficiency. This is multi-modal by design, since the agent needs to perceive visual state, reason about mechanics, and plan actions.

We'd love to see someone build an ARC-AGI example for OpenEvolve. Happy to help scope it out!
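To make the ARC-AGI-1/2 idea concrete, here is a minimal sketch of what such an evaluator could look like. This is a hypothetical illustration, not OpenEvolve's actual evaluator API: the `transform` callable stands in for an evolved candidate program, and `evaluate` scores it by the fraction of training pairs it solves with a pixel-perfect match.

```python
# Hypothetical ARC-style evaluator sketch. An evolved candidate exposes a
# transform(grid) function; the evaluator scores it on pixel-perfect matches.

def evaluate(transform, train_pairs):
    """Score a candidate transform on ARC-style training pairs.

    transform: callable mapping an input grid (list of lists of ints)
               to a predicted output grid.
    train_pairs: list of (input_grid, expected_output_grid) tuples.
    Returns the fraction of pairs matched exactly, in [0.0, 1.0].
    """
    solved = 0
    for inp, expected in train_pairs:
        try:
            predicted = transform(inp)
        except Exception:
            continue  # a crashing candidate scores nothing on this pair
        if predicted == expected:  # pixel-perfect match required
            solved += 1
    return solved / len(train_pairs) if train_pairs else 0.0


# Toy example: a task whose hidden rule is "swap colors 1 and 2".
def candidate(grid):
    swap = {1: 2, 2: 1}
    return [[swap.get(c, c) for c in row] for row in grid]

pairs = [
    ([[1, 0], [0, 2]], [[2, 0], [0, 1]]),
    ([[2, 2], [1, 1]], [[1, 1], [2, 2]]),
]
score = evaluate(candidate, pairs)  # 1.0: both pairs match exactly
```

In a real setup the score could be combined with secondary signals (program length, runtime, or VLM-based visual feedback on near-miss patterns) to give the evolutionary loop a smoother gradient than pass/fail alone.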
Hi all,
Does the OpenEvolve framework currently support "multi-modal generation"? If so, could you tell me what a "multi-modal" task looks like?