microsoft/UFO

UFO³: Weaving the Digital Agent Galaxy

From Single Device Agent to Multi-Device Galaxy

📚 Quick Links: 🌌 UFO³ README | 🖥️ UFO² README | 📖 Full Documentation


🎯 Choose Your Path

UFO³ Multi-Device Agent Galaxy

✨ NEW & RECOMMENDED

Perfect for:

  • 🔗 Cross-device collaboration workflows
  • 📊 Complex multi-step automation
  • 🎯 DAG-based task orchestration
  • 🌍 Heterogeneous platform integration

Key Features:

  • Constellation: Task decomposition into executable DAGs
  • Dynamic DAG editing for adaptive workflow evolution
  • Asynchronous execution with parallel task coordination
  • Unified AIP protocol for secure agent communication

📖 Galaxy Documentation →
📖 Galaxy Quick Start →
Online Docs

UFO² Desktop AgentOS

STABLE & BATTLE-TESTED

Perfect for:

  • 💻 Single Windows automation
  • ⚡ Quick task execution
  • 🎓 Learning agent basics
  • 🛠️ Simple workflows

Key Features:

  • Deep Windows OS integration
  • Hybrid GUI + API actions
  • Proven reliability
  • Easy setup
  • Can serve as a Galaxy device agent

📖 UFO² Documentation →


🎬 See UFO³ Galaxy in Action

Watch how UFO³ Galaxy orchestrates complex workflows across multiple devices:

UFO³ Galaxy Demo

🎥 Click to watch: Cross-device task orchestration with UFO³ Galaxy


🌟 What's New in UFO³?

Evolution Timeline

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#E8F4F8','primaryTextColor':'#1A1A1A','primaryBorderColor':'#7CB9E8','lineColor':'#A8D5E2','secondaryColor':'#B8E6F0','tertiaryColor':'#D4F1F4','fontSize':'16px','fontFamily':'Segoe UI, Arial, sans-serif'}}}%%
graph LR
    A["<b>🎈 UFO</b><br/><span style='font-size:14px'>February 2024</span><br/><span style='font-size:13px; color:#666'><i>GUI Agent for Windows</i></span>"] 
    B["<b>🖥️ UFO²</b><br/><span style='font-size:14px'>April 2025</span><br/><span style='font-size:13px; color:#666'><i>Desktop AgentOS</i></span>"]
    C["<b>🌌 UFO³ Galaxy</b><br/><span style='font-size:14px'>November 2025</span><br/><span style='font-size:13px; color:#666'><i>Multi-Device Orchestration</i></span>"]
    
    A -->|Evolve| B
    B -->|Scale| C
    
    style A fill:#E8F4F8,stroke:#7CB9E8,stroke-width:2.5px,color:#1A1A1A,rx:15,ry:15
    style B fill:#C5E8F5,stroke:#5BA8D0,stroke-width:2.5px,color:#1A1A1A,rx:15,ry:15
    style C fill:#A4DBF0,stroke:#3D96BE,stroke-width:2.5px,color:#1A1A1A,rx:15,ry:15

🚀 UFO³ = Galaxy (Multi-Device Orchestration) + UFO² (Device Agent)

UFO³ introduces Galaxy, a revolutionary multi-device orchestration framework that coordinates intelligent agents across heterogeneous platforms. It is built on five tightly integrated design principles:

  1. 🌟 Declarative Decomposition into Dynamic DAG - Requests are decomposed into a structured DAG of TaskStars and dependencies, enabling automated scheduling and runtime rewriting

  2. 🔄 Continuous Result-Driven Graph Evolution - Living constellation that adapts to execution feedback through controlled rewrites and dynamic adjustments

  3. ⚡ Heterogeneous, Asynchronous & Safe Orchestration - Capability-based device matching with async execution, safe locking, and formally verified correctness

  4. 🔌 Unified Agent Interaction Protocol (AIP) - WebSocket-based secure coordination layer with fault tolerance and automatic reconnection

  5. 🛠️ Template-Driven MCP-Empowered Device Agents - Lightweight toolkit for rapid agent development with MCP integration for tool augmentation
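
The first two principles can be illustrated with a small sketch. This is not Galaxy's implementation: the `TaskGraph` class, its methods, and the task names are hypothetical, and the only "controlled rewrite" rule modeled here is that a runtime edit must keep the graph acyclic.

```python
from collections import defaultdict


class TaskGraph:
    """Toy task DAG (hypothetical, not a Galaxy class).

    An edge before -> after means `after` depends on `before`.
    """

    def __init__(self):
        self.tasks = set()
        self.successors = defaultdict(set)

    def add_task(self, name):
        self.tasks.add(name)

    def _reachable(self, start, target):
        # Iterative depth-first search from `start`, looking for `target`.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.successors[node])
        return False

    def add_dependency(self, before, after):
        """Add edge before -> after, rejecting edits that would create a cycle."""
        if before not in self.tasks or after not in self.tasks:
            raise KeyError("both tasks must exist before adding a dependency")
        if self._reachable(after, before):
            raise ValueError(f"{before} -> {after} would create a cycle")
        self.successors[before].add(after)


graph = TaskGraph()
for name in ("collect_logs", "summarize", "send_report"):
    graph.add_task(name)
graph.add_dependency("collect_logs", "summarize")
graph.add_dependency("summarize", "send_report")

# Runtime rewrite: execution feedback suggests an extra validation step.
graph.add_task("validate_summary")
graph.add_dependency("summarize", "validate_summary")
graph.add_dependency("validate_summary", "send_report")
```

An edit such as `graph.add_dependency("send_report", "collect_logs")` raises `ValueError`, so the graph stays schedulable after every rewrite.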

| Aspect | UFO² | UFO³ Galaxy |
|---|---|---|
| Architecture | Single Windows Agent | Multi-Device Orchestration |
| Task Model | Sequential ReAct Loop | DAG-based Constellation Workflows |
| Scope | Single device, multi-app | Multi-device, cross-platform |
| Coordination | HostAgent + AppAgents | ConstellationAgent + TaskOrchestrator |
| Device Support | Windows Desktop | Windows, Linux, Android (more coming) |
| Task Planning | Application-level | Device-level with dependencies |
| Execution | Sequential | Parallel DAG execution |
| Device Agent Role | Standalone | Can serve as Galaxy device agent |
| Complexity | Simple to Moderate | Simple to Very Complex |
| Learning Curve | Low | Moderate |
| Cross-Device Collaboration | ❌ Not Supported | ✅ Core Feature |
| Setup Difficulty | ✅ Easy | ⚠️ Moderate |
| Status | ✅ LTS (Long-Term Support) | ⚡ Active Development |

🎓 Migration Path

For UFO² Users:

  1. Keep using UFO² – Fully supported, actively maintained
  2. 🔄 Gradual adoption – Galaxy can use UFO² as a Windows device agent
  3. 📈 Scale up – Move to Galaxy when you need multi-device capabilities
  4. 📚 Learning resources – Migration Guide

✨ Capabilities at a Glance

🌌 Galaxy Framework – What's Different?

🌟 Constellation Planning

User Request
     ↓
ConstellationAgent
     ↓
  [Task DAG]
   /   |   \
Task1 Task2 Task3
(Win) (Linux)(Mac)

Benefits:

  • Cross-device dependency tracking
  • Parallel execution optimization
  • Cross-device dataflow management
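
To make the parallel-execution benefit concrete, here is a minimal sketch that groups a task DAG into waves of tasks that could run at the same time. It uses only Python's standard `graphlib` module (Python 3.9+); the task names and the `parallel_waves` helper are assumptions for illustration, not part of Galaxy's API.

```python
from graphlib import TopologicalSorter

# Hypothetical constellation: each task maps to the set of tasks it depends on.
dependencies = {
    "export_data_win": set(),
    "train_model_linux": {"export_data_win"},
    "capture_screens_android": set(),
    "build_report_win": {"train_model_linux", "capture_screens_android"},
}


def parallel_waves(deps):
    """Group tasks into waves; tasks in one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


for index, wave in enumerate(parallel_waves(dependencies), start=1):
    print(f"wave {index}: {wave}")
```

In this example the Windows export and the Android capture land in the first wave, because neither depends on the other.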

🎯 Device Assignment

Selection Criteria
  • Platform
  • Resource
  • Task requirements
  • Performance history
        ↓
  Auto-Assignment
        ↓
  Optimal Devices

Smart Matching:

  • Capability-based selection
  • Real-time resource monitoring
  • Dynamic reallocation
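
Capability-based selection can be sketched as a filter followed by a ranking. The `Device` fields, the `load` metric, and the `pick_device` function below are hypothetical; Galaxy's actual matching logic and device schema may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    platform: str
    capabilities: set = field(default_factory=set)
    load: float = 0.0  # hypothetical metric: 0.0 idle, 1.0 saturated


def pick_device(devices, platform, required):
    """Return the least-loaded device on `platform` offering every required capability."""
    candidates = [
        device for device in devices
        if device.platform == platform and required <= device.capabilities
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda device: device.load)


fleet = [
    Device("win-01", "windows", {"excel", "outlook"}, load=0.7),
    Device("win-02", "windows", {"excel"}, load=0.2),
    Device("linux-01", "linux", {"bash", "docker"}, load=0.1),
]

chosen = pick_device(fleet, "windows", {"excel"})
print(chosen.device_id if chosen else "no match")
```

Re-running `pick_device` with fresh `load` values is one simple way to model dynamic reallocation.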

📊 Orchestration

Task1 → Running  ✅
Task2 → Pending  ⏸️
Task3 → Running  🔄
        ↓
   Completion
        ↓
   Final Report

Orchestration:

  • Real-time status updates
  • Automatic error recovery
  • Progress tracking with feedback
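
The status flow above can be approximated with `asyncio`. This is a sketch under stated assumptions, not the TaskOrchestrator itself: the `DEPS` graph, `run_task`, and `orchestrate` are hypothetical, and `asyncio.sleep` stands in for dispatching work to a device agent.

```python
import asyncio

# Hypothetical DAG: task -> set of prerequisite tasks.
DEPS = {
    "task1": set(),
    "task2": {"task1"},
    "task3": set(),
    "report": {"task2", "task3"},
}


async def run_task(name, status):
    """Simulate remote execution, then record completion."""
    await asyncio.sleep(0.01)  # stand-in for work done on a device agent
    status[name] = "completed"


async def orchestrate(deps):
    """Start each task once its prerequisites complete; return statuses and start order."""
    status = {name: "pending" for name in deps}
    start_order = []
    running = set()
    while any(state != "completed" for state in status.values()):
        for name, prereqs in deps.items():
            if status[name] == "pending" and all(status[p] == "completed" for p in prereqs):
                status[name] = "running"
                start_order.append(name)
                running.add(asyncio.create_task(run_task(name, status)))
        if not running:
            raise RuntimeError("nothing runnable: dependency cycle")
        done, running = await asyncio.wait(running, return_when=asyncio.FIRST_COMPLETED)
        for finished in done:
            finished.result()  # re-raise any exception from the task
    return status, start_order


final_status, order = asyncio.run(orchestrate(DEPS))
print(final_status)
print(order)
```

Independent tasks (`task1` and `task3` here) run concurrently, while `report` starts only after both of its prerequisites have completed. Error recovery is out of scope for this sketch; a failed task simply re-raises.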

🪟 UFO² Desktop AgentOS – Core Strengths

UFO² serves dual roles: standalone Windows automation and Galaxy device agent for Windows platforms.

| Feature | Description |
|---|---|
| Deep OS Integration | Windows UIA, Win32, WinCOM native control |
| Hybrid Actions | GUI clicks + API calls for optimal performance |
| Speculative Multi-Action | Batch predictions → 51% fewer LLM calls |
| Visual + UIA Detection | Hybrid control detection for robustness |
| Knowledge Substrate | RAG with docs, demos, execution traces |
| Device Agent Role | Can serve as Windows executor in Galaxy orchestration |

As Galaxy Device Agent:

  • Receives tasks from ConstellationAgent via Galaxy orchestration layer
  • Executes Windows-specific operations using proven UFO² capabilities
  • Reports status and results back to TaskOrchestrator
  • Participates in cross-device workflows seamlessly

🚀 Quick Start Guide

Choose your path and follow the detailed setup guide:

🌌 Galaxy Quick Start

For cross-device orchestration

# 1. Install
pip install -r requirements.txt

# 2. Configure ConstellationAgent
copy config\galaxy\agent.yaml.template config\galaxy\agent.yaml
# Edit and add your API keys

# 3. Configure devices
# Edit config\galaxy\devices.yaml to register your devices

# 4. Start device agents (with platform flags)
# Windows: Start server + client
# Linux: Start server + MCP servers + client  
# Mobile (Android): Start server + MCP servers + client
# See platform-specific guides for detailed setup

# 5. Launch Galaxy
python -m galaxy --interactive

📖 Complete Guide:

🪟 UFO² Quick Start

For Windows automation

# 1. Install
pip install -r requirements.txt

# 2. Configure
copy config\ufo\agents.yaml.template config\ufo\agents.yaml
# Edit and add your API keys

# 3. Run
python -m ufo --task <task_name>

📖 Complete Guide:

📋 Common Configuration

Both frameworks require LLM API configuration. Choose your provider:

OpenAI Configuration

For Galaxy (config/galaxy/agent.yaml):

CONSTELLATION_AGENT:
  REASONING_MODEL: false
  API_TYPE: "openai"
  API_BASE: "https://api.openai.com/v1/chat/completions"
  API_KEY: "sk-your-key-here"
  API_MODEL: "gpt-4o"

For UFO² (config/ufo/agents.yaml):

VISUAL_MODE: True
API_TYPE: "openai"
API_BASE: "https://api.openai.com/v1/chat/completions"
API_KEY: "sk-your-key-here"
API_MODEL: "gpt-4o"

Azure OpenAI Configuration

For Galaxy (config/galaxy/agent.yaml):

CONSTELLATION_AGENT:
  REASONING_MODEL: false
  API_TYPE: "aoai"
  API_BASE: "https://YOUR-RESOURCE.openai.azure.com"
  API_KEY: "your-azure-key"
  API_MODEL: "gpt-4o"
  API_DEPLOYMENT_ID: "your-deployment-id"

For UFO² (config/ufo/agents.yaml):

VISUAL_MODE: True
API_TYPE: "aoai"
API_BASE: "https://YOUR-RESOURCE.openai.azure.com"
API_KEY: "your-azure-key"
API_MODEL: "gpt-4o"
API_DEPLOYMENT_ID: "your-deployment-id"

💡 More LLM Options: See Model Configuration Guide for Qwen, Gemini, Claude, and more.
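
Before launching either framework, it can help to check that the agent settings are complete. The sketch below works on a plain dictionary, as you would get after loading the YAML file. The `missing_keys` helper is hypothetical, and the rule that `API_DEPLOYMENT_ID` is needed for `aoai` is inferred from the examples above rather than taken from the project's validation code.

```python
# Keys that appear in every example configuration above.
REQUIRED_KEYS = {"API_TYPE", "API_BASE", "API_KEY", "API_MODEL"}


def missing_keys(agent_config):
    """Return the sorted keys that are absent or empty for the chosen API_TYPE."""
    required = set(REQUIRED_KEYS)
    # Assumption: Azure OpenAI ("aoai") additionally needs a deployment id.
    if agent_config.get("API_TYPE") == "aoai":
        required.add("API_DEPLOYMENT_ID")
    return sorted(key for key in required if not agent_config.get(key))


azure_config = {
    "API_TYPE": "aoai",
    "API_BASE": "https://YOUR-RESOURCE.openai.azure.com",
    "API_KEY": "your-azure-key",
    "API_MODEL": "gpt-4o",
}
print(missing_keys(azure_config))  # ['API_DEPLOYMENT_ID']
```

For Galaxy, pass the mapping under `CONSTELLATION_AGENT`; for UFO², pass the top-level mapping from `config/ufo/agents.yaml`.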


📚 Documentation Structure

🌌 Galaxy Documentation

📖 Technical Documentation:

🪟 UFO² Documentation

📖 Online Docs:


📢 Latest Updates

2025-11 – UFO³ Galaxy Framework Released 🌌

Major Research Breakthrough: Multi-Device Orchestration System

  • 🌟 Declarative DAG Decomposition: TaskConstellation structure for workflow logic and dependencies
  • 🔄 Dynamic Graph Evolution: Living constellation that adapts through controlled rewrites
  • 🎯 Heterogeneous Orchestration: Safe, asynchronous execution with capability-based device matching
  • 🔌 Unified AIP Protocol: WebSocket-based secure agent coordination with fault tolerance
  • 🛠️ MCP-Empowered Agent Framework: Template-driven toolkit for rapid device agent development
  • 📄 Research Paper: UFO³: Weaving the Digital Agent Galaxy

Key Features:

  • First multi-device orchestration framework for GUI agents
  • Result-driven adaptive execution instead of rigid workflows
  • Model Context Protocol (MCP) integration for tool augmentation
  • Formally verified correctness and concurrency safety guarantees

2025-04 – UFO² v2.0.0

  • 📅 UFO² Desktop AgentOS released
  • 🏗️ Enhanced architecture with AgentOS concept
  • 📄 Technical Report published
  • ✅ Entered Long-Term Support (LTS) status

2024-02 – Original UFO

  • 🎈 First UFO release - UI-Focused agent for Windows
  • 📄 Original Paper
  • 🌍 Wide media coverage and adoption

📚 Citation

If you use UFO³ Galaxy or UFO² in your research, please cite the relevant papers:

UFO³ Galaxy Framework (2025)

@article{zhang2025ufo3,
  title   = {UFO$^3$: Weaving the Digital Agent Galaxy},
  author = {Zhang, Chaoyun and Li, Liqun and Huang, He and Ni, Chiming and Qiao, Bo and Qin, Si and Kang, Yu and Ma, Minghua and Lin, Qingwei and Rajmohan, Saravan and Zhang, Dongmei},
  journal = {arXiv preprint arXiv:2511.11332},
  year    = {2025},
}

UFO² Desktop AgentOS (2025)

@article{zhang2025ufo2,
  title   = {{UFO2: The Desktop AgentOS}},
  author  = {Zhang, Chaoyun and Huang, He and Ni, Chiming and Mu, Jian and Qin, Si and He, Shilin and Wang, Lu and Yang, Fangkai and Zhao, Pu and Du, Chao and Li, Liqun and Kang, Yu and Jiang, Zhao and Zheng, Suzhen and Wang, Rujia and Qian, Jiaxu and Ma, Minghua and Lou, Jian-Guang and Lin, Qingwei and Rajmohan, Saravan and Zhang, Dongmei},
  journal = {arXiv preprint arXiv:2504.14603},
  year    = {2025}
}

Original UFO (2024)

@article{zhang2024ufo,
  title   = {{UFO: A UI-Focused Agent for Windows OS Interaction}},
  author  = {Zhang, Chaoyun and Li, Liqun and He, Shilin and Zhang, Xu and Qiao, Bo and Qin, Si and Ma, Minghua and Kang, Yu and Lin, Qingwei and Rajmohan, Saravan and Zhang, Dongmei and Zhang, Qi},
  journal = {arXiv preprint arXiv:2402.07939},
  year    = {2024}
}

🌐 Media & Community

Media Coverage:

Community:


🎨 Related Projects & Research

Microsoft Research:

  • TaskWeaver – Code-first LLM agent framework for data analytics and task automation

GUI Agent Research:

Multi-Agent Systems:

  • UFO³ Galaxy represents a novel approach to multi-device orchestration, introducing the Constellation framework for coordinating heterogeneous agents across platforms
  • Builds on multi-agent coordination research while addressing unique challenges of cross-device GUI automation

Benchmarks:


💡 FAQ

🤔 Should I use Galaxy or UFO²?

Start with UFO² if:

  • You only need Windows automation
  • You want quick setup and learning
  • Tasks are relatively simple

Choose Galaxy if:

  • You need cross-device coordination
  • Tasks are complex and multi-step
  • You want advanced orchestration
  • You're comfortable with active development

Hybrid approach if:

  • You want the best of both worlds
  • Some tasks are simple (UFO²), some complex (Galaxy)
  • You're gradually migrating

⚠️ Will UFO² be deprecated?

No! UFO² has entered Long-Term Support (LTS) status:

  • ✅ Actively maintained
  • ✅ Bug fixes and security updates
  • ✅ Performance improvements
  • ✅ Full community support
  • ✅ No plans for deprecation

UFO² is the stable, proven solution for Windows automation.

🔄 How do I migrate from UFO² to Galaxy?

Migration is gradual and optional:

  1. Phase 1: Learn – Understand Galaxy concepts
  2. Phase 2: Experiment – Try Galaxy with non-critical tasks
  3. Phase 3: Hybrid – Use both frameworks
  4. Phase 4: Migrate – Gradually move complex tasks to Galaxy

No forced migration! Continue using UFO² as long as it meets your needs.

See Migration Guide for details.

🎯 Can Galaxy do everything UFO² does?

Functionally: Yes. Galaxy can use UFO² as a Windows device agent.

Practically: It depends.

  • For simple Windows tasks: UFO² standalone is easier and more streamlined
  • For complex workflows: Galaxy orchestrates UFO² with other device agents

Recommendation: Use the right tool for the job. UFO² can work standalone or as Galaxy's Windows device agent.

📊 How mature is Galaxy?

Status: Active Development 🚧

Stable:

  • ✅ Core architecture
  • ✅ DAG orchestration
  • ✅ Basic multi-device support
  • ✅ Event system

In Development:

  • 🔨 Advanced device types
  • 🔨 Enhanced monitoring
  • 🔨 Performance optimization
  • 🔨 Extended documentation

Recommendation: Great for experimentation and non-critical workflows.

🔧 Can I extend or customize?

Both frameworks are highly extensible:

UFO²:

  • Custom actions and automators
  • Custom knowledge sources (RAG)
  • Custom control detectors
  • Custom evaluation metrics

Galaxy:

  • Custom agents
  • Custom device types
  • Custom orchestration strategies
  • Custom visualization components

See respective documentation for extension guides.

🤝 How can I contribute?

We welcome contributions to both UFO² and Galaxy!

Ways to contribute:

  • 🐛 Report bugs and issues
  • 💡 Suggest features and improvements
  • 📝 Improve documentation
  • 🧪 Add tests and examples
  • 🔧 Submit pull requests

See CONTRIBUTING.md for guidelines.


⚠️ Disclaimer & License

Disclaimer: By using this software, you acknowledge and agree to the terms in DISCLAIMER.md.

License: This project is licensed under the MIT License.

Trademarks: Use of Microsoft trademarks follows Microsoft's Trademark Guidelines.


🚀 Ready to Get Started?

🌌 Explore Galaxy

Multi-Device Orchestration

Start Galaxy

🪟 Try UFO²

Windows Desktop Agent

Start UFO²


© Microsoft 2025 | UFO³ is an open-source research project

⭐ Star us on GitHub | 🤝 Contribute | 📖 Read the docs | 💬 Join discussions


From Single Agent to Digital Galaxy
UFO³ - Weaving the Future of Intelligent Automation