SIMA Minecraft Agent

Two powerful approaches to AI agents for Minecraft:

🎮 Conversational Minecraft Agent (NEW!)

Chat with an AI that plays Minecraft for you!

A conversational assistant that:

💬 Chats in natural language: "Go mine that tree" or "Build a house here"
🎮 Controls real Minecraft: Direct keyboard/mouse automation
🤖 Executes complex tasks: Understands multi-step instructions
🧠 Learns from conversation: Remembers context and preferences

Perfect for: Playing Minecraft with an AI companion, automating repetitive tasks, learning Minecraft mechanics

Quick Start

# Install dependencies for chat agent
pip install loguru mss pyautogui pynput openai

# Check installation
python minecraft_chat.py --check-deps

# Start the conversational agent
python minecraft_chat.py

Usage Examples

# With OpenAI API key (recommended)
export OPENAI_API_KEY="your-api-key"
python minecraft_chat.py

# Basic mode (no API key needed)
python minecraft_chat.py

Example Conversation

👤 You: Go forward and mine some wood
🤖 Agent: I'll help you mine those trees! Moving forward and breaking the wood blocks.

👤 You: Now build a small house with that wood
🤖 Agent: Perfect! I'll use the wood we collected to build a cozy house. Starting with a 5x5 foundation...

Available Actions

Movement: "go forward", "turn left", "walk backward"
Mining: "break this block", "mine that tree", "dig down"
Building: "place a block", "build a wall", "make a platform"
Complex: "build a house", "find diamonds", "make a farm"
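
The phrase categories above could be matched without an API key by a simple keyword parser. The sketch below is illustrative only: `ACTION_PATTERNS`, `classify_command`, and `parse_instruction` are hypothetical names, and the patterns are assumptions drawn from the example phrases, not the matching logic used in `minecraft_chat.py`.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical phrase table. Category names follow the list above;
# the regular expressions are illustrative assumptions.
ACTION_PATTERNS: List[Tuple[str, str]] = [
    ("complex", r"\b(build a house|find diamonds|make a farm)\b"),
    ("movement", r"\b(go|walk|move|turn)\b"),
    ("mining", r"\b(break|mine|dig)\b"),
    ("building", r"\b(place|build|make)\b"),
]

# Splits a multi-step instruction on commas, "and", "then", or "and then".
STEP_SEPARATOR = re.compile(r"\s*(?:,|\band then\b|\bthen\b|\band\b)\s*")


def classify_command(text: str) -> Optional[str]:
    """Return the first matching action category, or None if nothing matches."""
    lowered = text.lower()
    for category, pattern in ACTION_PATTERNS:
        if re.search(pattern, lowered):
            return category
    return None


def parse_instruction(text: str) -> List[Tuple[str, Optional[str]]]:
    """Split an instruction into steps and classify each step."""
    steps = [s for s in STEP_SEPARATOR.split(text.strip()) if s]
    return [(step, classify_command(step)) for step in steps]
```

With these patterns, `parse_instruction("Go forward and mine some wood")` yields one movement step followed by one mining step. Complex phrases are checked first so that "build a house" is not classified as a plain building action.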

🔬 Research Agent (Original)

Self-improving agent for AI research

A modular research framework inspired by Google DeepMind's SIMA 2:

Task Setter: Proposes tasks with estimated rewards
Agent: Executes tasks in game environments
Reward Model: Evaluates episode performance
Self-Generated Experience: Stores and manages episode data
Self-Improvement Loop: Orchestrates the learning cycle

Perfect for: AI research, reinforcement learning experiments, academic studies

🚀 Quick Start Options

Option 1: Conversational Agent (Recommended for most users)

# Install dependencies
pip install loguru mss pyautogui pynput openai

# Start chatting with your Minecraft assistant  
python minecraft_chat.py

# Example conversation:
# You: "go forward and mine some wood"
# Agent: "I'll move forward and break those wood blocks for you!"

Option 2: Research Agent (For AI researchers)

# Basic installation
pip install -e .

# Train the research agent
python -m src.main --mode train --env dummy --generations 5

# Train with Minecraft (requires MineRL)
pip install minerl
python -m src.main --mode train --env minecraft --generations 3

📚 Detailed Documentation

Conversational Agent Setup - Complete guide for the chat-based Minecraft assistant
Research Agent Setup - Detailed setup for AI research applications

🏗️ Architecture (Research Agent)

┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│ Task Setter │────│    Agent     │────│ Game Env    │
└─────────────┘    └──────────────┘    └─────────────┘
       │                   │
       │            ┌──────▼──────┐
       │            │  Episodes   │
       │            └──────┬──────┘
       │                   │
       ▼            ┌──────▼──────┐
┌─────────────┐     │ Self-Gen.   │     ┌─────────────┐
│Reward Model │◄────│ Experience  │────►│   Storage   │
└─────────────┘     └─────────────┘     └─────────────┘

│   ├── config/
│   │   └── config.py                 # Configuration classes
│   ├── env/
│   │   ├── base_env.py               # Abstract game environment
│   │   ├── dummy_env.py              # Simple test environment
│   │   ├── minecraft_env.py          # Minecraft environment wrapper ⭐
│   │   ├── io_controller.py          # Game I/O interfaces
│   │   └── vision.py                 # Observation encoding
│   ├── agent/
│   │   ├── policy.py                 # Policy implementations
│   │   └── agent.py                  # Main agent orchestrator
│   ├── tasks/
│   │   ├── task_schema.py            # Task data structures
│   │   └── task_setter.py            # Task generation logic ⭐
│   ├── reward/
│   │   └── reward_model.py           # Episode scoring ⭐
│   ├── experience/
│   │   ├── types.py                  # Core data types
│   │   ├── buffer.py                 # In-memory experience store
│   │   └── storage.py                # Persistent experience store
│   ├── training/
│   │   └── self_improvement_loop.py  # Main training loop
│   └── utils/
│       ├── logging_utils.py          # Structured logging
│       └── seed.py                   # Random seed management
└── tests/
    ├── test_experience.py            # Experience system tests
    └── test_agent_interfaces.py      # Agent integration tests


## Key Components

### Environment Interface
The `GameEnv` abstract base class provides a clean interface for any video game. Two implementations are included:
- `DummyGameEnv` for testing and development
- `MinecraftEnv` for Minecraft gameplay via MineRL or raw control
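
As a rough illustration of what such an interface might look like, the sketch below defines an abstract base class with a toy implementation. Only the class names `GameEnv` and `DummyGameEnv` come from this README. The `reset`/`step` signatures follow the common Gym-style convention and are assumptions, as is the one-dimensional corridor task.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple

Observation = Dict[str, Any]
StepResult = Tuple[Observation, float, bool, Dict[str, Any]]


class GameEnv(ABC):
    """Assumed Gym-style interface: reset() starts an episode, step() advances it."""

    @abstractmethod
    def reset(self) -> Observation:
        ...

    @abstractmethod
    def step(self, action: str) -> StepResult:
        ...


class DummyGameEnv(GameEnv):
    """Toy 1-D corridor: the agent starts at 0 and must reach `goal`."""

    def __init__(self, goal: int = 3, max_steps: int = 20) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> Observation:
        self.position = 0
        self.steps = 0
        return {"position": self.position}

    def step(self, action: str) -> StepResult:
        if action == "forward":
            self.position += 1
        elif action == "backward":
            self.position -= 1
        self.steps += 1
        reached = self.position >= self.goal
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return {"position": self.position}, reward, done, {}
```

Because the agent only depends on `reset` and `step`, a Minecraft-backed environment could be swapped in for the dummy one without changing agent code.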

### Minecraft Support
The system now supports Minecraft through two modes:
- **MineRL**: Uses the MineRL research platform for structured Minecraft gameplay
- **Raw Control**: Direct keyboard/mouse control of Minecraft client (experimental)

Minecraft-specific features:
- Specialized action space for movement, building, mining, crafting
- Minecraft vision encoder for processing pixel observations
- Task templates for common Minecraft objectives (collect wood, build structures, etc.)
- A reward model that understands Minecraft progress indicators
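
One way a discrete action space could map onto raw keyboard and mouse input is sketched below. The enum and table names are hypothetical, and the bindings are the default Minecraft Java Edition controls, which a player may have remapped. The project's actual action space may differ.

```python
from enum import Enum
from typing import Dict, List, Tuple


class MinecraftAction(Enum):
    FORWARD = "forward"
    BACKWARD = "backward"
    STRAFE_LEFT = "strafe_left"
    STRAFE_RIGHT = "strafe_right"
    JUMP = "jump"
    MINE = "mine"
    PLACE = "place"
    OPEN_INVENTORY = "open_inventory"


# (device, input name) pairs; assumes default Java Edition key bindings.
InputEvent = Tuple[str, str]

DEFAULT_BINDINGS: Dict[MinecraftAction, InputEvent] = {
    MinecraftAction.FORWARD: ("key", "w"),
    MinecraftAction.BACKWARD: ("key", "s"),
    MinecraftAction.STRAFE_LEFT: ("key", "a"),
    MinecraftAction.STRAFE_RIGHT: ("key", "d"),
    MinecraftAction.JUMP: ("key", "space"),
    MinecraftAction.MINE: ("mouse", "left"),    # attack / destroy block
    MinecraftAction.PLACE: ("mouse", "right"),  # use item / place block
    MinecraftAction.OPEN_INVENTORY: ("key", "e"),
}

# A policy with a discrete output picks an index into this list.
ACTIONS: List[MinecraftAction] = list(MinecraftAction)


def decode_action(index: int) -> InputEvent:
    """Translate a discrete action index into the input event to send."""
    if not 0 <= index < len(ACTIONS):
        raise ValueError(f"action index out of range: {index}")
    return DEFAULT_BINDINGS[ACTIONS[index]]
```

Keeping the bindings in a table separate from the enum means a remapped control scheme only requires a different dictionary.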

### Experience System
Episodes are stored both in-memory (`ReplayBuffer`) and persistently (`storage.py`) as the "Self-Generated Experience" that drives learning.
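
The two storage tiers might look roughly like the sketch below. `ReplayBuffer` is named in this README, but its methods, the `Episode` fields, and the JSON Lines file format are assumptions made for illustration. The real `buffer.py` and `storage.py` may be organized differently.

```python
import json
from collections import deque
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Deque, Iterable, List


@dataclass
class Episode:
    task: str
    total_reward: float
    num_steps: int


class ReplayBuffer:
    """Bounded in-memory store; the oldest episodes are evicted first."""

    def __init__(self, capacity: int = 1000) -> None:
        self._episodes: Deque[Episode] = deque(maxlen=capacity)

    def add(self, episode: Episode) -> None:
        self._episodes.append(episode)

    def episodes(self) -> List[Episode]:
        return list(self._episodes)

    def __len__(self) -> int:
        return len(self._episodes)


def save_episodes(path: Path, episodes: Iterable[Episode]) -> None:
    """Append episodes to a JSON Lines file, one episode per line."""
    with path.open("a", encoding="utf-8") as f:
        for episode in episodes:
            f.write(json.dumps(asdict(episode)) + "\n")


def load_episodes(path: Path) -> List[Episode]:
    """Read back every episode previously written by save_episodes."""
    with path.open(encoding="utf-8") as f:
        return [Episode(**json.loads(line)) for line in f if line.strip()]
```

An append-only file keeps writes cheap during training, while the bounded buffer caps memory use for whatever sampling the learner performs.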

### Task Generation
The `TaskSetter` proposes new tasks based on previous performance, with hooks for LLM-based task generation.
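
A performance-based proposal rule with an LLM hook could be sketched as follows. Only the `TaskSetter` name comes from this README. The selection heuristic (try untried templates first, then pick the task whose mean score is closest to 0.5) and the `llm_propose` callback are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class TaskSetter:
    def __init__(
        self,
        templates: List[str],
        llm_propose: Optional[Callable[[Dict[str, float]], str]] = None,
    ) -> None:
        self.templates = list(templates)
        # Hypothetical hook: receives mean scores per task, returns a task string.
        self.llm_propose = llm_propose
        self._scores: Dict[str, List[float]] = defaultdict(list)

    def record(self, task: str, score: float) -> None:
        """Store the score an episode achieved on a task."""
        self._scores[task].append(score)

    def mean_scores(self) -> Dict[str, float]:
        return {t: sum(s) / len(s) for t, s in self._scores.items() if s}

    def propose(self) -> str:
        if self.llm_propose is not None:
            return self.llm_propose(self.mean_scores())
        # Try every template at least once before comparing scores.
        untried = [t for t in self.templates if t not in self._scores]
        if untried:
            return untried[0]
        # Assumed heuristic: prefer tasks that are neither solved nor hopeless.
        means = self.mean_scores()
        return min(self.templates, key=lambda t: abs(means[t] - 0.5))
```

Targeting mid-range scores is one common curriculum heuristic; a different rule, such as always choosing the lowest-scoring task, would slot into `propose` the same way.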

### Reward Learning
The `RewardModel` scores episodes, with clear interfaces for plugging in learned reward functions or LLM-based evaluation. It also includes Minecraft-specific scoring based on inventory changes, health preservation, and exploration.
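
A hand-written scoring function combining those three signals might look like the sketch below. The `EpisodeSummary` fields, the weights, and the use of visited chunks as an exploration measure are all assumptions. The formula in `reward_model.py` is not given in this README.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class EpisodeSummary:
    inventory_start: Dict[str, int]
    inventory_end: Dict[str, int]
    health_start: float  # a full Minecraft health bar is 20 points
    health_end: float
    visited_chunks: Set[Tuple[int, int]]


def score_episode(
    ep: EpisodeSummary,
    w_inventory: float = 1.0,
    w_health: float = 0.5,
    w_explore: float = 0.1,
) -> float:
    """Weighted sum of items gained, fraction of health kept, and chunks visited."""
    # Count only gains, so dropping or consuming items is not penalized.
    items_gained = sum(
        max(0, count - ep.inventory_start.get(item, 0))
        for item, count in ep.inventory_end.items()
    )
    health_kept = ep.health_end / ep.health_start if ep.health_start > 0 else 0.0
    explored = len(ep.visited_chunks)
    return w_inventory * items_gained + w_health * health_kept + w_explore * explored
```

The weights are arbitrary defaults. In practice they would need tuning, or replacing with a learned model, so that one signal does not dominate the others.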

## Minecraft Configuration

The Minecraft environment can be configured via CLI arguments or config files:

```python
# Example configuration
# Assumed import path, inferred from src/config/config.py in the project layout
from src.config.config import MinecraftConfig

minecraft_config = MinecraftConfig(
    backend="minerl",  # "minerl" or "raw"
    environment_name="MineRLNavigateDense-v0",  # MineRL environment
    max_steps=1000,
    frame_skip=1,
    render=True,
    action_space="discrete",  # "discrete" or "continuous"
)
```

### MineRL Environments

Supported MineRL environments include:

- `MineRLNavigateDense-v0`: Navigation with dense rewards
- `MineRLTreechop-v0`: Tree chopping tasks
- `MineRLObtainDiamond-v0`: Complex diamond obtaining task

### Raw Control Mode

For direct Minecraft control:

1. Start the Minecraft client.
2. Run with `--env minecraft --minecraft-backend raw`.
3. Ensure the Minecraft window is focused and accessible.

**Note:** Raw control mode is experimental and requires additional setup.

## Development

### Running Tests

    pytest tests/

### Code Quality

    black src/ tests/
    ruff check src/ tests/
    mypy src/

## TODOs for Production Use

### Core System

- **RL Algorithms**: Implement proper policy gradient/Q-learning in `policy.py`
- **Scalability**: Add distributed training and experience storage
- **Vision Models**: Plug in CNN/transformer backbones in `vision.py`

### Environment Integration

- **More Games**: Add support for other games beyond Minecraft
- **Real Input Control**: Complete the raw Minecraft controller implementation
- **MineRL Integration**: Test and optimize MineRL environment performance

### Intelligence Components

- **Reward Learning**: Train neural reward models or integrate LLM evaluation
- **Task Generation**: Add LLM-based creative task generation
- **Hierarchical Tasks**: Implement task decomposition and sub-goals

### Minecraft-Specific Enhancements

- **Advanced Actions**: Implement crafting, building, combat mechanics
- **State Detection**: Add inventory tracking, health monitoring, environment analysis
- **Long-Term Goals**: Add support for complex, multi-step Minecraft objectives

## License

MIT License - see LICENSE file for details.
