ethan0905/OpenSima

SIMA Minecraft Agent

Two powerful approaches to AI agents for Minecraft:

🎮 Conversational Minecraft Agent (NEW!)

Chat with an AI that plays Minecraft for you!

A conversational assistant that:

  • 💬 Chats in natural language: "Go mine that tree" or "Build a house here"
  • 🎮 Controls real Minecraft: Direct keyboard/mouse automation
  • 🤖 Executes complex tasks: Understands multi-step instructions
  • 🧠 Learns from conversation: Remembers context and preferences

Perfect for: Playing Minecraft with an AI companion, automating repetitive tasks, learning Minecraft mechanics

Quick Start

# Install dependencies for chat agent
pip install loguru mss pyautogui pynput openai

# Check installation
python minecraft_chat.py --check-deps

# Start the conversational agent
python minecraft_chat.py

Usage Examples

# With OpenAI API key (recommended)
export OPENAI_API_KEY="your-api-key"
python minecraft_chat.py

# Basic mode (no API key needed; leave OPENAI_API_KEY unset)
python minecraft_chat.py

Example Conversation

👤 You: Go forward and mine some wood
🤖 Agent: I'll help you mine those trees! Moving forward and breaking the wood blocks.

👤 You: Now build a small house with that wood
🤖 Agent: Perfect! I'll use the wood we collected to build a cozy house. Starting with a 5x5 foundation...

Available Actions

  • Movement: "go forward", "turn left", "walk backward"
  • Mining: "break this block", "mine that tree", "dig down"
  • Building: "place a block", "build a wall", "make a platform"
  • Complex: "build a house", "find diamonds", "make a farm"
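As an illustration of how the phrases above could be sorted into action categories, here is a minimal keyword-matching sketch. It is an assumption, not the code in minecraft_chat.py: `KEYWORDS` and `parse_command` are hypothetical names, and the function only classifies a message without touching the keyboard or mouse.

```python
from typing import Optional

# Hypothetical keyword table; the phrases are taken from the action list above.
# "complex" comes first so multi-step requests are checked before simple ones.
KEYWORDS = {
    "complex": ["build a house", "find diamonds", "make a farm"],
    "movement": ["go forward", "turn left", "walk backward"],
    "mining": ["break this block", "mine that tree", "dig down"],
    "building": ["place a block", "build a wall", "make a platform"],
}


def parse_command(message: str) -> Optional[str]:
    """Return the first action category whose phrase appears in the message."""
    text = message.lower()
    for category, phrases in KEYWORDS.items():
        if any(phrase in text for phrase in phrases):
            return category
    return None  # No known phrase matched.
```

A message such as "Please go forward" would be classified as movement, while an unrecognized message returns None so the caller can ask for clarification.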

🔬 Research Agent (Original)

Self-improving agent for AI research

A modular research framework inspired by Google's SIMA 2:

  • Task Setter: Proposes tasks with estimated rewards
  • Agent: Executes tasks in game environments
  • Reward Model: Evaluates episode performance
  • Self-Generated Experience: Stores and manages episode data
  • Self-Improvement Loop: Orchestrates the learning cycle

Perfect for: AI research, reinforcement learning experiments, academic studies
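The way the five components listed above fit together can be sketched as a toy loop. Everything below is illustrative and assumed rather than taken from the repository: the class and method names (`propose`, `run`, `score`), the fixed "collect wood" task, and the step-based scoring rule are placeholders.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    description: str
    estimated_reward: float


@dataclass
class Episode:
    task: Task
    steps: int
    score: float = 0.0


class TaskSetter:
    def propose(self, history: List[Episode]) -> Task:
        # Estimate the reward as the mean of past scores (0.5 with no history).
        estimate = sum(e.score for e in history) / len(history) if history else 0.5
        return Task("collect wood", estimate)


class Agent:
    def __init__(self, rng: random.Random) -> None:
        self.rng = rng

    def run(self, task: Task) -> Episode:
        # Stand-in for acting in an environment: pick a random episode length.
        return Episode(task=task, steps=self.rng.randint(1, 100))


class RewardModel:
    def score(self, episode: Episode) -> float:
        # Toy rule: shorter episodes score higher, in the range [0, 1).
        return 1.0 - episode.steps / 100


def self_improvement_loop(generations: int, seed: int = 0) -> List[Episode]:
    rng = random.Random(seed)
    setter, agent, reward_model = TaskSetter(), Agent(rng), RewardModel()
    experience: List[Episode] = []  # The self-generated experience.
    for _ in range(generations):
        task = setter.propose(experience)
        episode = agent.run(task)
        episode.score = reward_model.score(episode)
        experience.append(episode)
    return experience
```

Each generation feeds the stored episodes back into the task setter, which is the feedback path the loop is meant to show.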

🚀 Quick Start Options

Option 1: Conversational Agent (Recommended for most users)

# Install dependencies
pip install loguru mss pyautogui pynput openai

# Start chatting with your Minecraft assistant  
python minecraft_chat.py

# Example conversation:
# You: "go forward and mine some wood"
# Agent: "I'll move forward and break those wood blocks for you!"

Option 2: Research Agent (For AI researchers)

# Basic installation
pip install -e .

# Train the research agent
python -m src.main --mode train --env dummy --generations 5

# Train with Minecraft (requires MineRL)
pip install minerl
python -m src.main --mode train --env minecraft --generations 3

📚 Detailed Documentation

🏗️ Architecture (Research Agent)

┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│ Task Setter │────│    Agent     │────│ Game Env    │
└─────────────┘    └──────────────┘    └─────────────┘
       │                   │
       │            ┌──────▼──────┐
       │            │  Episodes   │
       │            └──────┬──────┘
       │                   │
       ▼            ┌──────▼──────┐
┌─────────────┐     │ Self-Gen.   │     ┌─────────────┐
│Reward Model │◄────│ Experience  │────►│   Storage   │
└─────────────┘     └─────────────┘     └─────────────┘

│   ├── config/
│   │   └── config.py                 # Configuration classes
│   ├── env/
│   │   ├── base_env.py               # Abstract game environment
│   │   ├── dummy_env.py              # Simple test environment
│   │   ├── minecraft_env.py          # Minecraft environment wrapper ⭐
│   │   ├── io_controller.py          # Game I/O interfaces
│   │   └── vision.py                 # Observation encoding
│   ├── agent/
│   │   ├── policy.py                 # Policy implementations
│   │   └── agent.py                  # Main agent orchestrator
│   ├── tasks/
│   │   ├── task_schema.py            # Task data structures
│   │   └── task_setter.py            # Task generation logic ⭐
│   ├── reward/
│   │   └── reward_model.py           # Episode scoring ⭐
│   ├── experience/
│   │   ├── types.py                  # Core data types
│   │   ├── buffer.py                 # In-memory experience store
│   │   └── storage.py                # Persistent experience store
│   ├── training/
│   │   └── self_improvement_loop.py  # Main training loop
│   └── utils/
│       ├── logging_utils.py          # Structured logging
│       └── seed.py                   # Random seed management
└── tests/
    ├── test_experience.py            # Experience system tests
    └── test_agent_interfaces.py      # Agent integration tests


## Key Components

### Environment Interface
The `GameEnv` abstract base class defines the interface that each game environment implements. The repository includes:
- `DummyGameEnv` for testing and development
- `MinecraftEnv` for Minecraft gameplay via MineRL or raw control
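A minimal sketch of what such an interface could look like is shown below. The class names `GameEnv` and `DummyGameEnv` come from this README, but the Gym-style `reset`/`step` methods, their signatures, and the `max_steps` parameter are assumptions; the actual code in `base_env.py` may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class GameEnv(ABC):
    """Assumed interface: reset the game, then advance it one action at a time."""

    @abstractmethod
    def reset(self) -> Any:
        """Start a new episode and return the first observation."""

    @abstractmethod
    def step(self, action: int) -> Tuple[Any, float, bool, Dict[str, Any]]:
        """Apply an action and return (observation, reward, done, info)."""


class DummyGameEnv(GameEnv):
    """Toy environment: the observation is the step counter."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        self.t += 1
        done = self.t >= self.max_steps
        # The reward simply echoes the action so tests are deterministic.
        return self.t, float(action), done, {}
```

Because `GameEnv` declares abstract methods, it cannot be instantiated directly; only concrete subclasses such as the dummy environment can.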

### Minecraft Support
The system now supports Minecraft through two modes:
- **MineRL**: Uses the MineRL research platform for structured Minecraft gameplay
- **Raw Control**: Direct keyboard/mouse control of Minecraft client (experimental)

Minecraft-specific features:
- Specialized action space for movement, building, mining, crafting
- Minecraft vision encoder for processing pixel observations
- Task templates for common Minecraft objectives (collect wood, build structures, etc.)
- Reward model understanding Minecraft progress indicators

### Experience System
Episodes are stored both in memory (`ReplayBuffer`) and on disk (`storage.py`). Together they form the "Self-Generated Experience" that drives learning.
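The sketch below illustrates the two storage tiers with a bounded in-memory buffer that can be written to and read from a JSON Lines file. Only the name `ReplayBuffer` comes from this README; the methods, the dictionary episode format, and placing `save`/`load` on the buffer itself (rather than in `storage.py`) are simplifying assumptions.

```python
import json
import random
from collections import deque
from pathlib import Path
from typing import Deque, Dict, List


class ReplayBuffer:
    """Bounded in-memory episode store; the oldest episodes are evicted first."""

    def __init__(self, capacity: int) -> None:
        self._episodes: Deque[Dict] = deque(maxlen=capacity)

    def __len__(self) -> int:
        return len(self._episodes)

    def add(self, episode: Dict) -> None:
        self._episodes.append(episode)

    def sample(self, k: int, seed: int = 0) -> List[Dict]:
        # Sample without replacement, capped at the number of stored episodes.
        rng = random.Random(seed)
        return rng.sample(list(self._episodes), min(k, len(self._episodes)))

    def save(self, path: Path) -> None:
        # Persist one JSON object per line.
        with path.open("w", encoding="utf-8") as f:
            for episode in self._episodes:
                f.write(json.dumps(episode) + "\n")

    @classmethod
    def load(cls, path: Path, capacity: int) -> "ReplayBuffer":
        buffer = cls(capacity)
        with path.open(encoding="utf-8") as f:
            for line in f:
                buffer.add(json.loads(line))
        return buffer
```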

### Task Generation
The `TaskSetter` proposes new tasks based on previous performance, with hooks for LLM-based task generation.
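One simple way to propose tasks from previous performance is a curriculum rule: try untested tasks first, then prefer the task whose average score is closest to a target difficulty. The sketch below assumes that rule; `record`, `propose`, and `target` are hypothetical names, and the LLM hook is not shown.

```python
from statistics import mean
from typing import Dict, List, Tuple


class TaskSetter:
    """Toy task setter that picks tasks near a target success level."""

    def __init__(self, templates: List[str], target: float = 0.5) -> None:
        self.templates = templates
        self.target = target
        self.scores: Dict[str, List[float]] = {}

    def record(self, task: str, score: float) -> None:
        self.scores.setdefault(task, []).append(score)

    def propose(self) -> Tuple[str, float]:
        """Return (task, estimated reward)."""
        # Untried tasks come first, with the target as a neutral estimate.
        for task in self.templates:
            if not self.scores.get(task):
                return task, self.target
        # Otherwise pick the task whose mean score is closest to the target.
        best = min(
            self.templates,
            key=lambda t: abs(mean(self.scores[t]) - self.target),
        )
        return best, mean(self.scores[best])
```

With a target of 0.5, a task that the agent already solves almost every time is proposed less often than one it solves about half the time.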

### Reward Learning
The `RewardModel` scores episodes and exposes clear interfaces for plugging in learned reward functions or LLM-based evaluation. It also includes Minecraft-specific scoring based on inventory changes, health preservation, and exploration.
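A weighted sum is one plausible way to combine the three Minecraft signals. The sketch below is an assumption, not the scoring in `reward_model.py`: `EpisodeSummary`, `score_episode`, the `chunks_visited` exploration measure, and the default weights are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class EpisodeSummary:
    inventory_start: Dict[str, int]
    inventory_end: Dict[str, int]
    health_start: float
    health_end: float
    chunks_visited: int  # Assumed proxy for exploration.


def score_episode(
    ep: EpisodeSummary,
    w_inventory: float = 1.0,
    w_health: float = 0.5,
    w_explore: float = 0.1,
) -> float:
    """Combine inventory gains, health preservation, and exploration."""
    # Count only items gained; losing items is not penalized in this sketch.
    gained = sum(
        max(0, count - ep.inventory_start.get(item, 0))
        for item, count in ep.inventory_end.items()
    )
    # Fraction of starting health that remains at the end of the episode.
    health_kept = ep.health_end / ep.health_start if ep.health_start > 0 else 0.0
    return w_inventory * gained + w_health * health_kept + w_explore * ep.chunks_visited
```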

## Minecraft Configuration

The Minecraft environment can be configured via CLI arguments or config files:

```python
# Example configuration
# Assumed import path, based on config/config.py in the project layout
from src.config.config import MinecraftConfig

minecraft_config = MinecraftConfig(
    backend="minerl",  # "minerl" or "raw"
    environment_name="MineRLNavigateDense-v0",  # MineRL environment
    max_steps=1000,
    frame_skip=1,
    render=True,
    action_space="discrete",  # "discrete" or "continuous"
)
```

MineRL Environments

Supported MineRL environments include:

  • MineRLNavigateDense-v0: Navigation with dense rewards
  • MineRLTreechop-v0: Tree chopping tasks
  • MineRLObtainDiamond-v0: Complex diamond obtaining task

Raw Control Mode

For direct Minecraft control:

  1. Start Minecraft client
  2. Use --env minecraft --minecraft-backend raw
  3. Ensure Minecraft window is focused and accessible

Note: Raw control mode is experimental and requires additional setup.
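The command-line flags used in this README can be modeled with a small argparse parser. This is a sketch under assumptions, not the parser in src/main.py: the flag names come from the commands shown above, while the defaults and the allowed choices are guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags that appear in this README."""
    parser = argparse.ArgumentParser(prog="src.main")
    parser.add_argument("--mode", default="train")
    parser.add_argument("--env", choices=["dummy", "minecraft"], default="dummy")
    parser.add_argument("--generations", type=int, default=5)
    parser.add_argument(
        "--minecraft-backend",
        choices=["minerl", "raw"],
        default="minerl",  # Assumed default; raw control is experimental.
    )
    return parser
```

Parsing `--env minecraft --minecraft-backend raw` with this parser yields `args.env == "minecraft"` and `args.minecraft_backend == "raw"`, since argparse converts the hyphen in the flag name to an underscore.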

Development

Running Tests

pytest tests/

Code Quality

black src/ tests/
ruff check src/ tests/
mypy src/

TODOs for Production Use

Core System

  1. RL Algorithms: Implement proper policy gradient/Q-learning in policy.py
  2. Scalability: Add distributed training and experience storage
  3. Vision Models: Plug in CNN/transformer backbones in vision.py

Environment Integration

  1. More Games: Add support for other games beyond Minecraft
  2. Real Input Control: Complete the raw Minecraft controller implementation
  3. MineRL Integration: Test and optimize MineRL environment performance

Intelligence Components

  1. Reward Learning: Train neural reward models or integrate LLM evaluation
  2. Task Generation: Add LLM-based creative task generation
  3. Hierarchical Tasks: Implement task decomposition and sub-goals

Minecraft-Specific Enhancements

  1. Advanced Actions: Implement crafting, building, combat mechanics
  2. State Detection: Add inventory tracking, health monitoring, environment analysis
  3. Long-Term Goals: Add support for complex, multi-step Minecraft objectives

License

MIT License - see LICENSE file for details.

About

Open-sourcing Google's SIMA 2 agent for all of humanity.
