An embodied AI agent with personality, physical awareness, and autonomous behaviors.
Features • Installation • Usage • Architecture • Contributing • License
Assaultron Project ASR-7 is an advanced embodied AI agent that goes far beyond simple chatbots. Inspired by the Assaultron unit, ASR-7 features a layered, behavior-based architecture that allows it to:
- 🧠 Think with intention-based reasoning using LLMs (Ollama, Gemini, or OpenRouter)
- 🎭 Feel emotions and express personality through cognitive states
- 🤖 Embody a virtual (and optionally physical) body with postures, gestures, and movements
- 👁️ Perceive its environment through vision, speech recognition, and sensors
- 🗣️ Communicate via text-to-speech with a custom voice model
- 🎯 Act autonomously through behavior selection and utility-based decision-making
This is not just a conversational AI; it's a character with a body, emotions, and autonomous agency.
"Whatever it is you're looking for, I hope it's worth dying for." - ASR-7
- Multi-LLM Support: Ollama (local), Google Gemini, or OpenRouter
- Intent-Based Reasoning: AI reasons about goals and emotions, not hardware primitives
- Personality System: Consistent character personality with emotional states
- Memory System: Core memories, episodic memory, and context awareness
- Time Awareness: Understands temporal context and schedules
- Behavior Arbiter: Utility-based behavior selection (Intimidate, Friendly, Patrol, etc.)
- Virtual Body: Maintains posture, hand positions, LED states, and physical presence
- Motion Controller: Translates cognitive states into hardware commands (servos, LEDs)
- Gesture System: Dynamic body language and expressive movements
- Speech-to-Text: Real-time voice input via Mistral Voxtral
- Text-to-Speech: Custom voice synthesis using xVASynth
- Web Interface: Flask-based dashboard for monitoring and interaction
- Discord Integration: Optional Discord bot for remote interaction
- Vision System: Real-time object detection using TensorFlow Lite
- Face Detection: MediaPipe-based face tracking
- Environment Modeling: Tracks entities, threat levels, and spatial awareness
- Email Management: Send/receive emails autonomously
- GitHub Integration: Commit, push, and manage repositories
- Task Detection: Identify and track TODO items in conversations
- Sandbox Environment: Safe code execution environment
- Monitoring Dashboard: Real-time performance and state visualization
- ESP32/Arduino: Servo control for physical embodiment
- LED Control: Dynamic lighting for emotional expression
- Serial Communication: Hardware bridge for real-time control
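The serial bridge can be as simple as newline-delimited text commands sent to the microcontroller. A minimal sketch of that idea — the `SERVO`/`LED` command names and the pyserial usage are assumptions for illustration, not the project's actual wire protocol:

```python
def servo_cmd(channel: int, angle: int) -> bytes:
    """Encode a servo command as 'SERVO,<channel>,<angle>\\n' (hypothetical format)."""
    if not 0 <= angle <= 180:
        raise ValueError("servo angle must be 0-180 degrees")
    return f"SERVO,{channel},{angle}\n".encode("ascii")


def led_cmd(r: int, g: int, b: int) -> bytes:
    """Encode an RGB LED command as 'LED,<r>,<g>,<b>\\n' (0-255 per channel)."""
    return f"LED,{r},{g},{b}\n".encode("ascii")


# With pyserial installed, the encoded commands would be written like:
#   import serial
#   with serial.Serial("/dev/ttyUSB0", 115200, timeout=1) as port:
#       port.write(servo_cmd(0, 90))
```

Keeping the encoding in pure functions like this makes the protocol testable without hardware attached.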
- Python 3.9+ (tested on 3.9-3.11)
- LLM Backend (choose one):
  - Ollama (recommended for local/offline use)
  - Google Gemini API key
  - OpenRouter API key
- Audio (optional, for voice features):
  - PyAudio-compatible system
  - xVASynth for TTS
- Hardware (optional):
  - ESP32 or Arduino board
  - Servo motors and LEDs
1. **Clone the repository**

   ```bash
   git clone https://github.com/CamoLover/AssaultronProject.git
   cd AssaultronProject
   ```

2. **Create a virtual environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment variables**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys and preferences
   ```

5. **Start Ollama** (if using a local LLM)

   ```bash
   ollama pull gemma3:4b
   ollama serve
   ```

6. **Run the agent**

   ```bash
   python main.py
   ```

7. **Access the web interface**
   - Open your browser to `http://localhost:8080`
   - Default credentials: `admin` / `your_secure_password_here` (set in `.env`)
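The `.env` file might look roughly like the fragment below. The variable names here are illustrative only — the authoritative list lives in `.env.example`:

```ini
# LLM backend (names are hypothetical; check .env.example for the real keys)
LLM_PROVIDER=ollama
GEMINI_API_KEY=...
OPENROUTER_API_KEY=...

# Web interface
WEB_PORT=8080
WEB_PASSWORD=your_secure_password_here
```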
Once running, you can interact with ASR-7 through:

1. **Web Interface** (`http://localhost:8080`)
   - Chat interface with real-time responses
   - View current cognitive state, emotions, and body posture
   - Monitor system metrics and logs

2. **Voice Interaction** (if configured)
   - Enable STT in the web interface
   - Speak directly to ASR-7
   - Hear responses via TTS

3. **Discord Bot** (if configured)
   - Interact from Discord servers
   - Use commands and conversational queries
**Memory Management:**

```
# Access via web interface or API
POST /api/memory/core
{
  "memory": "User prefers direct communication",
  "importance": 8
}
```

**Custom Behaviors:**
- Add new behaviors in `src/behavioral_layer.py`
- Implement behavior utilities and execution logic
- Behaviors are automatically selected based on cognitive state
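A behavior is essentially a utility function plus an execute step. A minimal sketch of that pattern — the class and method names are illustrative, not the actual `behavioral_layer.py` API:

```python
from dataclasses import dataclass


@dataclass
class CognitiveState:
    goal: str        # e.g. "greet_visitor"
    emotion: str     # e.g. "friendly", "alert"
    urgency: float   # 0.0 (idle) to 1.0 (critical)


class FriendlyBehavior:
    name = "Friendly"

    def utility(self, state: CognitiveState) -> float:
        # Score how well this behavior fits the current cognitive state.
        base = 0.8 if state.emotion == "friendly" else 0.1
        return base * (1.0 - state.urgency)  # friendliness yields to urgency

    def execute(self, state: CognitiveState) -> str:
        return "wave_gesture"


class IntimidateBehavior:
    name = "Intimidate"

    def utility(self, state: CognitiveState) -> float:
        base = 0.9 if state.emotion == "alert" else 0.05
        return base * state.urgency

    def execute(self, state: CognitiveState) -> str:
        return "laser_charge_posture"


def arbitrate(behaviors, state):
    # Utility-based selection: the highest-scoring behavior wins.
    return max(behaviors, key=lambda b: b.utility(state))


state = CognitiveState(goal="greet_visitor", emotion="friendly", urgency=0.2)
chosen = arbitrate([FriendlyBehavior(), IntimidateBehavior()], state)
```

Because each behavior scores itself against the same `CognitiveState`, new behaviors can be added without touching the arbiter.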
**Hardware Integration:**

- Configure servo mappings in `src/motion_controller.py`
- Connect ESP32/Arduino via serial
- Physical body movements mirror virtual body state
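A servo mapping boils down to a table from named postures to per-joint angles. A hedged sketch — the posture names, joint names, and angles below are invented for illustration; the real table in `src/motion_controller.py` will differ:

```python
# Hypothetical posture-to-servo table; not the project's actual values.
POSTURES = {
    "neutral":    {"head_pan": 90, "head_tilt": 90, "left_arm": 45},
    "intimidate": {"head_pan": 90, "head_tilt": 70, "left_arm": 170},
}


def posture_to_commands(posture: str) -> list[str]:
    """Translate a virtual-body posture into per-servo command strings."""
    angles = POSTURES[posture]
    return [f"SERVO,{joint},{angle}" for joint, angle in angles.items()]
```

Driving the physical body from the same table the virtual body uses is what keeps the two in sync.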
ASR-7 uses a layered architecture inspired by robotics and cognitive science:
┌─────────────────────────────────────────────────────────┐
│ COGNITIVE LAYER │
│ • LLM reasoning (Ollama/Gemini/OpenRouter) │
│ • Outputs: CognitiveState (Goal, Emotion, Urgency) │
│  • No hardware knowledge, only intentions               │
└─────────────────────┬───────────────────────────────────┘
│
┌─────────────────────▼───────────────────────────────────┐
│ BEHAVIORAL LAYER │
│ • Behavior Arbiter (utility-based selection) │
│ • Behaviors: Intimidate, Friendly, Patrol, Curious │
│ • Selects best action based on cognitive state │
└─────────────────────┬───────────────────────────────────┘
│
┌─────────────────────▼───────────────────────────────────┐
│ VIRTUAL BODY / WORLD MODEL │
│ • Body state: posture, hands, LEDs, luminance │
│ • Environment: entities, threats, spatial awareness │
│ • Self-model: physical and emotional state │
└─────────────────────┬───────────────────────────────────┘
│
┌─────────────────────▼───────────────────────────────────┐
│ MOTION LAYER │
│ • Translates virtual states → hardware commands │
│ • Servo angles, LED PWM, physical movements │
│ • Hardware abstraction layer │
└─────────────────────────────────────────────────────────┘
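The flow through the four layers can be followed end to end. An illustrative stub — every function and value here is an assumption standing in for the real modules:

```python
# Illustrative end-to-end flow; real class and function names differ.

def cognitive_layer(user_input: str) -> dict:
    # In the real system this is an LLM call producing a CognitiveState.
    return {"goal": "investigate", "emotion": "curious", "urgency": 0.4}


def behavioral_layer(state: dict) -> str:
    # The real arbiter scores all behaviors; a simple threshold here.
    return "Patrol" if state["urgency"] < 0.5 else "Intimidate"


def virtual_body(behavior: str) -> dict:
    # Behavior updates the virtual body state before any hardware is touched.
    return {"posture": "scanning" if behavior == "Patrol" else "combat",
            "led": "amber"}


def motion_layer(body: dict) -> list[str]:
    # Only the motion layer knows about hardware commands.
    return [f"POSTURE:{body['posture']}", f"LED:{body['led']}"]


commands = motion_layer(virtual_body(behavioral_layer(
    cognitive_layer("What was that noise?"))))
```

Note how the cognitive layer never emits hardware commands: each layer only consumes the layer above it, which is the point of the architecture.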
| Module | Purpose |
|---|---|
| `main.py` | Flask server, web interface, main loop |
| `src/cognitive_layer.py` | LLM interface, intent reasoning, CognitiveState |
| `src/behavioral_layer.py` | Behavior selection, utility arbiter, action execution |
| `src/virtual_body.py` | Virtual body state, postures, world model |
| `src/motion_controller.py` | Hardware translation, servo/LED control |
| `src/voicemanager.py` | TTS/audio generation and playback |
| `src/stt_manager.py` | Speech-to-text (Mistral Voxtral) |
| `src/vision_system.py` | Object detection, face tracking |
| `src/agent_tools.py` | Tool system (email, git, web search, etc.) |
| `src/monitoring_service.py` | Performance metrics, system monitoring |
For detailed implementation notes, see docs/ARCHITECTURE.md.
AssaultronProject/
├── src/
│ ├── cognitive_layer.py # LLM & intent reasoning
│ ├── behavioral_layer.py # Behavior selection & arbiter
│ ├── virtual_body.py # Virtual body state
│ ├── motion_controller.py # Hardware translation
│ ├── voicemanager.py # TTS engine
│ ├── stt_manager.py # Speech-to-text
│ ├── vision_system.py # Computer vision
│ ├── agent_tools.py # Tool implementations
│ ├── agent_logic.py # Agent orchestration
│ ├── config.py # Configuration & prompts
│ ├── email_manager.py # Email integration
│ ├── git_manager.py # GitHub integration
│ ├── sandbox_manager.py # Safe code execution
│ ├── monitoring_service.py # System monitoring
│ ├── templates/ # Web UI templates
│ └── discord/ # Discord bot integration
├── ai-data/
│ ├── core_memories/ # Persistent memory storage
│ └── context/ # Conversation context
├── Content/
│ └── xVAsynth/ # Voice synthesis models
├── docs/ # Documentation
├── main.py # Application entry point
├── run.py # Quick launcher
├── requirements.txt # Python dependencies
├── .env.example # Environment template
└── LICENSE # MIT License
Contributions are welcome! Whether you want to add new behaviors, improve existing systems, or fix bugs, your help is appreciated.

1. **Fork the repository** on GitHub, then clone your fork

   ```bash
   git clone https://github.com/<your-username>/AssaultronProject.git
   cd AssaultronProject
   ```

2. **Create a feature branch**

   ```bash
   git checkout -b feature/amazing-new-behavior
   ```

3. **Make your changes**
   - Follow existing code style
   - Add comments for complex logic
   - Update documentation if needed

4. **Test your changes**

   ```bash
   python main.py  # Ensure the agent still runs
   # Test your specific feature thoroughly
   ```

5. **Commit your changes**

   ```bash
   git add .
   git commit -m "feat: add amazing new behavior"
   ```

6. **Push and create a Pull Request**

   ```bash
   git push origin feature/amazing-new-behavior
   ```
- Code Quality: Keep code clean, readable, and well-documented
- Modularity: Follow the layered architecture pattern
- Character Consistency: Maintain ASR-7's personality and character
- Testing: Test changes thoroughly before submitting
- Documentation: Update README/docs for significant changes
- 🆕 New Behaviors: Add behaviors to the behavioral layer
- 🎨 UI Improvements: Enhance the web dashboard
- 🔧 Hardware Support: Improve existing hardware integrations or add new ones
- 🧠 LLM Providers: Support additional LLM backends
- 🌍 Localization: Multi-language support
- 📚 Documentation: Improve guides and tutorials
- 🐛 Bug Fixes: Fix issues and improve stability
**LLM Connection Errors**

```bash
# Ensure Ollama is running (if using local)
ollama serve

# Check if the model is pulled
ollama list
ollama pull gemma3:4b
```

**Audio/Voice Issues**

```bash
# Install PyAudio dependencies (Ubuntu/Debian)
sudo apt-get install portaudio19-dev python3-pyaudio

# Windows: install a PyAudio wheel
pip install pipwin
pipwin install pyaudio
```

**Permission Errors**

```bash
# Ensure proper file permissions
chmod +x run.py
```

**Port Already in Use**

Change the Flask port in `main.py` or `.env`; the default is 8080.

This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2026 Evan Escabasse.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction...
- Ollama for local LLM inference
- Google Gemini for advanced reasoning capabilities
- Mistral AI for speech-to-text (Voxtral) & Mistral Large 3 for LLM via OpenRouter
- xVASynth for character voice synthesis
- MediaPipe for face detection
- TensorFlow for object detection
- The open-source community for invaluable tools and libraries
- Issues: GitHub Issues
- Discussions: GitHub Discussions
"Keep the personality intact. ASR-7 is not just a bot; it's a character."