An AI-powered web automation interface that allows you to control browser interactions through natural language tasks.
- AI-Powered Browser Automation: Uses AI models to understand and execute web tasks
- Web Interface: Clean, modern UI for task submission
- Real-time Browser Console: Visual feedback through an integrated browser console
- FastAPI Backend: High-performance async web framework
- Live Status Updates: Real-time task monitoring
- DevContainer Ready: Pre-configured development environment
- Dual Interface: Web UI and CLI options
The web interface provides a beautiful, user-friendly way to interact with the AI agent:
- File: app_browser.py
- Access: http://localhost:5000
- Features:
  - Modern, responsive UI
  - Real-time status updates
  - Automatic browser console opening
  - Task examples and suggestions
  - Visual feedback
For quick tasks and automation scripts:
- File: app_cli.py
- Usage: Direct task execution
- Features:
  - Faster startup
  - Script-friendly
  - Command-line argument support
  - Well suited to automation
This project includes a complete DevContainer setup with everything pre-configured:
- Python 3.12 environment
- All dependencies pre-installed
- Browser automation tools (Playwright, Chromium)
- Desktop environment with VNC access
- Azure CLI integration
Requirements:
- Visual Studio Code
- Docker Desktop
- Dev Containers extension
Setup:
- Clone this repository
- Open in VS Code
- Click "Reopen in Container" when prompted
- Wait for the container to build (first time only)
- You're ready to go!
If you prefer to run locally without Docker:
Requirements:
- Python 3.8 or higher
- pip package manager
- Node.js (for Playwright browser automation)
- Git
Installation steps:
- Install Python dependencies:
      pip install -r requirements.txt
- Install Playwright browsers:
      playwright install-deps
      playwright install chromium
- Set up environment:
      cp .env.example .env
      # Edit .env with your Azure OpenAI credentials

    ./run.sh

This script will:
- Check for the .env file and create it if needed
- Show you all available options
- Start the application based on your choice
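The first step in the list above (checking for `.env` and creating it if needed) can be sketched in Python as follows. This is only an illustration of the idea, not the contents of `run.sh`; the function name `ensure_env_file` is hypothetical, and the sketch assumes `.env.example` sits next to `.env`, as the `cp .env.example .env` command suggests.

```python
import shutil
from pathlib import Path


def ensure_env_file(project_dir="."):
    """Create .env from .env.example if .env is missing (hypothetical helper).

    Returns True if a new .env was created, False if one already existed.
    Raises FileNotFoundError if neither file is present.
    """
    root = Path(project_dir)
    env_path = root / ".env"
    example_path = root / ".env.example"

    if env_path.exists():
        # Never overwrite an existing .env: it may hold real credentials.
        return False
    if not example_path.exists():
        raise FileNotFoundError(f"{example_path} not found; cannot create .env")
    shutil.copy(example_path, env_path)
    return True
```

A freshly created `.env` still contains placeholder values, so it needs to be edited before the application can authenticate.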
- Start the web application:
      python3 app_browser.py
- Open your browser and navigate to http://localhost:5000
- Enter a task and watch the AI navigate for you!
- Run with interactive prompt:
      python3 app_cli.py
- Or pass the task as an argument:
      python3 app_cli.py "Search for Python jobs on LinkedIn"
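The two invocation styles above (task as an argument, or an interactive prompt when none is given) are a common CLI pattern. A minimal sketch with `argparse` might look like this; it is not taken from `app_cli.py`, and the function name `resolve_task` and the prompt text are assumptions made for illustration.

```python
import argparse


def resolve_task(argv, prompt=input):
    """Return the task from argv if present, otherwise ask for it interactively.

    `prompt` is injectable so the function can be exercised without a terminal.
    """
    parser = argparse.ArgumentParser(description="Run one browser task")
    # nargs="?" makes the positional argument optional.
    parser.add_argument("task", nargs="?", help="task in natural language")
    args = parser.parse_args(argv)

    task = args.task if args.task else prompt("Enter a task: ")
    task = task.strip()
    if not task:
        raise SystemExit("No task provided")
    return task
```

In a script this would typically be called as `resolve_task(sys.argv[1:])`, with the returned string handed to the agent.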
Create a .env file with your Azure OpenAI credentials:

    AZURE_OPENAI_DEPLOYMENT=your-deployment-name
    AZURE_OPENAI_KEY=your-api-key

How to get Azure OpenAI credentials:
- Go to Azure Portal
- Create an Azure OpenAI resource
- Deploy a model (e.g., GPT-4o)
- Get your API key and deployment name from the resource
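Before starting the app, it can help to confirm that `.env` actually contains both variables and that the placeholder values have been replaced. The following standard-library sketch does that; the helper names are hypothetical, and it assumes simple `KEY=VALUE` lines without inline comments. The project itself may load `.env` differently.

```python
from pathlib import Path

REQUIRED_KEYS = ("AZURE_OPENAI_DEPLOYMENT", "AZURE_OPENAI_KEY")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return required keys that are absent, empty, or still placeholders.

    Values starting with "your-" are treated as the unedited examples
    (your-deployment-name, your-api-key).
    """
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("your-")
    ]


def check_env_file(path=".env"):
    """Read a .env file and report which required keys still need values."""
    return missing_keys(parse_env_text(Path(path).read_text()))
```

An empty list from `check_env_file()` means both variables are set; otherwise the list names what still needs to be filled in.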
Here are some tasks you can try with the AI agent:
- "Compare the pricing of GPT-4o and DeepSeek-V3 on their official websites"
- "Find and compare the features of the top 3 project management tools"
- "Research the latest trends in artificial intelligence"
- "Search for the latest AI news on TechCrunch"
- "Find recent articles about sustainable energy on BBC News"
- "Get the latest cryptocurrency market updates"
- "Find the current weather in New York and London"
- "Check the weather forecast for this weekend in Paris"
- "Find tourist attractions in Tokyo"
- "Search for Python developer job openings on LinkedIn"
- "Find remote software engineering positions"
- "Look for data science internships in California"
- "Compare laptop prices on different e-commerce sites"
- "Find the best-rated smartphones under $500"
- "Search for eco-friendly cleaning products"
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Web Browser   │    │   FastAPI App   │    │  Azure OpenAI   │
│ (localhost:5000)│◄──►│ (app_browser.py)│◄──►│     Service     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐    ┌─────────────────┐
                       │   Browser-Use   │◄──►│ Browser Console │
                       │     Library     │    │ (localhost:6080)│
                       └─────────────────┘    └─────────────────┘
- Frontend: Modern HTML/CSS/JavaScript interface with responsive design
- Backend: FastAPI with async support for high performance
- AI Engine: Azure OpenAI integration via browser-use library
- Browser Control: Automated browser interactions with Playwright
- Visual Feedback: Real-time browser console with VNC access
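To make the flow between these components more concrete, here is a small standard-library sketch of how an async backend can accept a task, run it in the background, and expose its status for polling. It is an illustration under stated assumptions, not the code in `app_browser.py`: the names `tasks`, `submit`, `run_task`, and `fake_agent` are hypothetical, and `fake_agent` merely stands in for the real agent run.

```python
import asyncio
import uuid

# In-memory registry: task_id -> {"task", "status", "result"}
tasks = {}


async def fake_agent(task):
    """Placeholder for the real AI agent run (hypothetical)."""
    await asyncio.sleep(0.01)
    return f"finished: {task}"


async def run_task(task_id, agent):
    """Run the agent and record each status transition."""
    tasks[task_id]["status"] = "running"
    try:
        tasks[task_id]["result"] = await agent(tasks[task_id]["task"])
        tasks[task_id]["status"] = "completed"
    except Exception as exc:
        tasks[task_id]["result"] = str(exc)
        tasks[task_id]["status"] = "failed"


async def submit(task, agent=fake_agent):
    """Register a task and schedule it without waiting for it to finish."""
    task_id = uuid.uuid4().hex
    tasks[task_id] = {"task": task, "status": "pending", "result": None}
    handle = asyncio.create_task(run_task(task_id, agent))
    return task_id, handle


async def demo():
    task_id, handle = await submit("Find tourist attractions in Tokyo")
    first = tasks[task_id]["status"]  # read before the background task has run
    await handle
    return first, tasks[task_id]["status"], tasks[task_id]["result"]
```

In a web backend, `submit` would sit behind a POST endpoint and the registry lookup behind a status endpoint that the frontend polls; that mapping is an assumption about a typical design rather than a description of this project.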
The project is optimized for DevContainer development:
- Auto-reload: Changes are automatically detected and applied
- Pre-configured: Everything set up and ready to use
- Isolated: Clean environment without affecting your host system
- Portable: Same environment for all developers
If developing locally:
    # Start with auto-reload
    python3 app_browser.py
    # The server will restart automatically when you make changes

The browser console is available at http://localhost:6080 and provides:
- Real-time visual feedback of AI actions
- Desktop environment accessible via web browser
- VNC connection for advanced debugging
- Automatic opening when tasks are submitted
1. ModuleNotFoundError: No module named 'fastapi'

    # Solution: Install dependencies
    pip install -r requirements.txt

2. Azure OpenAI Authentication Error

    # Solution: Check your .env file
    cp .env.example .env
    # Edit .env with correct credentials

3. Browser Console Not Opening
- Check if port 6080 is available
- Ensure Docker Desktop is running (for DevContainer)
- Try manually opening http://localhost:6080

4. Port 5000 Already in Use

    # Solution: Kill existing processes
    pkill -f "python3 app_browser.py"
    # Or use a different port in app_browser.py

Container won't start:
- Ensure Docker Desktop is running
- Check available disk space
- Try rebuilding: Ctrl+Shift+P → "Dev Containers: Rebuild Container"
Performance issues:
- Allocate more resources to Docker Desktop
- Close unnecessary applications
- Use local installation if DevContainer is too slow
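Two of the troubleshooting items above come down to whether something is already listening on port 5000 (web app) or port 6080 (browser console). A quick way to check from Python is sketched below; the helper name `port_in_use` is hypothetical, and it only tests TCP connectivity on the local machine.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(5000)` returning True before the web app is started suggests another process holds the port, while `port_in_use(6080)` returning False suggests the browser console is not running.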
- Visual Studio Code
- Docker Desktop (4GB RAM minimum)
- Dev Containers Extension
- Python 3.8+ (3.12 recommended)
- Node.js 16+ (for Playwright)
- Git
- 4GB RAM minimum
- Azure OpenAI account and API key
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Use the DevContainer for consistent environment
- Make your changes
- Test both web and CLI interfaces
- Submit a pull request
This project is open source and available under the MIT License.
- Documentation: Check this README for detailed instructions
- Issues: Report bugs or request features via GitHub Issues
- Discussions: Share ideas and ask questions in GitHub Discussions
Kiko de Angel
Made with ❤️ for the AI automation community



