A Kedro-based AI browser automation project that generates synthetic data through intelligent web interactions. This project combines AI agents with web browser automation to create realistic test data and perform automated tasks on websites.
The Synthetic Data Agent is designed to:
- Automate web browser interactions using AI-powered decision making
- Generate synthetic test data through real web workflows
- Record and replay browser sessions for testing purposes
- Provide a web-based dashboard for managing agent tasks and recordings
- Support both AI-driven (LLM) and script-based automation modes
```
synthetic-data-agent/
├── conf/                           # Configuration files
│   ├── base/
│   │   ├── catalog.yml             # Data catalog definitions
│   │   └── parameters.yml          # Default parameters
│   └── local/
│       ├── credentials.yml         # API credentials (not in git)
│       └── README.md               # Configuration instructions
├── data/                           # Data layers following Kedro conventions
│   ├── 01_raw/
│   │   └── test_scripts/           # Input test scripts (JSON format)
│   └── 08_reporting/
│       ├── recordings/             # Browser session recordings (.vbrec files)
│       └── metadata/               # Recording metadata (JSON files)
├── src/synthetic_data_agent/       # Source code
│   ├── __init__.py
│   ├── __main__.py                 # CLI entry point
│   ├── pipelines/                  # Kedro pipelines
│   │   └── data_generation/        # Main pipeline
│   │       ├── __init__.py
│   │       ├── pipeline.py         # Pipeline definition
│   │       ├── nodes.py            # Pipeline nodes
│   │       ├── browser_agent.py    # AI browser agent implementation
│   │       ├── browser_analyzer.py # Session analysis
│   │       ├── browser_recorder.py # Session recording
│   │       └── browser_replayer.py # Session replay functionality
│   └── settings.py                 # Kedro settings
├── templates/                      # HTML templates
│   ├── index.html                  # Dashboard HTML
│   └── replayer.html               # Replay viewer HTML
├── main.py                         # FastAPI web server
├── pyproject.toml                  # Project configuration
├── requirements.txt                # Python dependencies
├── .env                            # Environment variables (not in git)
├── .gitignore                      # Git ignore rules
└── README.md                       # This file
```
- Python 3.9 or higher
- pip for package management
- Chrome/Chromium browser (for Playwright)
- Azure OpenAI API access (for AI-driven mode)
Navigate to the project directory:

```bash
cd /path/to/synthetic-data-agent
```

Install the project in development mode:

```bash
pip install -e .
```

⚠️ Important: This step is critical for proper imports to work.

Install additional dependencies (optional):

```bash
pip install -r requirements.txt
```

⚠️ Important: This step can be skipped unless a local requirements.txt file is present.

Install Playwright browsers:

```bash
npx playwright install
```
Set up environment variables: create a `.env` file in the project root:

```bash
# Disable Kedro telemetry
KEDRO_DISABLE_TELEMETRY=true
DO_NOT_TRACK=1

# Azure OpenAI credentials
AZURE_OPENAI_API_KEY=your_actual_api_key_here
AZURE_OPENAI_RESOURCE_NAME=your_azure_resource_name
AZURE_OPENAI_DEPLOYMENT_NAME=your_model_deployment_name
AZURE_OPENAI_API_VERSION=your_azure_api_version
```
Configure API credentials: create `conf/local/credentials.yml`:

```yaml
azure_openai:
  api_key: ${oc.env:AZURE_OPENAI_API_KEY}
  resource_name: ${oc.env:AZURE_OPENAI_RESOURCE_NAME}
  deployment_name: ${oc.env:AZURE_OPENAI_DEPLOYMENT_NAME}
  api_version: ${oc.env:AZURE_OPENAI_API_VERSION}
```
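For reference, the `.env` file must use plain, unquoted `KEY=VALUE` lines for the interpolation above to resolve. A minimal sketch of how such a file is parsed (the project presumably relies on python-dotenv; this stdlib-only loader is purely illustrative):

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: KEY=VALUE lines, '#' comments and blanks ignored.
    Illustrative only -- not the project's actual loading mechanism."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Values should NOT be wrapped in quotes (see Troubleshooting)
            values[key.strip()] = value.strip()
    os.environ.update(values)
    return values
```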
Start the FastAPI web server:

```bash
python main.py
```

Then open http://localhost:8000 in your browser to access the dashboard, where you can:
- Submit new agent tasks with custom URLs and descriptions
- View and download recordings with metadata
- Replay browser sessions using the integrated viewer
- Browse and manage test scripts
- Monitor agent performance and analysis results
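Tasks can also be submitted to the dashboard's API programmatically. A minimal sketch using only the standard library and the `POST /start-agent` endpoint documented in this README (the response shape is an assumption):

```python
import json
from urllib import request

def start_agent(base_url="http://localhost:8000", **payload):
    """Submit an agent task to the dashboard's REST API.
    Assumes the server is running and returns a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    req = request.Request(
        f"{base_url}/start-agent",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example payload (fields taken from the API section of this README):
task = {
    "task": "Navigate to the login page and sign in",
    "url": "https://example.com",
    "mode": "llm",
    "headless": False,
}
```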
You can also run the server in debug mode:

```bash
# FastAPI debug mode
uvicorn main:app --reload
```

Alternatively, run the Kedro pipeline directly from the command line (ensure Kedro is installed):
Use the default parameters from conf/base/parameters.yml:

```bash
kedro run
```

Override parameters at runtime (note the escaped inner quotes):

```bash
kedro run --params="agent_params.task='Find the \"More information...\" link and click it to open the new page, then indicate you are done. If you cannot find the link, stop and indicate done.',agent_params.url='https://example.com',agent_params.mode='llm'"
```

Run with a custom configuration:

```bash
# First, update conf/base/parameters.yml with your desired task
# Then run:
kedro run
```

Note: kedro run executes the pipeline directly without starting a web server. It uses parameters from the configuration files and is ideal for:
- Automated/batch processing
- CI/CD integration
- Command-line scripting
- Testing with fixed parameters
For interactive development, use Option 1 (Web Dashboard) instead.
Configure agent behavior in conf/base/parameters.yml:

```yaml
agent_params:
  task: "Your automation task description"
  url: "https://target-website.com"
  maxRetries: 15     # Number of retry attempts
  mode: "llm"        # "llm" for AI-driven, "script" for predefined
  headless: false    # true for headless browser operation
  scriptName: null   # filename in test_scripts/ for script mode
```

The data catalog in conf/base/catalog.yml defines data sources and outputs:

- `test_scripts`: input JSON test scripts (PartitionedDataset)
- `agent_recordings`: output browser session recordings as .vbrec files
- `agent_metadata`: recording metadata and analysis results as JSON
- Agent analyzes the current page state
- Sends simplified HTML to Azure OpenAI
- Receives and executes action sequences
- Records all interactions for later analysis
- Follows a JSON script of predefined actions
- Useful for regression testing and consistent workflows
- Scripts are stored in data/01_raw/test_scripts/
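To make script mode concrete, here is a hypothetical sketch of generating such a JSON script. The action vocabulary ("goto", "type", "click") and field names are assumptions for illustration; the real schema is defined by the project's own test scripts and browser_agent.py:

```python
import json
from pathlib import Path

# Hypothetical script -- the actual action schema may differ.
script = {
    "name": "login-smoke-test",
    "actions": [
        {"action": "goto", "url": "https://example.com/login"},
        {"action": "type", "selector": "#username", "text": "demo_user"},
        {"action": "type", "selector": "#password", "text": "demo_pass"},
        {"action": "click", "selector": "button[type=submit]"},
    ],
}

# Script-mode runs pick JSON files up from this directory (see catalog.yml):
out = Path("data/01_raw/test_scripts/login_smoke_test.json")
# out.write_text(json.dumps(script, indent=2))  # uncomment inside the project
print(json.dumps(script, indent=2))
```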
- `AIBrowserAgent`: main orchestrator for browser automation
- `AIAgentBrowserRecorder`: captures DOM events using rrweb
- `AIAgentAnalyzer`: evaluates agent performance using AI
- `AIAgentBrowserReplay`: replays recorded sessions
- Node: `run_browser_agent` executes the agent and returns outputs
- Inputs: parameters and API configuration
- Outputs: session recordings and metadata
- New Action Types: extend the action handlers in `browser_agent.py`
- New Analysis Metrics: modify the analyzer prompts in `browser_analyzer.py`
- New Data Sources: add datasets to `conf/base/catalog.yml`
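Adding a new action type typically amounts to registering a handler for its name. A hypothetical sketch of a dispatch-table pattern (handler names and signatures are assumptions about `browser_agent.py`, not its actual code):

```python
# Hypothetical dispatch-table pattern for action handlers.
ACTION_HANDLERS = {}

def register_action(name):
    """Decorator mapping an action name to its handler function."""
    def wrap(fn):
        ACTION_HANDLERS[name] = fn
        return fn
    return wrap

@register_action("click")
def handle_click(page, step):
    # 'page' would be a Playwright Page object in the real agent
    return f"click {step['selector']}"

@register_action("scroll")  # a new action type slots in the same way
def handle_scroll(page, step):
    return f"scroll to {step.get('y', 0)}"

def execute(page, step):
    """Look up and run the handler for one action step."""
    handler = ACTION_HANDLERS.get(step["action"])
    if handler is None:
        raise ValueError(f"Unknown action: {step['action']}")
    return handler(page, step)
```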
- Input: User provides task description and target URL
- Agent Initialization: Creates browser context and loads AI models
- Task Execution:
  - LLM Mode: AI analyzes page → generates actions → executes → repeats
  - Script Mode: follows the predefined action sequence
- Recording: All DOM events captured via rrweb
- Analysis: AI evaluates performance against original task
- Output:
  - .vbrec file containing the session recording
  - .json file containing metadata and analysis
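Each run therefore produces a paired recording and metadata file. A hypothetical sketch of how such a pair might be written (the file naming convention and metadata fields are assumptions, not the project's actual schema):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def save_run(out_dir, run_id, events, analysis):
    """Write the paired .vbrec / .json outputs for one agent run.
    Field names are illustrative only."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rec_path = out / f"{run_id}.vbrec"
    meta_path = out / f"{run_id}.json"
    # rrweb event streams serialize naturally to JSON
    rec_path.write_text(json.dumps(events))
    meta_path.write_text(json.dumps({
        "run_id": run_id,
        "created": datetime.now(timezone.utc).isoformat(),
        "analysis": analysis,
    }, indent=2))
    return rec_path, meta_path
```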
The FastAPI server provides these REST endpoints:
- `GET /`: dashboard homepage with task submission form
- `POST /start-agent`: start a new agent task:

  ```json
  {
    "task": "Navigate to the login page and sign in",
    "url": "https://example.com",
    "mode": "llm",
    "headless": false
  }
  ```

- `GET /recordings`: list all recordings with pagination and metadata
- `POST /replay`: replay a specific recording in the browser
- `GET /download/(unknown)`: download a recording and its metadata as a ZIP
- `GET /test-scripts`: list available test scripts
- `GET /test-scripts/(unknown)`: retrieve a specific test script
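A read-only endpoint like `GET /recordings` can be queried with just the standard library. A minimal sketch (the JSON response shape is an assumption; the `opener` parameter exists only to make the helper testable without a running server):

```python
import json
from urllib import request

def list_recordings(base_url="http://localhost:8000", opener=request.urlopen):
    """Fetch the recordings index from the dashboard API.
    Response structure is an assumption about the server."""
    with opener(f"{base_url}/recordings") as resp:
        return json.loads(resp.read())
```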
```bash
# Kedro Configuration
KEDRO_DISABLE_TELEMETRY=true
DO_NOT_TRACK=1

# Azure OpenAI Service
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_RESOURCE_NAME=your_resource_name
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4  # or your deployment
AZURE_OPENAI_API_VERSION=2024-08-01-preview
```
ModuleNotFoundError: No module named 'synthetic_data_agent'

```bash
# Solution: Install in development mode
pip install -e . --break-system-packages --ignore-requires-python
```

Playwright browsers not found

```bash
# Solution: Install browser binaries
npx playwright install
```

kedro-datasets not found

```bash
# Solution: Install the datasets package
pip install kedro-datasets
```

Interpolation key 'AZURE_OPENAI_API_KEY' not found
- Check that the `.env` file exists in the project root
- Verify environment variables don't have quotes around values
- Ensure `python-dotenv` is installed
Pipeline input 'credentials:azure_openai' not found
- Verify `conf/local/credentials.yml` exists and has the correct structure
- Check that environment variables are properly loaded
Agent gets stuck or fails
- Try running in non-headless mode: `headless: false`
- Reduce `maxRetries` for faster debugging
- Check the console for errors
Recording playback fails
- Ensure HTML templates are not being ignored by Git
- Verify `templates/replayer.html` exists and is properly formatted
- Check the console for rrweb-player loading errors
- Configuration Guide: see `conf/README.md` for detailed setup instructions
- API Reference: all endpoints are documented with OpenAPI at `/docs`
- Data Catalog: detailed dataset definitions in `conf/base/catalog.yml`
[Add your license information here]
For issues and questions:
- Check troubleshooting section above for common problems
- Review configuration files in the `conf/` directory
- Open an issue in the repository with error logs and steps to reproduce
- Environment variables in `.env` should not have quotes
- HTML templates need Git ignore exceptions to be tracked
- Agent requires internet access for AI API calls
Note: This project is actively developed. An editable install (`pip install -e .`) picks up ordinary source edits automatically, but re-run `pip install -e . --break-system-packages` after changes to the project configuration (e.g. `pyproject.toml`) to ensure imports keep working correctly.