Synthetic Data Agent - Kedro Project

A Kedro-based AI browser automation project that generates synthetic data through intelligent web interactions. This project combines AI agents with web browser automation to create realistic test data and perform automated tasks on websites.

🎯 Project Overview

The Synthetic Data Agent is designed to:

  • Automate web browser interactions using AI-powered decision making
  • Generate synthetic test data through real web workflows
  • Record and replay browser sessions for testing purposes
  • Provide a web-based dashboard for managing agent tasks and recordings
  • Support both AI-driven (LLM) and script-based automation modes

๐Ÿ—๏ธ Project Structure

synthetic-data-agent/
├── conf/                          # Configuration files
│   ├── base/
│   │   ├── catalog.yml           # Data catalog definitions
│   │   └── parameters.yml        # Default parameters
│   └── local/
│       ├── credentials.yml       # API credentials (not in git)
│       └── README.md             # Configuration instructions
├── data/                         # Data layers following Kedro conventions
│   ├── 01_raw/
│   │   └── test_scripts/         # Input test scripts (JSON format)
│   └── 08_reporting/
│       ├── recordings/           # Browser session recordings (.vbrec files)
│       └── metadata/             # Recording metadata (JSON files)
├── src/synthetic_data_agent/     # Source code
│   ├── __init__.py
│   ├── __main__.py              # CLI entry point
│   ├── pipelines/               # Kedro pipelines
│   │   └── data_generation/     # Main pipeline
│   │       ├── __init__.py
│   │       ├── pipeline.py      # Pipeline definition
│   │       ├── nodes.py         # Pipeline nodes
│   │       ├── browser_agent.py # AI browser agent implementation
│   │       ├── browser_analyzer.py # Session analysis
│   │       ├── browser_recorder.py # Session recording
│   │       └── browser_replayer.py # Session replay functionality
│   └── settings.py              # Kedro settings
├── templates/                   # HTML templates
│   ├── index.html              # Dashboard HTML
│   └── replayer.html           # Replay viewer HTML
├── main.py                     # FastAPI web server
├── pyproject.toml             # Project configuration
├── requirements.txt           # Python dependencies
├── .env                       # Environment variables (not in git)
├── .gitignore                 # Git ignore rules
└── README.md                  # This file

🚀 Quick Start

Prerequisites

  • Python 3.9 or higher
  • pip for package management
  • Chrome/Chromium browser (for Playwright)
  • Azure OpenAI API access (for AI-driven mode)

Installation

  1. Navigate to the project directory:

    cd /path/to/synthetic-data-agent
  2. Install the project in development mode:

    pip install -e .

    โš ๏ธ Important: This step is critical for proper imports to work.

  3. Install additional dependencies (optional):

    pip install -r requirements.txt

    ⚠️ Important: Skip this step if the project has no local requirements.txt file.

  4. Install Playwright browsers:

    playwright install
  5. Set up environment variables: Create a .env file in the project root:

    # Disable Kedro telemetry
    KEDRO_DISABLE_TELEMETRY=true
    DO_NOT_TRACK=1
    
    # Azure OpenAI credentials
    AZURE_OPENAI_API_KEY=your_actual_api_key_here
    AZURE_OPENAI_RESOURCE_NAME=your_azure_resource_name
    AZURE_OPENAI_DEPLOYMENT_NAME=your_model_deployment_name
    AZURE_OPENAI_API_VERSION=your_azure_api_version
  6. Configure API credentials: Create conf/local/credentials.yml:

    azure_openai:
      api_key: ${oc.env:AZURE_OPENAI_API_KEY}
      resource_name: ${oc.env:AZURE_OPENAI_RESOURCE_NAME}
      deployment_name: ${oc.env:AZURE_OPENAI_DEPLOYMENT_NAME}
      api_version: ${oc.env:AZURE_OPENAI_API_VERSION}
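Kedro resolves the ${oc.env:...} references at run time, so the .env values must be present in the process environment. One minimal way to guarantee that, assuming python-dotenv is installed (src/synthetic_data_agent/settings.py may already do something equivalent):

# settings.py -- load .env into the environment before Kedro's
# OmegaConfigLoader resolves ${oc.env:...} references in credentials.yml.
from pathlib import Path
from dotenv import load_dotenv

# Walk up from src/synthetic_data_agent/settings.py to the project root.
load_dotenv(Path(__file__).resolve().parents[2] / ".env")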

🎮 Usage

Option 1: Web Dashboard (Recommended)

Start the FastAPI web server:

python main.py

Then open http://localhost:8000 in your browser to access the dashboard where you can:

  • Submit new agent tasks with custom URLs and descriptions
  • View and download recordings with metadata
  • Replay browser sessions using the integrated viewer
  • Browse and manage test scripts
  • Monitor agent performance and analysis results

Option 2: Debug Mode

You can also run the web server in debug mode with auto-reload (Kedro must still be installed):

# FastAPI debug mode
uvicorn main:app --reload

Option 3: Direct Pipeline Execution

Run the Kedro pipeline directly using command line:

Use default parameters from conf/base/parameters.yml:

kedro run

Override parameters at runtime:

kedro run --params="agent_params.url=https://example.com,agent_params.mode=llm,agent_params.task=Find the 'More information...' link and click it then indicate you are done"

Note: kedro splits --params on commas, so keep the task value comma-free on the command line; longer task descriptions belong in conf/base/parameters.yml.

Run with custom configuration:

# First, update conf/base/parameters.yml with your desired task
# Then run:
kedro run

Note: kedro run executes the pipeline directly without starting a web server. It uses parameters from the configuration files and is ideal for:

  • Automated/batch processing
  • CI/CD integration
  • Command-line scripting
  • Testing with fixed parameters

For interactive development, use Option 1 (Web Dashboard) instead.
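If you need to trigger runs from Python rather than the CLI (for example from a scheduler or test harness), here is a minimal sketch using Kedro's session API, with parameter names taken from this README:

from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_path = Path.cwd()  # run from the project root
bootstrap_project(project_path)

overrides = {"agent_params": {"url": "https://example.com", "mode": "llm"}}
with KedroSession.create(project_path=project_path, extra_params=overrides) as session:
    session.run()  # same effect as `kedro run`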

🔧 Configuration

Agent Parameters

Configure agent behavior in conf/base/parameters.yml:

agent_params:
  task: "Your automation task description"
  url: "https://target-website.com"
  maxRetries: 15              # Number of retry attempts
  mode: "llm"                 # "llm" for AI-driven, "script" for predefined
  headless: false             # true for headless browser operation
  scriptName: null            # filename in test_scripts/ for script mode

Data Catalog Configuration

The data catalog in conf/base/catalog.yml defines data sources and outputs:

  • test_scripts: Input JSON test scripts (PartitionedDataset)
  • agent_recordings: Output browser session recordings as .vbrec files
  • agent_metadata: Recording metadata and analysis results as JSON
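The authoritative definitions live in conf/base/catalog.yml; the sketch below shows one plausible shape, where the dataset types are assumptions based on the file formats described above:

test_scripts:
  type: partitions.PartitionedDataset
  path: data/01_raw/test_scripts
  dataset:
    type: json.JSONDataset

agent_recordings:
  type: partitions.PartitionedDataset
  path: data/08_reporting/recordings
  dataset:
    type: text.TextDataset
  filename_suffix: ".vbrec"

agent_metadata:
  type: partitions.PartitionedDataset
  path: data/08_reporting/metadata
  dataset:
    type: json.JSONDataset
  filename_suffix: ".json"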

Operating Modes

1. LLM Mode (AI-Driven)

  • Agent analyzes the current page state
  • Sends simplified HTML to Azure OpenAI
  • Receives and executes action sequences
  • Records all interactions for later analysis
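Under the hood this loop amounts to a chat-completions call against the configured Azure OpenAI deployment. A minimal sketch, where the prompt text and placeholder values are illustrative rather than the project's actual prompts:

from openai import AzureOpenAI

# Placeholders for illustration; real values come from credentials.yml.
task = "Find the 'More information...' link and click it"
simplified_html = "<html>...</html>"  # stripped-down snapshot of the current page

client = AzureOpenAI(
    api_key="your_api_key",
    api_version="2024-08-01-preview",
    azure_endpoint="https://your_resource_name.openai.azure.com",
)

response = client.chat.completions.create(
    model="your_deployment_name",  # Azure expects the deployment name here
    messages=[
        {"role": "system",
         "content": "You are a browser automation agent. Reply with a JSON list of actions."},
        {"role": "user",
         "content": f"Task: {task}\n\nPage HTML (simplified):\n{simplified_html}"},
    ],
)
actions_text = response.choices[0].message.content  # parsed and executed by the agent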

2. Script Mode (Predefined Actions)

  • Follows a JSON script of predefined actions
  • Useful for regression testing and consistent workflows
  • Scripts are stored in data/01_raw/test_scripts/
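The action schema is defined by browser_agent.py; purely as an illustration, a script file in data/01_raw/test_scripts/ might look like:

{
  "name": "example_flow",
  "actions": [
    {"type": "goto", "url": "https://example.com/login"},
    {"type": "fill", "selector": "#username", "value": "test_user"},
    {"type": "fill", "selector": "#password", "value": "not_a_real_password"},
    {"type": "click", "selector": "button[type=submit]"}
  ]
}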

🧪 Development

Key Components

Core Classes

  • AIBrowserAgent: Main orchestrator for browser automation
  • AIAgentBrowserRecorder: Captures DOM events using rrweb
  • AIAgentAnalyzer: Evaluates agent performance using AI
  • AIAgentBrowserReplay: Replays recorded sessions

Pipeline Architecture

  • Node: run_browser_agent - Executes the agent and returns outputs
  • Inputs: Parameters and API configuration
  • Outputs: Session recordings and metadata
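In Kedro terms that maps to a one-node pipeline. A sketch of what pipeline.py plausibly contains, with input and output names inferred from the catalog and troubleshooting sections of this README:

from kedro.pipeline import Pipeline, node

from .nodes import run_browser_agent

def create_pipeline(**kwargs) -> Pipeline:
    return Pipeline(
        [
            node(
                func=run_browser_agent,
                inputs=["params:agent_params", "credentials:azure_openai"],
                outputs=["agent_recordings", "agent_metadata"],
                name="run_browser_agent",
            )
        ]
    )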

Adding New Features

  1. New Action Types: Extend the action handlers in browser_agent.py (see the sketch after this list)
  2. New Analysis Metrics: Modify the analyzer prompts in browser_analyzer.py
  3. New Data Sources: Add datasets to conf/base/catalog.yml
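As an illustration of the first point, a new action type might be wired in roughly like this; the handler names and dispatch dict below are hypothetical, so check how browser_agent.py actually routes actions:

# Hypothetical dispatch table for agent actions (Playwright async API);
# browser_agent.py's real routing may differ.
async def handle_click(page, action):
    await page.click(action["selector"])

async def handle_scroll(page, action):
    # New action type: scroll by a pixel delta (default 600).
    await page.mouse.wheel(0, action.get("delta_y", 600))

ACTION_HANDLERS = {
    "click": handle_click,
    "scroll": handle_scroll,
}

async def execute_action(page, action):
    await ACTION_HANDLERS[action["type"]](page, action)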

📊 Data Flow

  1. Input: User provides task description and target URL
  2. Agent Initialization: Creates browser context and loads AI models
  3. Task Execution:
    • LLM Mode: AI analyzes page → generates actions → executes → repeats
    • Script Mode: Follows predefined action sequence
  4. Recording: All DOM events captured via rrweb
  5. Analysis: AI evaluates performance against original task
  6. Output:
    • .vbrec file containing session recording
    • .json file containing metadata and analysis

🔌 API Endpoints

The FastAPI server provides these REST endpoints:

  • GET /: Dashboard homepage with task submission form
  • POST /start-agent: Start a new agent task
    {
      "task": "Navigate to the login page and sign in",
      "url": "https://example.com",
      "mode": "llm",
      "headless": false
    }
  • GET /recordings: List all recordings with pagination and metadata
  • POST /replay: Replay a specific recording in browser
  • GET /download/{filename}: Download recording and metadata as ZIP
  • GET /test-scripts: List available test scripts
  • GET /test-scripts/{filename}: Retrieve specific test script
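For example, starting a task from Python with the documented request body (the response schema is not documented here, so treat the final print as illustrative):

import requests

resp = requests.post(
    "http://localhost:8000/start-agent",
    json={
        "task": "Navigate to the login page and sign in",
        "url": "https://example.com",
        "mode": "llm",
        "headless": False,
    },
)
resp.raise_for_status()
print(resp.json())  # response shape depends on the server implementation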

๐Ÿณ Environment Variables Reference

# Kedro Configuration
KEDRO_DISABLE_TELEMETRY=true
DO_NOT_TRACK=1

# Azure OpenAI Service
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_RESOURCE_NAME=your_resource_name
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4  # or your deployment
AZURE_OPENAI_API_VERSION=2024-08-01-preview

🚨 Troubleshooting

Installation Issues

ModuleNotFoundError: No module named 'synthetic_data_agent'

# Solution: Install in development mode
pip install -e .
# On externally managed Python installs (e.g. Debian/Ubuntu system Python),
# you may additionally need --break-system-packages.

Playwright browsers not found

# Solution: Install browser binaries
playwright install

kedro-datasets not found

# Solution: Install the datasets package
pip install kedro-datasets

Configuration Issues

Interpolation key 'AZURE_OPENAI_API_KEY' not found

  • Check that .env file exists in project root
  • Verify environment variables don't have quotes around values
  • Ensure python-dotenv is installed

Pipeline input 'credentials:azure_openai' not found

  • Verify conf/local/credentials.yml exists and has correct structure
  • Check that environment variables are properly loaded

Runtime Issues

Agent gets stuck or fails

  • Try running in non-headless mode: headless: false
  • Reduce maxRetries for faster debugging
  • Check console for errors

Recording playback fails

  • Ensure HTML templates are not being ignored by Git
  • Verify templates/replayer.html exists and is properly formatted
  • Check console for rrweb-player loading errors

📚 Additional Resources

Project References

  • Configuration Guide: See conf/README.md for detailed setup instructions
  • API Reference: All endpoints documented with OpenAPI at /docs
  • Data Catalog: Detailed dataset definitions in conf/base/catalog.yml

๐Ÿ“ License

[Add your license information here]

🆘 Support

For issues and questions:

  1. Check troubleshooting section above for common problems
  2. Review configuration files in conf/ directory
  3. Enable debug logging for detailed error information
  4. Open an issue in the repository with error logs and steps to reproduce

โš ๏ธ Important Notes:

  • Environment variables in .env should not have quotes
  • HTML templates need Git ignore exceptions to be tracked
  • Agent requires internet access for AI API calls

Note: This project is actively developed. If imports fail after pulling changes (for example to pyproject.toml or the package layout), re-run pip install -e . to refresh the editable install.
