YLuchaninov/DIM-Oracle

# DIM Oracle JS

**DOM Interaction Module "Oracle" for Playwright (JSON Input Version)**

[![License: Restricted](https://img.shields.io/badge/License-Restricted-red.svg)](LICENSE)

## Overview

DIM Oracle JS is a library that analyzes **pre-serialized DOM snapshots (JSON)** and generates detailed interaction plans for browser automation tools such as Playwright. It takes two inputs: a semantic "Task Intent" (what you want to achieve) and a detailed JSON representation of the current DOM state. That representation is produced by a separate serializer script and includes computed styles, visibility, and interactivity flags.

Based on this input, Oracle determines the best target element(s) and constructs a multi-step "Interaction Plan". This plan uses **unique node IDs** (from the input JSON) to reference target elements. The caller (your Playwright script or "Execution Engine") is then responsible for:

1.  Mapping these node IDs back to actual Playwright locators using the `locatorsMap` (also generated by the serializer script).
2.  Executing the steps in the plan using Playwright actions.

The core philosophy remains **"DOM First"**, leveraging the rich information (structure, semantics, *pre-calculated state*) in the serialized DOM to make robust decisions, minimizing reliance on brittle selectors or visual analysis.

**Note:** This version **requires** you to first run a compatible DOM serializer script (such as `domSerializer.js`) within the browser context (via `page.evaluate()` or similar) to generate the `serializedDomRoot` (JSON tree) and `locatorsMap` inputs for `DimOracle`.
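
As a rough illustration of the two inputs, the sketch below models a serializer result and walks the tree to list nodes that are both visible and interactive. Only `nodeId`, `isVisible`, `isInteractive`, `dom`, and `locators` are names taken from this README; the `tag` and `children` fields, the string-array representation of `LocatorsMap`, and the `collectActionableIds` helper are assumptions made for the example and may not match the real serializer output.

```typescript
// Hypothetical shapes: only `nodeId`, `isVisible` and `isInteractive` are
// named in this README; `tag` and `children` are assumptions.
interface SerializedNode {
  nodeId: string;
  tag: string;
  isVisible: boolean;
  isInteractive: boolean;
  children: SerializedNode[];
}

// nodeId -> candidate selector strings (assumed representation).
type LocatorsMap = Record<string, string[]>;

// The serializer is described as returning { dom, locators }.
interface SerializerResult {
  dom: SerializedNode;
  locators: LocatorsMap;
}

// Collect the IDs of nodes that are both visible and interactive,
// in depth-first, pre-order (document) order.
function collectActionableIds(node: SerializedNode, out: string[] = []): string[] {
  if (node.isVisible && node.isInteractive) out.push(node.nodeId);
  for (const child of node.children) collectActionableIds(child, out);
  return out;
}

const sample: SerializerResult = {
  dom: {
    nodeId: "n1", tag: "form", isVisible: true, isInteractive: false,
    children: [
      { nodeId: "n2", tag: "input", isVisible: true, isInteractive: true, children: [] },
      { nodeId: "n3", tag: "button", isVisible: false, isInteractive: true, children: [] },
      { nodeId: "n4", tag: "button", isVisible: true, isInteractive: true, children: [] },
    ],
  },
  locators: { n2: ["#name"], n4: ["button[type=submit]"] },
};

// The hidden button (n3) is skipped; the form itself is not interactive.
const actionableIds = collectActionableIds(sample.dom);
```

Because the state flags are pre-calculated by the serializer, a traversal like this needs no browser access at all.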

## Key Features

*   **Intent-Based Interaction:** Define *what* you want to do (e.g., "click login button", "select country 'Canada'") instead of *how* (e.g., specific CSS selector).
*   **Analyzes Rich JSON DOM:** Works with detailed DOM snapshots including computed state (visibility, interactivity, role, etc.).
*   **Robust Element Identification:** Uses heuristics, context analysis, and scoring on the JSON data to find the best target node ID.
*   **Widget Recognition:** Identifies known widget patterns within the JSON structure using configurable rules.
*   **Fallback Strategies:** Provides reasonable fallback interaction plans when specific widget patterns aren't recognized.
*   **Detailed Interaction Plans (using Node IDs):** Generates multi-step plans targeting elements via their unique IDs from the JSON snapshot. The caller maps these IDs to Playwright locators. Includes support for targeting sub-elements using `subSelector`.
*   **Configurable:** Heuristics and widget patterns can be customized and extended.
*   **TypeScript First:** Provides strong typing, usable in standard JavaScript projects.
*   **Playwright-Agnostic Core:** The core analysis logic no longer directly depends on a Playwright `Page` object.

## Installation

```bash
npm install <path-to-dim-oracle-js-package-or-published-name>
# or
yarn add <path-to-dim-oracle-js-package-or-published-name>

# Ensure you have necessary peer dependencies
npm install playwright jsdom uuid @types/jsdom @types/uuid
# or
yarn add playwright jsdom uuid @types/jsdom @types/uuid
```

Replace `<path-to-dim-oracle-js-package-or-published-name>` with the actual package name if published, or with a local file path, e.g. `file:./path/to/dim-oracle-js`.

You also need the `domSerializer.js` script (or a compatible one) available to run inside the browser.

## Usage

**Core Workflow:**

1.  **Navigate:** Use Playwright to go to the target page (and potentially iframe).
2.  **Serialize DOM:** Execute your `domSerializer.js` script within the target page or frame's context using `page.evaluate()` or `frame.evaluate()`. This script must return an object like `{ dom: serializedDomRoot, locators: locatorsMap }`.
3.  **Instantiate DIM Oracle:** Create an instance: `const dim = new DimOracle(config);`.
4.  **Define Task Intent:** Create a `TaskIntent` object describing the desired action and target.
5.  **Execute Task:** Call `const result = dim.executeTask(taskIntent, serializedDomRoot, locatorsMap);` (this call is synchronous).
6.  **Handle Result:** Check `result.status`. On failure, inspect `result.error`. On success, read `result.interaction_plan`.
7.  **Execute Plan:** Use your custom "Execution Engine" function (like `executeDimPlan` in the examples) to:
    *   Iterate through the `interaction_plan` steps.
    *   For each step requiring an element target (`step.nodeId` exists):
        *   Look up `locatorsMap[step.nodeId]` to get locator strategies.
        *   Choose the best Playwright selector string (e.g., using the `chooseBestSelector` helper).
        *   Create the base Playwright `Locator` using `page.locator(chosenSelector)` or `frameLocator.locator(chosenSelector)`.
        *   If `step.subSelector` exists, chain it: `baseLocator.locator(step.subSelector)`. This becomes the final target locator.
    *   Execute the Playwright action specified in `step.action` on the final target locator (or on the page/frame for page-level actions).
    *   Handle step conditions (`step.condition`), waits, and context storage (`step.store_result_as`).
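
The lookup-and-chain part of step 7 can be sketched as follows. This is a minimal illustration, not the library's executor: the `LocatorStrategy` shape, its `kind` values, the preference order, and the `resolveTarget` helper are all assumptions, and `chooseBestSelector` is only named in this README, so the body shown here is hypothetical. `LocatorLike` is a small structural stand-in so the sketch does not need Playwright installed.

```typescript
// Minimal structural stand-in for Playwright's Page, Frame, FrameLocator and
// Locator: each exposes `locator(selector)`, the only call used here.
interface LocatorLike {
  locator(selector: string): LocatorLike;
}

// Assumed shape of one locator strategy; real `locatorsMap` entries may differ.
interface LocatorStrategy {
  kind: "testId" | "id" | "css" | "xpath";
  selector: string;
}
type LocatorsMap = Record<string, LocatorStrategy[]>;

// Step fields named in the README: action, nodeId, subSelector.
interface InteractionStep {
  action: string;
  nodeId?: string;
  subSelector?: string;
}

// Hypothetical preference order: most stable strategy first.
const PREFERENCE: LocatorStrategy["kind"][] = ["testId", "id", "css", "xpath"];

function chooseBestSelector(strategies: LocatorStrategy[]): string | undefined {
  for (const kind of PREFERENCE) {
    for (const strategy of strategies) {
      if (strategy.kind === kind) return strategy.selector;
    }
  }
  return undefined;
}

// Resolve a step to its final target; page-level steps yield undefined.
function resolveTarget(
  root: LocatorLike,
  step: InteractionStep,
  locatorsMap: LocatorsMap,
): LocatorLike | undefined {
  if (step.nodeId === undefined) return undefined;
  const selector = chooseBestSelector(locatorsMap[step.nodeId] || []);
  if (selector === undefined) {
    throw new Error(`No locator strategy for node ${step.nodeId}`);
  }
  const base = root.locator(selector);
  // Chain the optional sub-selector to reach an element inside the node.
  return step.subSelector ? base.locator(step.subSelector) : base;
}
```

In a real engine, `root` would be the Playwright `Page`, `Frame`, or `FrameLocator` for the target context, and the returned locator would receive the action named in `step.action`.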

**Conceptual Example Snippet:**

```typescript
import { test, expect, Page, Frame } from '@playwright/test';
import { DimOracle, TaskIntent, InteractionResult, InteractionStep, SerializedNode, LocatorsMap } from 'dim-oracle-js'; // Adjust path
import * as fs from 'fs';
// Assume executeDimPlan and chooseBestSelector helpers exist
import { executeDimPlan, chooseBestSelector } from './path/to/executor';

// Assume this file holds your serializer script
const domSerializerContent = fs.readFileSync('./path/to/your/domSerializer.js', 'utf-8');

test('Fill form using DIM Oracle and Serializer', async ({ page }) => {
    await page.goto('your-target-url');
    const targetFrameSelector: string | undefined = '#my-iframe'; // Example if content is in iframe

    // 1. Get Target Context (Page or Frame)
    // page.frame() matches frames by name or URL, not by CSS selector,
    // so resolve the <iframe> element and take its content frame instead.
    let targetContext: Page | Frame = page;
    if (targetFrameSelector) {
        const frameElement = await page.locator(targetFrameSelector).elementHandle();
        const frame = await frameElement?.contentFrame();
        if (!frame) throw new Error("Target context not found");
        targetContext = frame;
    }

    // 2. Serialize the DOM within the target context
    console.log("Serializing DOM...");
    // Inject script into the correct context (Frame or Page)
    await targetContext.addScriptTag({ content: domSerializerContent });
    const serializerResult = await targetContext.evaluate(() => {
        // @ts-ignore -- serializeDOM is defined by the injected serializer script
        return window.serializeDOM(document.documentElement, { compressInvisible: false });
    });
    const { dom: serializedDomRoot, locators: locatorsMap } = serializerResult as { dom: SerializedNode, locators: LocatorsMap };
    console.log("DOM Serialized.");

    // 3. Instantiate DIM Oracle
    const dim = new DimOracle();

    // 4. Define Task Intent
    const nameIntent: TaskIntent = { /* ... */ };

    // 5. Execute Task (Synchronous)
    const result = dim.executeTask(nameIntent, serializedDomRoot, locatorsMap);

    // 6. Handle Result & Execute Plan
    if (result.status === 'success') {
        console.log("DIM Task Successful. Executing plan...");
        // Pass Page, plan, locatorsMap, and optional frameSelector to executor
        await executeDimPlan(page, result.interaction_plan, locatorsMap, targetFrameSelector);
        // Assertions...
    } else {
        console.error('DIM Task Failed:', result.error);
        // test.fail() only marks a test as "expected to fail"; throw to fail it now.
        throw new Error(`DIM Oracle failed: ${result.error.errorCode}`);
    }
});
```

## Core Concepts (JSON Version)

*   **DOM Serializer:** An external JavaScript script run in the browser to produce `serializedDomRoot` and `locatorsMap`. Prerequisite.
*   **`serializedDomRoot`:** The JSON object representing the DOM tree with pre-calculated state flags (`isVisible`, `isInteractive`, etc.) and unique node IDs.
*   **`locatorsMap`:** A dictionary mapping each `nodeId` from `serializedDomRoot` to its potential Playwright locator strategies.
*   **`TaskIntent`:** Describes what needs to be done (action, semantic target, value).
*   **Analysis Pipeline:** Internal synchronous process analyzing the JSON to find the best matching `nodeId`.
*   **Widget Recognition:** Matches patterns within the JSON structure.
*   **`InteractionPlan`:** A list of `InteractionStep` objects.
*   **`InteractionStep.nodeId`:** The unique ID (from the JSON) of the primary element targeted by the step.
*   **`InteractionStep.subSelector`:** An optional Playwright selector string used to target an element within the node identified by `nodeId`.
*   **Execution Engine:** Your code (like `executeDimPlan`) responsible for:
    *   Receiving the plan and `locatorsMap`.
    *   Looking up `nodeId` in `locatorsMap`.
    *   Choosing and creating the Playwright `Locator` (potentially chaining `subSelector`).
    *   Executing the Playwright action from the step.
    *   Handling conditions and context.
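
To make the plan-related concepts above more concrete, here is a minimal sketch of a pre-flight check an Execution Engine might run before touching the browser. The type shapes are assumptions modeled on names that appear in this README (`status`, `error.errorCode`, `interaction_plan`, `nodeId`, `subSelector`). The `"failure"` literal, the action names, and the `planOrThrow` and `findUnmappedNodeIds` helpers are hypothetical and not part of the library.

```typescript
interface InteractionStep {
  action: string;        // action names used below are illustrative
  nodeId?: string;       // absent for page-level steps
  subSelector?: string;
}

// Assumed result shape: a success carries the plan, a failure carries an error.
type InteractionResult =
  | { status: "success"; interaction_plan: InteractionStep[] }
  | { status: "failure"; error: { errorCode: string } };

// Values are opaque here; only the keys (node IDs) matter for this check.
type LocatorsMap = Record<string, unknown>;

// Return the plan, or surface the error code as an exception.
function planOrThrow(result: InteractionResult): InteractionStep[] {
  if (result.status === "success") return result.interaction_plan;
  throw new Error(`DIM Oracle failed: ${result.error.errorCode}`);
}

// List every node ID the plan references that has no entry in locatorsMap.
function findUnmappedNodeIds(plan: InteractionStep[], locatorsMap: LocatorsMap): string[] {
  const missing: string[] = [];
  for (const step of plan) {
    if (step.nodeId === undefined) continue; // page-level step
    const known = Object.prototype.hasOwnProperty.call(locatorsMap, step.nodeId);
    if (!known && missing.indexOf(step.nodeId) === -1) missing.push(step.nodeId);
  }
  return missing;
}

const exampleResult: InteractionResult = {
  status: "success",
  interaction_plan: [
    { action: "click", nodeId: "n7" },
    { action: "fill", nodeId: "n9", subSelector: "input" },
    { action: "waitForLoadState" },
  ],
};

const examplePlan = planOrThrow(exampleResult);
// Only n7 has locator strategies here, so n9 is reported as unmapped.
const unmapped = findUnmappedNodeIds(examplePlan, { n7: ["#open"] });
```

Checking the plan against `locatorsMap` up front gives a single clear error when the snapshot and the plan are out of sync, rather than a failure partway through execution.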

## Running the Examples

The example files (`examples/**/*.test.ts`, `examples/**/*.run.ts`) demonstrate usage with Playwright.

**Prerequisites:**

1.  Node.js and npm/yarn.
2.  **Build the Library:** From the project root:
    ```bash
    npm install
    npm run build
    ```
3.  **Install Playwright Browsers:**
    ```bash
    npx playwright install --with-deps
    ```
4.  **DOM Serializer Script:** Ensure `examples/utils/domSerializer.js` exists (or update paths in examples).

**Executing Tests (`*.test.ts`):**

Use the Playwright test runner (configured via `examples/playwright.config.ts`):

```bash
# Run all tests in examples/
npx playwright test ./examples

# Run a specific test file
npx playwright test ./examples/simple-interaction/login.test.ts

# Run with UI Mode
npx playwright test ./examples --ui

# Run Headed
npx playwright test ./examples --headed
```

**Executing Run Scripts (`*.run.ts`):**

These are standalone scripts demonstrating end-to-end flows.

1.  **Compile (Optional but Recommended):**
    ```bash
    npx tsc ./examples/end-to-end/contact-form.run.ts --outDir ./dist-examples --esModuleInterop --module commonjs --target es2020 --skipLibCheck --resolveJsonModule
    ```
2.  **Run Compiled Script:**
    ```bash
    node ./dist-examples/examples/end-to-end/contact-form.run.js
    ```
3.  **Run Directly with ts-node:**
    ```bash
    npx ts-node ./examples/end-to-end/contact-form.run.ts
    ```
    (Modify `HEADLESS_MODE` and `SLOW_MOTION` in the `.run.ts` file to control visibility and speed.)

## Configuration & Extension

*   **Heuristics:** Default weights are in `src/config/defaults.ts`. Override via `DimOracle` constructor config.
*   **Widget Patterns:** Predefined patterns are in `src/config/knownPatterns.ts`. Recognition functions (`triggerMatcher`, parts finders) must operate on the `SerializedNode` JSON structure. Add your own patterns via constructor config.
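
As an illustration of recognition functions that operate purely on the JSON structure, the sketch below defines a custom pattern for a combobox-like widget. Only `triggerMatcher`, `SerializedNode`, `nodeId`, and `isVisible` are names from this README. The `WidgetPattern` interface, the `name` and `findParts` members, the `tag`, `role`, and `children` fields, and the `findDescendant` helper are assumptions; consult `src/config/knownPatterns.ts` for the real pattern shape and for how patterns are passed to the constructor.

```typescript
// Assumed node shape; only `nodeId` and `isVisible` come from the README.
interface SerializedNode {
  nodeId: string;
  tag: string;
  role?: string;
  isVisible: boolean;
  children: SerializedNode[];
}

// Hypothetical pattern shape: `triggerMatcher` is named in the README,
// `name` and `findParts` are illustrative.
interface WidgetPattern {
  name: string;
  triggerMatcher(node: SerializedNode): boolean;
  findParts(trigger: SerializedNode): Record<string, string | undefined>;
}

// Depth-first search for the first descendant satisfying `predicate`.
function findDescendant(
  node: SerializedNode,
  predicate: (n: SerializedNode) => boolean,
): SerializedNode | undefined {
  for (const child of node.children) {
    if (predicate(child)) return child;
    const nested = findDescendant(child, predicate);
    if (nested) return nested;
  }
  return undefined;
}

const comboboxPattern: WidgetPattern = {
  name: "custom-combobox",
  // A visible node with the combobox role triggers the pattern.
  triggerMatcher: (node) => node.role === "combobox" && node.isVisible,
  // Parts are reported as node IDs so the caller can map them to locators.
  findParts: (trigger) => ({
    input: findDescendant(trigger, (n) => n.tag === "input")?.nodeId,
    listbox: findDescendant(trigger, (n) => n.role === "listbox")?.nodeId,
  }),
};

const sampleWidget: SerializedNode = {
  nodeId: "w1", tag: "div", role: "combobox", isVisible: true,
  children: [
    { nodeId: "w2", tag: "input", isVisible: true, children: [] },
    {
      nodeId: "w3", tag: "div", isVisible: true,
      children: [{ nodeId: "w4", tag: "ul", role: "listbox", isVisible: false, children: [] }],
    },
  ],
};

const matched = comboboxPattern.triggerMatcher(sampleWidget);
const parts = comboboxPattern.findParts(sampleWidget);
```

Returning node IDs rather than selectors keeps the pattern consistent with the plan format, where elements are referenced by ID and resolved to locators only at execution time.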

## License

Usage of this library and its parts is strictly restricted. See the LICENSE file for details.

## Contributing

Currently, contributions are not open due to the restricted license.
