PDD-CLI Bug: Uses Text Matching Instead of Semantic Selectors

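PDD-CLI generates E2E tests that use `getByText()` for elements that have an accessible role (headings, buttons, links) and should be targeted with semantic selectors such as `getByRole()`. The resulting tests are more brittle and do not check the page's accessibility structure.

**Why this matters:** `getByText()` matches any element containing the text, so the test keeps passing if a heading is demoted to a plain `<div>`, and it can fail with a strict-mode violation when the same text appears in more than one element.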

## Concrete Example

For a dashboard page with headings:

```typescript
import { test, expect } from '@playwright/test';

// PDD generated test (WRONG):
test('displays dashboard sections', async ({ page }) => {
  await page.goto('/dashboard');

  // Uses text matching: passes for any element containing this text
  await expect(page.getByText('Analytics')).toBeVisible();
  await expect(page.getByText('Recent Activity')).toBeVisible();
});
```

Better approach using semantic selectors:

```typescript
import { test, expect } from '@playwright/test';

// Should generate (CORRECT):
test('displays dashboard sections', async ({ page }) => {
  await page.goto('/dashboard');

  // Uses semantic role-based selectors: the element must be a heading with this name
  await expect(page.getByRole('heading', { name: 'Analytics' })).toBeVisible();
  await expect(page.getByRole('heading', { name: 'Recent Activity' })).toBeVisible();
});
```

**What went wrong:** PDD used `getByText()`, which matches any element containing the text. Using `getByRole('heading', { name: ... })` would:
1. Verify that the element is actually a heading (better accessibility validation)
2. Be more specific (avoid matching "Analytics" in paragraph text)
3. Follow Playwright's guidance to prefer user-facing, role-based locators

**Impact:** The tests pass, but they are less robust and do not validate semantic HTML structure.
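
To make the ambiguity concrete, here is a simplified model of the two matching strategies in plain TypeScript. It is not Playwright's implementation: `PageElement`, `matchByText`, and `matchByRole` are hypothetical names, and the only behaviour borrowed from Playwright is that text matching defaults to a case-insensitive substring search.

```typescript
// Toy model of a rendered page: each element has a tag, an ARIA role (if any), and text.
interface PageElement {
  tag: string;
  role: string | null;
  text: string;
}

const dashboard: PageElement[] = [
  { tag: 'h2', role: 'heading', text: 'Analytics' },
  { tag: 'p', role: null, text: 'Analytics are refreshed hourly.' },
  { tag: 'h2', role: 'heading', text: 'Recent Activity' },
];

// Simplified stand-in for text matching: case-insensitive substring, any element.
function matchByText(elements: PageElement[], text: string): PageElement[] {
  const needle = text.toLowerCase();
  return elements.filter((el) => el.text.toLowerCase().includes(needle));
}

// Simplified stand-in for role matching: the role must match as well as the name.
function matchByRole(elements: PageElement[], role: string, name: string): PageElement[] {
  return matchByText(elements, name).filter((el) => el.role === role);
}

const byText = matchByText(dashboard, 'Analytics');
const byRole = matchByRole(dashboard, 'heading', 'Analytics');
console.log(byText.length, byRole.length); // prints "2 1"
```

In this model the text match returns both the heading and the paragraph, while the role match returns only the heading.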

## Why PDD Makes This Mistake

PDD-CLI currently:
- Defaults to text matching as the simplest approach
- Doesn't analyze the DOM structure to find semantic roles
- Doesn't prioritize accessibility testing

It should instead:
1. Use semantic selectors first (role, label, placeholder)
2. Reserve text matching for non-semantic content
3. Validate accessibility structure through the selectors it generates

## How to Prevent This in PDD-CLI

**What PDD should do differently:**

1. **Follow a semantic-first selector hierarchy:**
   - `getByRole()` - for headings, buttons, links, forms
   - `getByLabel()` - for form inputs
   - `getByPlaceholder()` - for inputs with placeholders
   - `getByTestId()` - for non-semantic elements
   - `getByText()` - last resort only

2. **Parse component DOM structure:** Identify heading levels, button roles, form structure.

3. **Generate accessibility-validating tests:** Tests should enforce semantic HTML.
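
The hierarchy in step 1 could be applied with a simple fall-through, sketched below. This is an illustration under assumptions, not PDD-CLI's code: `ElementInfo`, `quote`, and `chooseLocator` are hypothetical names, and the sketch assumes the generator already knows each element's role, label, placeholder, and test id.

```typescript
// Hypothetical description of an element as a test generator might see it.
interface ElementInfo {
  role?: string;        // ARIA role, explicit or implicit (e.g. 'heading', 'button')
  name?: string;        // accessible name, usually the visible text
  label?: string;       // text of an associated <label>
  placeholder?: string; // placeholder attribute of an input
  testId?: string;      // value of data-testid
  text?: string;        // raw text content
}

// Quote a value as a single-quoted string literal for the generated test source.
function quote(value: string): string {
  return `'${value.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;
}

// Walk the hierarchy from most to least semantic and return the first locator that applies.
function chooseLocator(el: ElementInfo): string {
  if (el.role && el.name) return `getByRole(${quote(el.role)}, { name: ${quote(el.name)} })`;
  if (el.label) return `getByLabel(${quote(el.label)})`;
  if (el.placeholder) return `getByPlaceholder(${quote(el.placeholder)})`;
  if (el.testId) return `getByTestId(${quote(el.testId)})`;
  if (el.text) return `getByText(${quote(el.text)})`;
  throw new Error('No usable locator for element');
}

console.log(chooseLocator({ role: 'heading', name: 'Analytics', text: 'Analytics' }));
console.log(chooseLocator({ label: 'Email', placeholder: 'you@example.com' }));
console.log(chooseLocator({ text: 'Last updated today' }));
```

Because `getByText()` is the final branch, it is only emitted when nothing more semantic is available.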

**Example improvement:**
```
Current: See "Analytics" text → generate getByText('Analytics')

Improved: See "Analytics" text → parse DOM structure:
        → <h2>Analytics</h2> found
        → Generate: getByRole('heading', { name: 'Analytics' })
        → Validates both text AND semantic structure
```
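
The improved flow above can be sketched as a tag-to-role lookup followed by locator generation. The names `IMPLICIT_ROLES`, `ParsedElement`, `resolveRole`, and `generateLocator` are hypothetical, the role map is deliberately incomplete, and how PDD-CLI actually obtains the parsed markup is not covered here.

```typescript
// Deliberately incomplete map from HTML tag names to implicit ARIA roles.
const IMPLICIT_ROLES = new Map<string, string>([
  ['h1', 'heading'], ['h2', 'heading'], ['h3', 'heading'],
  ['h4', 'heading'], ['h5', 'heading'], ['h6', 'heading'],
  ['button', 'button'],
  ['nav', 'navigation'],
  ['main', 'main'],
]);

// Hypothetical shape for an element found while parsing the component's markup.
interface ParsedElement {
  tag: string;
  text: string;
  attrs?: Record<string, string>;
}

// Resolve the role: an explicit role attribute wins, then the tag's implicit role.
function resolveRole(el: ParsedElement): string | undefined {
  const explicit = el.attrs?.role;
  if (explicit) return explicit;
  const tag = el.tag.toLowerCase();
  // An <a> only has the link role when it carries an href.
  if (tag === 'a') return el.attrs?.href !== undefined ? 'link' : undefined;
  return IMPLICIT_ROLES.get(tag);
}

// Emit a role-based locator when a role is known, otherwise fall back to text.
function generateLocator(el: ParsedElement): string {
  const name = JSON.stringify(el.text.trim());
  const role = resolveRole(el);
  return role
    ? `getByRole(${JSON.stringify(role)}, { name: ${name} })`
    : `getByText(${name})`;
}

console.log(generateLocator({ tag: 'h2', text: 'Analytics' }));
console.log(generateLocator({ tag: 'span', text: 'Analytics' }));
```

`JSON.stringify` is used only as a convenient way to escape the strings, so the generated locators use double quotes. A fuller version might also pass Playwright's `level` option for headings to check the heading level as well.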

## Severity

**P3 - Low Priority**

- **Frequency:** High - appears across generated E2E tests
- **Impact:** Tests are less robust and accessibility is not validated
- **Detectability:** Low - the tests pass, so nothing flags the weaker selectors
- **Prevention cost:** Low - the selector hierarchy is well-defined

## Category

`test-generation`

## Related Issues

- #416 - Exact string matches (related text matching issue)
- #419 - CSS class selectors (another non-semantic selector problem)
- #421 - Component library structure assumptions (selector strategy issue)

---

**For Contributors:** Discovered throughout `frontend/e2e/crm.spec.ts`; that file was improved in commit `34a651d5` to use semantic selectors.


PDD-CLI Bug: Uses Text Matching Instead of Semantic Selectors #573