PDD-CLI Bug: Generates Test Data Without Proper CSV Escaping

PDD-CLI writes CSV test data without quoting fields that contain commas. When a value is itself a comma-separated list (such as tags or labels), a CSV parser splits it into extra fields.

**Why this matters:** Rows parse with the wrong number of fields, so tests that load the data fail or run against malformed records.

## Concrete Example

For a test that creates GitHub issues with labels:

`test_data.csv` as generated by PDD (wrong):

```csv
email,name,labels
user1@example.com,John Doe,attendee,vip
user2@example.com,Jane Smith,speaker,sponsor
```

CSV parser reads this as:

```python
# Row 1 has 4 fields instead of 3
['user1@example.com', 'John Doe', 'attendee', 'vip']  # ← 'vip' is an extra field
```

Correct format:

`test_data.csv` as it should be generated (correct):

```csv
email,name,labels
user1@example.com,John Doe,"attendee,vip"
user2@example.com,Jane Smith,"speaker,sponsor"
```

**What went wrong:** PDD generated labels as `attendee,vip` without quotes. The CSV parser treats the comma as a field delimiter, splitting the row into 4 fields instead of 3.

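**Impact:** `csv.DictReader` does not raise an error here. It silently collects the overflow values in a list under its `restkey` (`None` by default), so the record's `labels` is just `attendee` and `vip` ends up under the `None` key.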
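
The parsing behaviour can be checked with a short sketch using Python's standard `csv` module. The variable names (`broken`, `rows`, `record`) are illustrative and the data is held in memory rather than read from `test_data.csv`.

```python
import csv
import io

# The malformed data from the example, held in memory for illustration.
broken = (
    "email,name,labels\n"
    "user1@example.com,John Doe,attendee,vip\n"
)

# csv.reader splits on every unquoted comma, so the data row has 4 fields
# while the header has 3.
rows = list(csv.reader(io.StringIO(broken)))
header, first = rows[0], rows[1]

# csv.DictReader does not raise on the extra field. It stores overflow
# values in a list under `restkey`, which defaults to None.
record = next(csv.DictReader(io.StringIO(broken)))
```

Here `record["labels"]` holds only the first label and the second sits under `record[None]`, which is why a test reading `labels` sees truncated data rather than an obvious parse error.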

## Why PDD Makes This Mistake

PDD-CLI currently:
- Generates CSV as plain text
- Doesn't quote fields containing special characters
- Doesn't use proper CSV writing libraries

But it should:
1. Use `csv.DictWriter` or equivalent to handle escaping
2. Always quote fields containing commas, double quotes, or line breaks, and double any embedded double quotes
3. Follow RFC 4180 CSV spec
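
As a minimal sketch of what these rules look like in practice, Python's `csv.writer` with its default dialect quotes only the fields that need it and doubles embedded quotes. The name `John "JD" Doe` is hypothetical data added here to show quote escaping; it is not from the original test file.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)  # default dialect: minimal quoting, "\r\n" line endings

writer.writerow(["email", "name", "labels"])
# Hypothetical row: the name contains double quotes, the labels contain a comma.
writer.writerow(["user1@example.com", 'John "JD" Doe', "attendee,vip"])

csv_text = buffer.getvalue()

# Reading the text back recovers the original three values unchanged.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```

Only the two fields with special characters are wrapped in quotes; the plain email field is written as-is.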

## How to Prevent This in PDD-CLI

**What PDD should do differently:**

1. **Use CSV libraries for generation:**
   ```python
   import csv
   
   with open('test_data.csv', 'w', newline='') as f:
       writer = csv.DictWriter(f, fieldnames=['email', 'name', 'labels'])
       writer.writeheader()
       writer.writerow({
           'email': 'user1@example.com',
           'name': 'John Doe',
           'labels': 'attendee,vip'  # Library handles quoting
       })
   ```

2. **Manual generation - always quote fields with commas:**
   ```
   email,name,labels
   user1@example.com,John Doe,"attendee,vip"
   ```

3. **Validate generated CSV:** Parse it back to ensure it works.
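
One way the validation step could look is a round-trip check that parses the generated text and compares each row's field count against the header. `validate_csv` is a hypothetical helper written for this issue, not an existing PDD-CLI function.

```python
import csv
import io


def validate_csv(text, expected_fields):
    """Return a list of problems; an empty list means every row matches the header."""
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""))

    header = next(reader, None)
    if header != list(expected_fields):
        problems.append(f"header mismatch: {header!r}")
        return problems

    for row in reader:
        if len(row) != len(expected_fields):
            # reader.line_num is the source line where this record ended.
            problems.append(
                f"line {reader.line_num}: expected {len(expected_fields)} "
                f"fields, got {len(row)}"
            )
    return problems


fields = ["email", "name", "labels"]
good = 'email,name,labels\nuser1@example.com,John Doe,"attendee,vip"\n'
bad = "email,name,labels\nuser1@example.com,John Doe,attendee,vip\n"

good_problems = validate_csv(good, fields)
bad_problems = validate_csv(bad, fields)
```

Running a check like this right after generation would flag the unquoted `attendee,vip` row before any test consumes the file.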

**Example improvement:**
```
Current: Generate CSV as string concatenation
       → labels = "attendee,vip" (no quotes)
       → CSV broken (4 fields instead of 3)

Improved: Generate CSV using csv.DictWriter
        → Automatic quoting for fields with commas
        → Valid CSV produced
```

## Severity

**P2 - Medium Priority**

- **Frequency:** Low - only affects CSV test data generation
- **Impact:** Test data parsing failures
- **Detectability:** High - immediate CSV parsing errors
- **Prevention cost:** Low - use CSV libraries

## Category

`test-environment`

## Related Issues

- #422 - Module-level imports (different test environment issue)
- #423 - Async data loading waits (different test issue)

---

**For Contributors:** Discovered in `backend/tests/test_crm_github.py` where GitHub issue labels CSV was malformed, fixed in commit `34a651d5`.


PDD-CLI Bug: Generates Test Data Without Proper CSV Escaping #577