CRITICAL: Data leakage in scaler fitting #13

## Problem

The scaler is fit on the training and validation data combined, so validation statistics leak into preprocessing.

**Affected files:**
- `anomaly-detection/train_og.py:29`
- `anomaly-detection/test.py:40`

**Current code:**
```python
scaler.fit(x_train.append(x_opt))
```

**Issue:** The scaler learns mean/std statistics from the validation data (`x_opt`), which it should not have access to during training. Validation metrics computed on data scaled this way can be optimistically biased.
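
To make the leak concrete, here is a minimal sketch using only Python's standard library. The numbers are made up for illustration and are not taken from the project's data; `train_values` and `opt_values` are hypothetical stand-ins for a single feature column of `x_train` and `x_opt`.

```python
import statistics

# Hypothetical single-feature samples (illustrative values only)
train_values = [1.0, 2.0, 3.0, 4.0, 5.0]
opt_values = [10.0, 12.0]  # validation samples drawn from a shifted range

# Statistics a scaler would learn from training data alone
train_mean = statistics.mean(train_values)
train_std = statistics.pstdev(train_values)

# Statistics learned when validation data is mixed in (the leaky fit)
combined = train_values + opt_values
leaked_mean = statistics.mean(combined)
leaked_std = statistics.pstdev(combined)

print(f"train-only: mean={train_mean:.3f}, std={train_std:.3f}")
print(f"leaky fit:  mean={leaked_mean:.3f}, std={leaked_std:.3f}")
```

The more the validation distribution differs from the training one, the further the two sets of statistics drift apart, which means the leaky scaler already encodes information about `x_opt`.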

**Correct approach:**
```python
scaler.fit(x_train)  # Only fit on training data
```
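
A fuller sketch of the corrected flow follows. It assumes the scaler is scikit-learn's `StandardScaler` (the issue only mentions mean/std statistics, so this is a guess) and uses synthetic NumPy arrays as hypothetical stand-ins for `x_train` and `x_opt`; the real code may use pandas objects instead.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the repository's x_train and x_opt
x_train = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
x_opt = rng.normal(loc=0.5, scale=2.0, size=(50, 3))

scaler = StandardScaler()
scaler.fit(x_train)  # statistics come from training data only

# Apply the same training-derived statistics to both splits
x_train_scaled = scaler.transform(x_train)
x_opt_scaled = scaler.transform(x_opt)  # transform only, never refit here
```

The same fitted `scaler` object would also need to be reused at test time, which is presumably why `test.py` appears in the affected files.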

## Impact

This is likely the cause of the suspected overtraining. The reported 99.98% accuracy may be artificially inflated.

## Priority

CRITICAL - This affects the validity of published results.

## References

- Archive branch: Lines identified in code review
- See: RETROSPECTIVE.md for context
