Critical: Obfuscation Not Applied in run_experiment() - Issue #79 Results May Be Tainted #81

@iAmGiG

Critical Bug: Obfuscation Not Working in Pattern Validation

Problem Summary

The Issue #79 pattern taxonomy validation claimed to use obfuscation testing, but the run_experiment() method does not obfuscate dates before sending them to the LLM. This means the LLM received real dates like "2024-01-02" instead of anonymized dates like "Day T+0".
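
For reference, the kind of mapping described above can be sketched as follows. This is a minimal illustration and not the project's DataObfuscator: the helper name relative_day_labels and the choice to number days by chronological position are assumptions made for the example.

```python
from datetime import date


def relative_day_labels(iso_dates: list[str]) -> dict[str, str]:
    """Map ISO dates to anonymized labels such as 'Day T+0' (hypothetical helper)."""
    # Sort chronologically so labels keep the ordering but hide the calendar date.
    ordered = sorted(set(iso_dates), key=date.fromisoformat)
    return {d: f"Day T+{i}" for i, d in enumerate(ordered)}


mapping = relative_day_labels(["2024-01-03", "2024-01-02"])

# Apply the mapping to a description before it reaches the LLM.
prompt = "Analyze SPY 0DTE option hedging flows on 2024-01-02."
for real, label in mapping.items():
    prompt = prompt.replace(real, label)

print(prompt)  # Analyze SPY 0DTE option hedging flows on Day T+0.
```

The point of the sketch is that both the date argument and any date embedded in free text need to pass through the same mapping.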

Evidence

1. Validation Script Calls Non-Obfuscated Method

File: scripts/validation/validate_pattern_taxonomy.py:157

result = self.agent.run_experiment(
    experiment_description=experiment_desc,  # Contains real date!
    date=date_str  # Real date: "2024-01-02"
)

2. Experiment Description Contains Real Dates

File: scripts/validation/validate_pattern_taxonomy.py:276-281

'0dte_hedging': (
    f"Analyze {self.symbol} 0DTE option hedging flows on {date_str}."
    # ^^^ This is "2024-01-02", not obfuscated!
)

3. LLM Planning Prompt Exposes Date

File: src/agents/market_mechanics_agent.py:494

# The LLM sees the real date through {date} below (prompt excerpt, truncated).
planning_prompt = f"""
EXPERIMENT REQUEST: {experiment_description}
DATE: {date}

4. Analysis Prompt Also Contains Real Date

File: src/agents/market_mechanics_agent.py:652

# experiment_description contains "Analyze SPY on 2024-01-02" with the real date
# (prompt excerpt, truncated).
analysis_prompt = f"""
You are analyzing market data for this experiment: {experiment_description}

5. Output Files Confirm No Obfuscation

File: reports/validation/pattern_taxonomy/0dte_hedging_SPY_2024Q1.yaml

- date: '2024-01-02'
  date_obfuscated: '2024-01-02'  # Should be "Day T+0" if obfuscated!
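
A check like the one below could flag affected report entries automatically. It is a sketch that assumes the YAML entries have already been parsed into dicts with the date and date_obfuscated keys shown above; the function name unobfuscated_records is hypothetical.

```python
def unobfuscated_records(records: list[dict]) -> list[dict]:
    """Return records whose 'date_obfuscated' field still equals the real date."""
    return [
        r for r in records
        if "date" in r and r.get("date_obfuscated") == r["date"]
    ]


# Records shaped like the YAML entries above (illustrative values only).
records = [
    {"date": "2024-01-02", "date_obfuscated": "2024-01-02"},  # leaked
    {"date": "2024-01-03", "date_obfuscated": "Day T+1"},     # obfuscated
]

leaked = unobfuscated_records(records)
print(f"{len(leaked)} of {len(records)} records are not obfuscated")
```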

Where Obfuscation DOES Work

The run_batch_experiments() method (lines 295-387) does obfuscate properly:

obfuscator = DataObfuscator() if use_obfuscation else None
date_mapping = obfuscator.obfuscate_dates(dates)

But the validation script used run_experiment() (singular), not run_batch_experiments() (plural).

Impact Assessment

Severity: CRITICAL

Mitigation Factors

  1. Training Cutoff: o3-mini was likely trained on data through October 2023, so it could not have known specific 2024 events
  2. Pattern Consistency: The 53/53 detection rate suggests real mechanics rather than date memorization
  3. Out-of-Sample Period: Q1 2024 falls after that likely training cutoff

Cannot Claim

  • ❌ "Obfuscation testing proves patterns work without temporal context"
  • ❌ "LLM detected patterns blind to dates/events"

Can Still Claim (With Caveats)

  • ✅ "LLM detected patterns on out-of-sample data (post-training cutoff)"
  • ✅ "Consistent detection across 53 dates suggests robust pattern recognition"
  • ✅ "Pattern mechanics align with academic literature (Buis et al. 2024, Jeannin et al. 2008)"

Fix Required

Option 1: Add Obfuscation to run_experiment()

from typing import Dict

def run_experiment(self, experiment_description: str, date: str, obfuscate: bool = True) -> Dict:
    # Obfuscate the date before any LLM call so prompts never contain the real date.
    if obfuscate:
        obfuscator = DataObfuscator()
        # Assumes obfuscate_dates() returns a {real_date: obfuscated_label} mapping.
        date_mapping = obfuscator.obfuscate_dates([date])
        obfuscated_date = date_mapping[date]
        # Replace the real date wherever it appears in experiment_description.
        experiment_description = experiment_description.replace(date, obfuscated_date)
        date_for_llm = obfuscated_date
    else:
        date_for_llm = date

    # Use date_for_llm (never the raw date) in all LLM prompts.
    tool_plan = self._plan_experiment_tools(experiment_description, date_for_llm)
    ...
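
Whichever option is chosen, a guard that rejects prompts containing a calendar date would help prevent a silent regression. The sketch below is an assumption-laden illustration: assert_no_real_dates and build_planning_prompt are hypothetical names, the prompt layout only mimics the excerpt above, and the regex catches ISO-formatted dates (YYYY-MM-DD) only.

```python
import re

# Matches ISO-style dates such as 2024-01-02; other date formats are not covered.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def assert_no_real_dates(prompt: str) -> None:
    """Raise ValueError if a prompt still contains an ISO-formatted date."""
    found = ISO_DATE.findall(prompt)
    if found:
        raise ValueError(f"Unobfuscated date(s) in prompt: {found}")


def build_planning_prompt(experiment_description: str, date_for_llm: str) -> str:
    # Hypothetical stand-in for the real prompt builder.
    prompt = f"EXPERIMENT REQUEST: {experiment_description}\nDATE: {date_for_llm}"
    assert_no_real_dates(prompt)
    return prompt


# An obfuscated prompt passes the guard.
safe = build_planning_prompt(
    "Analyze SPY 0DTE option hedging flows on Day T+0.", "Day T+0"
)

# A prompt carrying the real date is rejected.
try:
    build_planning_prompt("Analyze SPY on 2024-01-02.", "2024-01-02")
    leak_detected = False
except ValueError:
    leak_detected = True
```

Placing the guard at the point where prompts are assembled, rather than in the validation script, would cover both the planning and analysis prompts.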

Option 2: Use run_batch_experiments() for Validation

Modify the validator to use the already-working batch method:

# Instead of loop calling run_experiment()
batch_result = self.agent.run_batch_experiments(
    dates=dates,
    experiment_template=pattern_experiment_template,
    use_obfuscation=True  # This works correctly!
)

Action Items

  • Immediate: Fix run_experiment() to support obfuscation
  • Validation: Re-run the Issue #79 (Pattern Taxonomy: Focus on Core Mechanical Patterns) tests with proper obfuscation
  • Comparison: Compare new results vs. old results (expect similar if patterns are real)
  • Documentation: Update CLAUDE.md to reflect corrected methodology
  • Reports: Regenerate all reports/validation/pattern_taxonomy/*.yaml files
  • Integrity Check: If results change significantly, investigate why
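
The comparison and integrity-check items could be supported by a small per-date diff of detection outcomes. This is a sketch only: it assumes each run can be reduced to a mapping from date to a detected/not-detected flag, the function name compare_detections is hypothetical, and the inputs are illustrative values, not project results.

```python
def compare_detections(old: dict[str, bool], new: dict[str, bool]) -> dict:
    """Compare per-date detection outcomes from two validation runs."""
    # Only dates present in both runs can be compared.
    shared = sorted(old.keys() & new.keys())
    changed = [d for d in shared if old[d] != new[d]]
    agreement = 1 - len(changed) / len(shared) if shared else 0.0
    return {"compared": len(shared), "changed": changed, "agreement": agreement}


# Illustrative inputs only; these are not results from the project.
old_run = {"2024-01-02": True, "2024-01-03": True, "2024-01-04": True, "2024-01-05": True}
new_run = {"2024-01-02": True, "2024-01-03": False, "2024-01-04": True, "2024-01-05": True}

summary = compare_detections(old_run, new_run)
print(summary)
```

Dates listed under "changed" are the ones to investigate first if the re-run diverges from the original results.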

Files to Modify

  1. src/agents/market_mechanics_agent.py - Add obfuscation to run_experiment()
  2. scripts/validation/validate_pattern_taxonomy.py - Use obfuscated dates
  3. docs/guides/data-obfuscation.md - Document correct usage
  4. CLAUDE.md - Update Issue #79 (Pattern Taxonomy: Focus on Core Mechanical Patterns) status

Timeline

Priority: HIGH - Needed before advisor presentation
Estimated Fix: 2-4 hours
Re-validation: 1-2 hours (53 dates with cache)

Related Issues


Labels: bug, high-priority
