# Integrate Co-reference Detection for Pronoun Substitution in PII Masking

## Background

The Yaak proxy currently detects and masks PII entities (names, emails, etc.) to protect user privacy when sending requests to external LLMs. However, it doesn't handle **pronouns** that refer to these masked entities, leading to gender/reference mismatches in responses.

### The Problem

When a male name like "Tom Miller" is replaced with a female name like "Sarah Smith", associated pronouns ("he", "him", "his") remain unchanged, causing grammatical inconsistencies and confusion:

**Current Behavior (Broken):**

```
User Input: "Hi, his name is Tom Miller. Write a short biography about him."
↓ PII Detection: FIRSTNAME: Tom, SURNAME: Miller
Masked Request: "Hi, his name is Sarah Smith. Write a short biography about him."
                     ^^^                                                  ^^^
LLM Response: "Sarah Smith is a software engineer. She is a co-founder..."
                                                    ^^^
Unmasked Response: "Tom Miller is a software engineer. She is a co-founder..."
                                                       ^^^
                                        ❌ Gender mismatch!
```

**Ideal Behavior (Fixed):**

```
User Input: "Hi, his name is Tom Miller. Write a short biography about him."
↓ PII Detection + Co-reference: Tom Miller (cluster 1) ← "his", "him" also in cluster 1
Masked Request: "Hi, her name is Sarah Smith. Write a short biography about her."
                     ^^^                                                   ^^^
LLM Response: "Sarah Smith is a software engineer. She is a co-founder..."
                                                    ^^^
Unmasked Response: "Tom Miller is a software engineer. He is a co-founder..."
                                                       ^^
                                        ✅ Consistent pronouns!
```

### What is Co-reference Detection?

Co-reference detection identifies and links mentions of the same entity across a text:
- "Tom Miller", "he", "him", "his" all refer to the same person
- "Sarah Smith", "she", "her" all refer to the same person

The Yaak model **already performs co-reference detection** and outputs cluster IDs for each token. We just need to use this information to handle pronoun substitution.

---

## Current Model Capabilities

The multi-task PII detection model outputs co-reference predictions:

```python
# From eval_model_detailed.py:298-300
coref_logits = outputs["coref_logits"][0]  # [seq_len, num_coref_labels]
coref_predictions = torch.argmax(coref_logits, dim=-1)  # [seq_len]
coref_pred_ids = [p.item() for p in coref_predictions]  # Cluster IDs per token
```

**Example output:**
```
Text:    "Tom Miller went to his car. He drove home."
Tokens:  Tom Miller went to his car . He drove home .
Cluster: 1   1     0   0  1   0   0 1  0     0    0
         └───┘              └─┘       └─┘
         Same entity (cluster 1)
```

The model correctly identifies that "Tom Miller", "his", and "He" all refer to the same entity (cluster 1).
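
As a rough illustration of how per-token cluster IDs could be turned into mentions, the sketch below merges runs of adjacent tokens that share a non-zero cluster ID, so "Tom Miller" becomes one span instead of two. The `Mention` type and `groupMentions` function are hypothetical names, not part of the existing code. The sketch also assumes that two distinct mentions of the same cluster never sit directly next to each other.

```go
package main

// Mention is a hypothetical span of consecutive tokens sharing one cluster ID.
type Mention struct {
	ClusterID  int
	StartToken int // index of the first token (inclusive)
	EndToken   int // index one past the last token (exclusive)
}

// groupMentions merges runs of adjacent tokens with the same non-zero
// cluster ID into a single Mention. Cluster 0 means "no cluster".
func groupMentions(clusterIDs []int) []Mention {
	var mentions []Mention
	start := 0
	for i := 1; i <= len(clusterIDs); i++ {
		// Close the current run at the end of input or when the ID changes.
		if i == len(clusterIDs) || clusterIDs[i] != clusterIDs[start] {
			if clusterIDs[start] != 0 {
				mentions = append(mentions, Mention{clusterIDs[start], start, i})
			}
			start = i
		}
	}
	return mentions
}
```

For the cluster row in the example above (`1 1 0 0 1 0 0 1 0 0 0`), this yields three mentions of cluster 1: tokens 0-1 ("Tom Miller"), token 4 ("his"), and token 7 ("He").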

---

## Implementation Plan

### Phase 1: Add Pronoun Mapping Module

**File**: `src/backend/pii/pronoun_mapper.go` (new)

Create a pronoun mapping service that handles gender-aware pronoun substitution:

```go
package pii

import (
    "strings"
)

// PronounGender represents grammatical gender for pronouns
type PronounGender int

const (
    GenderUnknown PronounGender = iota
    GenderMale
    GenderFemale
    GenderNeutral
)

// PronounMapper handles pronoun substitution based on gender
type PronounMapper struct {
    pronounMap map[string]map[PronounGender]string
}

// NewPronounMapper creates a new pronoun mapper
func NewPronounMapper() *PronounMapper {
    return &PronounMapper{
        pronounMap: initPronounMap(),
    }
}

// initPronounMap initializes the pronoun mapping table
func initPronounMap() map[string]map[PronounGender]string {
    return map[string]map[PronounGender]string{
        // Subject pronouns
        "he": {
            GenderMale:    "he",
            GenderFemale:  "she",
            GenderNeutral: "they",
        },
        "she": {
            GenderMale:    "he",
            GenderFemale:  "she",
            GenderNeutral: "they",
        },
        
        // Object pronouns
        "him": {
            GenderMale:    "him",
            GenderFemale:  "her",
            GenderNeutral: "them",
        },
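        // NOTE: "her" is also the possessive form ("her car" → "his car").
        // This entry covers only the object reading; a table keyed by the
        // word alone cannot tell the two apart (see "Pronoun Ambiguity").
        "her": {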
            GenderMale:    "him",
            GenderFemale:  "her",
            GenderNeutral: "them",
        },
        
        // Possessive pronouns
        "his": {
            GenderMale:    "his",
            GenderFemale:  "her",
            GenderNeutral: "their",
        },
        
        // Reflexive pronouns
        "himself": {
            GenderMale:    "himself",
            GenderFemale:  "herself",
            GenderNeutral: "themselves",
        },
        "herself": {
            GenderMale:    "himself",
            GenderFemale:  "herself",
            GenderNeutral: "themselves",
        },
    }
}

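// MapPronoun returns the toGender form of pronoun, preserving a leading
// capital. fromGender is currently unused because the table is keyed by the
// pronoun itself. Unknown pronouns and GenderUnknown targets are returned
// unchanged.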
func (pm *PronounMapper) MapPronoun(pronoun string, fromGender, toGender PronounGender) string {
    lowerPronoun := strings.ToLower(pronoun)
    
    // Check if we have a mapping for this pronoun
    if genderMap, exists := pm.pronounMap[lowerPronoun]; exists {
        if mapped, ok := genderMap[toGender]; ok {
            // Preserve original capitalization
            if isCapitalized(pronoun) {
                return capitalize(mapped)
            }
            return mapped
        }
    }
    
    // If no mapping found, return original
    return pronoun
}

// DetectGenderFromName attempts to detect gender from a first name.
// The name lists are illustrative placeholders, not a complete lexicon.
func (pm *PronounMapper) DetectGenderFromName(name string) PronounGender {
    // Common male names
    maleNames := []string{"tom", "john", "james", "michael", "david", "robert"}
    // Common female names
    femaleNames := []string{"sarah", "emma", "lisa", "jennifer", "mary", "patricia"}

    // Compare whole names. Substring matching would misclassify names that
    // merely contain a listed one, e.g. "Emmanuel" contains "emma".
    lowerName := strings.ToLower(strings.TrimSpace(name))

    for _, male := range maleNames {
        if lowerName == male {
            return GenderMale
        }
    }

    for _, female := range femaleNames {
        if lowerName == female {
            return GenderFemale
        }
    }

    return GenderUnknown
}

// Helper functions
func isCapitalized(s string) bool {
    if len(s) == 0 {
        return false
    }
    return s[0] >= 'A' && s[0] <= 'Z'
}

func capitalize(s string) string {
    if len(s) == 0 {
        return s
    }
    return strings.ToUpper(string(s[0])) + s[1:]
}
```

### Phase 2: Extend Detector Output with Co-reference Information

**File**: `src/backend/pii/detectors/types.go`

Add co-reference cluster information to detector output:

```go
// Entity represents a detected PII entity
type Entity struct {
    Text      string
    Label     string
    StartPos  int
    EndPos    int
    ClusterID int    // NEW: Co-reference cluster ID (0 = no cluster)
}

// DetectorOutput represents the result of PII detection
type DetectorOutput struct {
    Entities          []Entity
    CorefClusters     map[int][]EntityMention  // NEW: Cluster ID → mentions
    InferenceTimeMs   float64
}

// EntityMention represents a single mention in a co-reference cluster
type EntityMention struct {
    Text     string
    StartPos int
    EndPos   int
    IsEntity bool    // true if this is a PII entity, false if pronoun
}
```

### Phase 3: Update Model Detector to Extract Co-references

**File**: `src/backend/pii/detectors/model_detector.go`

Modify the model detector to extract co-reference information from model output:

```go
// In the Detect method, after getting PII predictions:

// Extract co-reference clusters
corefClusters := make(map[int][]EntityMention)

// One mention is recorded per token, so a multi-token name such as
// "Tom Miller" yields two mentions in the same cluster.
for i, token := range tokens {
    clusterID := corefPredictions[i]

    if clusterID > 0 {  // Skip cluster 0 (no cluster)
        mention := EntityMention{
            Text:     token,
            StartPos: tokenOffsets[i].Start,
            EndPos:   tokenOffsets[i].End,
            IsEntity: isPIIEntity(token, piiPredictions[i]),
        }

        corefClusters[clusterID] = append(corefClusters[clusterID], mention)
    }
}

// Set cluster IDs on entities
for i := range entities {
    entities[i].ClusterID = findClusterForEntity(entities[i], corefClusters)
}

return DetectorOutput{
    Entities:      entities,
    CorefClusters: corefClusters,
}, nil
```
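
The snippet above calls `findClusterForEntity` without defining it. One possible shape, offered only as a sketch, is to pick the cluster whose mentions overlap the entity's character span the most. The types below are trimmed copies of the Phase 2 structs. The `overlap` helper and the half-open `[StartPos, EndPos)` convention are assumptions, not taken from the existing code.

```go
package main

// Entity and EntityMention are trimmed copies of the Phase 2 types,
// keeping only the fields this sketch needs.
type Entity struct {
	Text     string
	StartPos int
	EndPos   int
}

type EntityMention struct {
	Text     string
	StartPos int
	EndPos   int
}

// overlap returns how many positions the half-open spans
// [aStart, aEnd) and [bStart, bEnd) have in common.
func overlap(aStart, aEnd, bStart, bEnd int) int {
	start, end := aStart, aEnd
	if bStart > start {
		start = bStart
	}
	if bEnd < end {
		end = bEnd
	}
	if end <= start {
		return 0
	}
	return end - start
}

// findClusterForEntity returns the ID of the cluster whose mentions overlap
// the entity span the most, or 0 when no cluster overlaps it.
func findClusterForEntity(e Entity, clusters map[int][]EntityMention) int {
	bestID, bestOverlap := 0, 0
	for id, mentions := range clusters {
		total := 0
		for _, m := range mentions {
			total += overlap(e.StartPos, e.EndPos, m.StartPos, m.EndPos)
		}
		// Prefer the larger overlap; break ties by the smaller ID so the
		// result does not depend on map iteration order.
		if total > bestOverlap || (total > 0 && total == bestOverlap && id < bestID) {
			bestID, bestOverlap = id, total
		}
	}
	return bestID
}
```

Returning 0 for "no overlap" matches the `ClusterID` convention in Phase 2, where 0 means the entity belongs to no cluster.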

### Phase 4: Update Masking Service with Pronoun Substitution

**File**: `src/backend/pii/masking_service.go`

Extend the masking service to handle pronoun substitution:

```go
type MaskingService struct {
    detector       detectors.Detector
    generator      *GeneratorService
    pronounMapper  *PronounMapper  // NEW
}

func NewMaskingService(detector detectors.Detector, generator *GeneratorService) *MaskingService {
    return &MaskingService{
        detector:      detector,
        generator:     generator,
        pronounMapper: NewPronounMapper(),  // NEW
    }
}

func (s *MaskingService) MaskText(text string, logPrefix string) MaskedResult {
    piiFound, err := s.detector.Detect(context.Background(), detectors.DetectorInput{Text: text})
    // ... existing PII detection code ...
    
    // NEW: Handle pronoun substitution
    genderMappings := make(map[int]struct{
        OriginalGender PronounGender
        MaskedGender   PronounGender
    })
    
    // Determine gender change for each cluster
    for clusterID, mentions := range piiFound.CorefClusters {
        originalGender := s.detectClusterGender(mentions, entities)
        maskedGender := s.detectMaskedGender(mentions, entities, maskedToOriginal)
        
        genderMappings[clusterID] = struct{
            OriginalGender PronounGender
            MaskedGender   PronounGender
        }{
            OriginalGender: originalGender,
            MaskedGender:   maskedGender,
        }
    }
    
    // Replace pronouns in clusters that have gender changes
    for clusterID, genderMap := range genderMappings {
        if genderMap.OriginalGender != genderMap.MaskedGender {
            maskedText = s.replaceClusterPronouns(
                maskedText,
                piiFound.CorefClusters[clusterID],
                genderMap.OriginalGender,
                genderMap.MaskedGender,
            )
        }
    }
    
    return MaskedResult{
        MaskedText:       maskedText,
        MaskedToOriginal: maskedToOriginal,
        Entities:         entities,
        GenderMappings:   genderMappings,  // Store for restoration
    }
}

func (s *MaskingService) RestorePII(text string, result MaskedResult) string {
    // Restore PII entities
    restoredText := text
    for maskedText, originalText := range result.MaskedToOriginal {
        restoredText = strings.ReplaceAll(restoredText, maskedText, originalText)
    }
    
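    // NEW: Reverse pronoun substitutions. The text here is the LLM response,
    // so the request's mention offsets do not apply; reverseClusterPronouns
    // has to locate the pronouns in restoredText itself.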
    for clusterID, genderMap := range result.GenderMappings {
        if genderMap.OriginalGender != genderMap.MaskedGender {
            // Reverse: masked → original gender
            restoredText = s.reverseClusterPronouns(
                restoredText,
                clusterID,
                genderMap.MaskedGender,
                genderMap.OriginalGender,
            )
        }
    }
    
    return restoredText
}
```
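
One detail the code above leaves open is how `replaceClusterPronouns` finds the pronouns: the mention offsets in `CorefClusters` refer to the original text, and they stop matching once a name has been replaced by one of a different length. A possible way around this, sketched below under that assumption, is to collect name and pronoun edits against the original text and apply them in one pass from the highest offset down. `Replacement` and `applyReplacements` are hypothetical names.

```go
package main

import "sort"

// Replacement is a hypothetical edit: replace text[Start:End] with New.
// Offsets are byte offsets into the original, unmodified text.
type Replacement struct {
	Start, End int
	New        string
}

// applyReplacements applies non-overlapping edits to text. Edits are applied
// from the highest offset down, so the offsets of earlier edits stay valid
// even when a replacement changes the length of the text.
func applyReplacements(text string, edits []Replacement) string {
	// Sort a copy so the caller's slice keeps its order.
	sorted := append([]Replacement(nil), edits...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start > sorted[j].Start })

	for _, e := range sorted {
		if e.Start < 0 || e.End > len(text) || e.Start > e.End {
			continue // skip edits that fall outside the text
		}
		text = text[:e.Start] + e.New + text[e.End:]
	}
	return text
}
```

With the edits `{0, 10, "Sarah Smith"}`, `{19, 22, "her"}` and `{28, 30, "She"}`, the input "Tom Miller went to his car. He drove home." becomes "Sarah Smith went to her car. She drove home.", which is the masked form expected in the Phase 5 tests.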

### Phase 5: Add Configuration and Testing

**File**: `src/backend/config/config.go`

Add configuration option to enable/disable pronoun substitution:

```go
type Config struct {
    // ... existing fields ...
    EnablePronounSubstitution bool `json:"enable_pronoun_substitution"`
}
```
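
The `json` tag suggests the configuration is read from JSON, although the loading code is not shown here. Under that assumption, the sketch below illustrates the default behavior: a missing key leaves the field at its zero value, so the feature stays off unless it is enabled explicitly. `loadConfig` is a hypothetical helper, and the struct is reduced to the new field.

```go
package main

import "encoding/json"

// Config is reduced to the new field; the existing fields are omitted.
type Config struct {
	EnablePronounSubstitution bool `json:"enable_pronoun_substitution"`
}

// loadConfig parses a JSON document into Config. Keys that are absent
// leave their fields at the zero value (false for a bool).
func loadConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}
```

Defaulting to off keeps the existing masking behavior unchanged for deployments that do not set the option.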

**File**: `src/backend/pii/detectors/model_detector_test.go`

Add comprehensive tests:

```go
func TestCorefPronounSubstitution(t *testing.T) {
    tests := []struct {
        name             string
        input            string
        expectedMasked   string
        expectedRestored string
    }{
        {
            name:             "male to female name change",
            input:            "Tom Miller went to his car. He drove home.",
            expectedMasked:   "Sarah Smith went to her car. She drove home.",
            expectedRestored: "Tom Miller went to his car. He drove home.",
        },
        {
            name:             "female to male name change",
            input:            "Sarah went to her office. She worked late.",
            expectedMasked:   "John went to his office. He worked late.",
            expectedRestored: "Sarah went to her office. She worked late.",
        },
        {
            name:             "reflexive pronouns",
            input:            "John introduced himself to the team.",
            expectedMasked:   "Mary introduced herself to the team.",
            expectedRestored: "John introduced himself to the team.",
        },
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            // Test implementation
        })
    }
}
```

---

## Integration Points

### 1. Model Inference
The model already outputs co-reference predictions. We need to:
- Extract `coref_logits` from model output (already done in Python)
- Pass cluster information to Go backend via detector interface
- Map tokens to cluster IDs

### 2. PII Masking
When masking PII:
1. Detect PII entities and their cluster IDs
2. Identify pronouns in the same cluster
3. Determine gender change (male→female, female→male, etc.)
4. Replace pronouns with appropriate forms

### 3. PII Restoration
When restoring original PII:
1. Restore masked entities (existing functionality)
2. Reverse pronoun substitutions using stored gender mappings
3. Ensure pronouns match original text
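
Because the LLM response is new text, restoration cannot reuse the request's token offsets. A minimal sketch of step 2, assuming a single gendered entity in the response, is a whole-word, case-preserving swap driven by a caller-supplied table. `swapPronouns` and `wordPattern` are hypothetical names. With several entities of different genders, the response would need its own co-reference pass to decide which pronouns to change.

```go
package main

import (
	"regexp"
	"strings"
)

// wordPattern matches runs of ASCII letters, which is enough for
// English pronouns.
var wordPattern = regexp.MustCompile(`[A-Za-z]+`)

// swapPronouns replaces every whole word found in table (keys in lowercase)
// with its mapped form, keeping a leading capital letter. Words that only
// contain a pronoun, such as "Sheila", are left alone.
func swapPronouns(text string, table map[string]string) string {
	return wordPattern.ReplaceAllStringFunc(text, func(word string) string {
		mapped, ok := table[strings.ToLower(word)]
		if !ok || mapped == "" {
			return word
		}
		if word[0] >= 'A' && word[0] <= 'Z' {
			return strings.ToUpper(mapped[:1]) + mapped[1:]
		}
		return mapped
	})
}
```

Calling it with the table `{"she": "he", "herself": "himself"}` turns "Tom Miller is an engineer. She graduated..." into "Tom Miller is an engineer. He graduated...". The table deliberately omits "her", which needs the disambiguation described under Technical Challenges.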

---

## Example Scenarios

### Scenario 1: Biography Request
```
Input:     "His name is Tom Miller. Write about him."
Masked:    "Her name is Sarah Smith. Write about her."
LLM Out:   "Sarah Smith is an engineer. She graduated..."
Restored:  "Tom Miller is an engineer. He graduated..."
```

### Scenario 2: Multiple Entities
```
Input:     "Tom met Sarah. He thanked her for the help."
Masked:    "Lisa met John. She thanked him for the help."
LLM Out:   "Lisa met John. She thanked him warmly."
Restored:  "Tom met Sarah. He thanked her warmly."
```

### Scenario 3: Reflexive Pronouns
```
Input:     "Tom introduced himself to the CEO."
Masked:    "Emma introduced herself to the CEO."
LLM Out:   "Emma introduced herself professionally."
Restored:  "Tom introduced himself professionally."
```

---

## Success Criteria

- [x] `PronounMapper` module created with gender mapping tables
- [x] Pronoun mapping supports:
  - [ ] Subject pronouns (he/she/they)
  - [ ] Object pronouns (him/her/them)
  - [ ] Possessive pronouns (his/her/their)
  - [ ] Reflexive pronouns (himself/herself/themselves)
- [x] Detector output includes co-reference cluster information
- [x] Model detector extracts and passes cluster IDs
- [x] Masking service handles pronoun substitution
- [x] Restoration service reverses pronoun changes
- [x] Configuration option to enable/disable feature
- [x] Comprehensive test coverage (10+ test cases)
- [ ] Gender detection works for common names
- [ ] Capitalization preserved in pronoun substitution
- [x] Integration with existing proxy flow
- [x] Documentation updated

---

## Technical Challenges

### 1. Gender Detection
- **Challenge**: Determining gender from names, for both the original name and its replacement
- **Solution**: Use name-based heuristics + fallback to neutral pronouns

### 2. Pronoun Ambiguity
- **Challenge**: Words like "her" can be possessive or object
- **Solution**: Context-aware mapping based on surrounding words
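
To make "context-aware" concrete, here is a deliberately crude sketch that looks only at the token following "her". If the text ends there, or the next token is punctuation or a common function word, "her" is read as an object ("thanked her for"); otherwise it is read as possessive ("her office"). The stop list and the `mapHerToMale` name are assumptions for illustration. The heuristic is known to fail on double-object sentences such as "gave her flowers", so a part-of-speech tagger would be a sturdier choice.

```go
package main

import "strings"

// functionWords is a small, hypothetical stop list of words that commonly
// follow an object pronoun rather than start a noun phrase.
var functionWords = map[string]bool{
	"for": true, "to": true, "and": true, "or": true, "but": true,
	"with": true, "in": true, "on": true, "at": true, "from": true,
	"about": true, "by": true, "that": true, "when": true, "because": true,
}

// mapHerToMale returns "his" when "her" looks possessive and "him" otherwise.
// next is the token that follows "her" in the text ("" at the end of input).
func mapHerToMale(next string) string {
	// Strip surrounding punctuation so "." or "," count as "no next word".
	word := strings.ToLower(strings.Trim(next, ".,;:!?\"'()"))
	if word == "" || functionWords[word] {
		return "him"
	}
	return "his"
}
```

On the issue's own examples this gives "his" for "went to her office" and "him" for both "thanked her for the help" and "Write about her."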

### 3. Multiple Entities
- **Challenge**: Handling multiple entities with different genders
- **Solution**: Track each cluster separately with independent gender mappings

### 4. Cross-sentence References
- **Challenge**: Pronouns may refer to entities in previous sentences
- **Solution**: Use co-reference clusters that span entire text

---

## Future Enhancements

1. **Advanced Gender Detection**: Use external name-gender databases
2. **Neutral Pronoun Support**: Better handling of they/them pronouns
3. **Language Support**: Extend to other languages beyond English
4. **LLM-based Detection**: Use LLM to determine appropriate pronouns
5. **User Preferences**: Allow users to specify gender preferences

---

## References

- [Co-reference Resolution](https://en.wikipedia.org/wiki/Coreference)
- [English Pronouns](https://www.grammarly.com/blog/pronouns/)
- Model implementation: `model/src/eval_model.py`
- Co-reference detection: `model/src/eval_model_detailed.py:298-308`

---

## Notes

This feature improves the quality of PII-protected LLM interactions by keeping pronouns grammatically consistent with the masked and restored names. The co-reference detection model is already trained and functional; what remains is to use its output in the masking/restoration pipeline.

**Complexity**: Medium  
**Impact**: High (better user experience, more natural responses)  
**Dependencies**: Requires model co-reference output (already available)


