Integrate Co-reference Detection for Pronoun Substitution in PII Masking
Background
The Yaak proxy currently detects and masks PII entities (names, emails, etc.) to protect user privacy when sending requests to external LLMs. However, it doesn't handle pronouns that refer to these masked entities, leading to gender/reference mismatches in responses.
The Problem
When a male name like "Tom Miller" is replaced with a female name like "Sarah Smith", associated pronouns ("he", "him", "his") remain unchanged, causing grammatical inconsistencies and confusion:
Current Behavior (Broken):
User Input: "Hi, his name is Tom Miller. Write a short biography about him."
↓ PII Detection: FIRSTNAME: Tom, SURNAME: Miller
Masked Request: "Hi, his name is Sarah Smith. Write a short biography about him."
(pronouns: "his", "him")
LLM Response: "Sarah Smith is a software engineer. She is a co-founder..."
(pronoun: "She")
Unmasked Response: "Tom Miller is a software engineer. She is a co-founder..."
(pronoun: "She")
❌ Gender mismatch!
Ideal Behavior (Fixed):
User Input: "Hi, his name is Tom Miller. Write a short biography about him."
↓ PII Detection + Co-reference: Tom Miller (cluster 1) ← "his", "him" also in cluster 1
Masked Request: "Hi, her name is Sarah Smith. Write a short biography about her."
(pronouns: "her", "her")
LLM Response: "Sarah Smith is a software engineer. She is a co-founder..."
(pronoun: "She")
Unmasked Response: "Tom Miller is a software engineer. He is a co-founder..."
(pronoun: "He")
✅ Consistent pronouns!
What is Co-reference Detection?
Co-reference detection identifies and links mentions of the same entity across a text:
- "Tom Miller", "he", "him", "his" all refer to the same person
- "Sarah Smith", "she", "her" all refer to the same person
The Yaak model already performs co-reference detection and outputs cluster IDs for each token. We just need to use this information to handle pronoun substitution.
Current Model Capabilities
The multi-task PII detection model outputs co-reference predictions:
# From eval_model_detailed.py:298-300
coref_logits = outputs["coref_logits"][0] # [seq_len, num_coref_labels]
coref_predictions = torch.argmax(coref_logits, dim=-1) # [seq_len]
coref_pred_ids = [p.item() for p in coref_predictions] # Cluster IDs per token
Example output:
Text:    "Tom Miller went to his car. He drove home."
Tokens:  Tom  Miller  went  to  his  car  .  He  drove  home  .
Cluster: 1    1       0     0   1    0    0  1   0      0     0
(cluster 1 = same entity; cluster 0 = no cluster)
The model correctly identifies that "Tom Miller", "his", and "He" all refer to the same entity (cluster 1).
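To make the per-token output concrete, here is a minimal Go sketch that groups tokens by cluster ID using the example above. The helper name `groupByCluster` is hypothetical and not part of the Yaak codebase; it only assumes that cluster 0 means "no cluster".

```go
package main

import "fmt"

// groupByCluster collects the tokens that share a non-zero cluster ID.
// Hypothetical helper for illustration; cluster 0 is treated as "no cluster".
func groupByCluster(tokens []string, clusterIDs []int) map[int][]string {
	clusters := make(map[int][]string)
	for i, tok := range tokens {
		if i >= len(clusterIDs) {
			break // guard against mismatched slice lengths
		}
		if id := clusterIDs[i]; id > 0 {
			clusters[id] = append(clusters[id], tok)
		}
	}
	return clusters
}

func main() {
	tokens := []string{"Tom", "Miller", "went", "to", "his", "car", ".", "He", "drove", "home", "."}
	ids := []int{1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0}
	fmt.Println(groupByCluster(tokens, ids)[1]) // [Tom Miller his He]
}
```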
Implementation Plan
Phase 1: Add Pronoun Mapping Module
File: src/backend/pii/pronoun_mapper.go (new)
Create a pronoun mapping service that handles gender-aware pronoun substitution:
package pii

import (
	"strings"
)

// PronounGender represents grammatical gender for pronouns
type PronounGender int

const (
	GenderUnknown PronounGender = iota
	GenderMale
	GenderFemale
	GenderNeutral
)

// PronounMapper handles pronoun substitution based on gender
type PronounMapper struct {
	pronounMap map[string]map[PronounGender]string
}

// NewPronounMapper creates a new pronoun mapper
func NewPronounMapper() *PronounMapper {
	return &PronounMapper{
		pronounMap: initPronounMap(),
	}
}

// initPronounMap initializes the pronoun mapping table
func initPronounMap() map[string]map[PronounGender]string {
	return map[string]map[PronounGender]string{
		// Subject pronouns
		"he": {
			GenderMale:    "he",
			GenderFemale:  "she",
			GenderNeutral: "they",
		},
		"she": {
			GenderMale:    "he",
			GenderFemale:  "she",
			GenderNeutral: "they",
		},
		// Object pronouns
		"him": {
			GenderMale:    "him",
			GenderFemale:  "her",
			GenderNeutral: "them",
		},
		// NOTE: "her" is also the female possessive ("her car" → "his car").
		// This table treats it as an object pronoun only; see
		// "Pronoun Ambiguity" under Technical Challenges.
		"her": {
			GenderMale:    "him",
			GenderFemale:  "her",
			GenderNeutral: "them",
		},
		// Possessive pronouns
		"his": {
			GenderMale:    "his",
			GenderFemale:  "her",
			GenderNeutral: "their",
		},
		// Reflexive pronouns
		"himself": {
			GenderMale:    "himself",
			GenderFemale:  "herself",
			GenderNeutral: "themselves",
		},
		"herself": {
			GenderMale:    "himself",
			GenderFemale:  "herself",
			GenderNeutral: "themselves",
		},
	}
}

// MapPronoun converts a pronoun from one gender to another.
// fromGender is currently unused: the lookup depends only on the pronoun
// itself and on toGender.
func (pm *PronounMapper) MapPronoun(pronoun string, fromGender, toGender PronounGender) string {
	lowerPronoun := strings.ToLower(pronoun)

	// Check if we have a mapping for this pronoun
	if genderMap, exists := pm.pronounMap[lowerPronoun]; exists {
		if mapped, ok := genderMap[toGender]; ok {
			// Preserve original capitalization
			if isCapitalized(pronoun) {
				return capitalize(mapped)
			}
			return mapped
		}
	}

	// If no mapping found (including toGender == GenderUnknown), return original
	return pronoun
}
// DetectGenderFromName attempts to detect gender from a first name.
// It matches the first whitespace-separated token exactly; a substring
// match would misclassify names (e.g. "Emmanuel" contains "emma").
func (pm *PronounMapper) DetectGenderFromName(name string) PronounGender {
	// Common male names (small illustrative list)
	maleNames := map[string]bool{
		"tom": true, "john": true, "james": true,
		"michael": true, "david": true, "robert": true,
	}
	// Common female names (small illustrative list)
	femaleNames := map[string]bool{
		"sarah": true, "emma": true, "lisa": true,
		"jennifer": true, "mary": true, "patricia": true,
	}

	fields := strings.Fields(strings.ToLower(name))
	if len(fields) == 0 {
		return GenderUnknown
	}
	first := fields[0]
	if maleNames[first] {
		return GenderMale
	}
	if femaleNames[first] {
		return GenderFemale
	}
	return GenderUnknown
}

// Helper functions (ASCII only, which is sufficient for English pronouns)

func isCapitalized(s string) bool {
	if len(s) == 0 {
		return false
	}
	return s[0] >= 'A' && s[0] <= 'Z'
}

func capitalize(s string) string {
	if len(s) == 0 {
		return s
	}
	return strings.ToUpper(string(s[0])) + s[1:]
}
Phase 2: Extend Detector Output with Co-reference Information
File: src/backend/pii/detectors/types.go
Add co-reference cluster information to detector output:
// Entity represents a detected PII entity
type Entity struct {
	Text      string
	Label     string
	StartPos  int
	EndPos    int
	ClusterID int // NEW: Co-reference cluster ID (0 = no cluster)
}

// DetectorOutput represents the result of PII detection
type DetectorOutput struct {
	Entities        []Entity
	CorefClusters   map[int][]EntityMention // NEW: Cluster ID → mentions
	InferenceTimeMs float64
}

// EntityMention represents a single mention in a co-reference cluster
type EntityMention struct {
	Text     string
	StartPos int
	EndPos   int
	IsEntity bool // true if this is a PII entity, false if pronoun
}
Phase 3: Update Model Detector to Extract Co-references
File: src/backend/pii/detectors/model_detector.go
Modify the model detector to extract co-reference information from model output:
// In the Detect method, after getting PII predictions:

// Extract co-reference clusters
corefClusters := make(map[int][]EntityMention)
for i, token := range tokens {
	clusterID := corefPredictions[i]
	if clusterID > 0 { // Skip cluster 0 (no cluster)
		mention := EntityMention{
			Text:     token,
			StartPos: tokenOffsets[i].Start,
			EndPos:   tokenOffsets[i].End,
			IsEntity: isPIIEntity(token, piiPredictions[i]),
		}
		corefClusters[clusterID] = append(corefClusters[clusterID], mention)
	}
}

// Set cluster IDs on entities
for i := range entities {
	entities[i].ClusterID = findClusterForEntity(entities[i], corefClusters)
}

return DetectorOutput{
	Entities:      entities,
	CorefClusters: corefClusters,
}, nil
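The loop above records one mention per token, so "Tom" and "Miller" end up as two separate mentions. A minimal sketch of merging consecutive tokens of the same cluster into a single span follows. The `token` and `mention` types and the `mergeMentions` helper are hypothetical stand-ins, and the offsets are assumed to be valid byte offsets into the input text.

```go
package main

import "fmt"

// token is a hypothetical stand-in for one model token: its byte offsets
// in the input text and its predicted cluster ID (0 = no cluster).
type token struct {
	Start, End int
	ClusterID  int
}

// mention is a contiguous span of text that belongs to one cluster.
type mention struct {
	Text       string
	Start, End int
	ClusterID  int
}

// mergeMentions joins runs of consecutive tokens that share a non-zero
// cluster ID into one mention. It assumes the offsets are valid for text.
func mergeMentions(text string, tokens []token) []mention {
	var out []mention
	for i, t := range tokens {
		if t.ClusterID == 0 {
			continue
		}
		// The previous token was in the same cluster, so extend its mention.
		if i > 0 && tokens[i-1].ClusterID == t.ClusterID {
			last := &out[len(out)-1]
			last.End = t.End
			last.Text = text[last.Start:t.End]
			continue
		}
		out = append(out, mention{
			Text:      text[t.Start:t.End],
			Start:     t.Start,
			End:       t.End,
			ClusterID: t.ClusterID,
		})
	}
	return out
}

func main() {
	text := "Tom Miller went to his car."
	tokens := []token{
		{0, 3, 1}, {4, 10, 1}, {11, 15, 0}, {16, 18, 0},
		{19, 22, 1}, {23, 26, 0}, {26, 27, 0},
	}
	for _, m := range mergeMentions(text, tokens) {
		fmt.Printf("%q %d-%d\n", m.Text, m.Start, m.End)
	}
	// "Tom Miller" 0-10
	// "his" 19-22
}
```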
Phase 4: Update Masking Service with Pronoun Substitution
File: src/backend/pii/masking_service.go
Extend the masking service to handle pronoun substitution:
// GenderMapping records the gender change applied to one co-reference cluster.
type GenderMapping struct {
	OriginalGender PronounGender
	MaskedGender   PronounGender
}

type MaskingService struct {
	detector      detectors.Detector
	generator     *GeneratorService
	pronounMapper *PronounMapper // NEW
}

func NewMaskingService(detector detectors.Detector, generator *GeneratorService) *MaskingService {
	return &MaskingService{
		detector:      detector,
		generator:     generator,
		pronounMapper: NewPronounMapper(), // NEW
	}
}

func (s *MaskingService) MaskText(text string, logPrefix string) MaskedResult {
	piiFound, err := s.detector.Detect(context.Background(), detectors.DetectorInput{Text: text})
	// ... existing PII detection code ...

	// NEW: Determine the gender change for each cluster
	genderMappings := make(map[int]GenderMapping)
	for clusterID, mentions := range piiFound.CorefClusters {
		genderMappings[clusterID] = GenderMapping{
			OriginalGender: s.detectClusterGender(mentions, entities),
			MaskedGender:   s.detectMaskedGender(mentions, entities, maskedToOriginal),
		}
	}

	// Replace pronouns in clusters that have gender changes
	for clusterID, genderMap := range genderMappings {
		if genderMap.OriginalGender != genderMap.MaskedGender {
			maskedText = s.replaceClusterPronouns(
				maskedText,
				piiFound.CorefClusters[clusterID],
				genderMap.OriginalGender,
				genderMap.MaskedGender,
			)
		}
	}

	return MaskedResult{
		MaskedText:       maskedText,
		MaskedToOriginal: maskedToOriginal,
		Entities:         entities,
		GenderMappings:   genderMappings, // Store for restoration
	}
}

func (s *MaskingService) RestorePII(text string, result MaskedResult) string {
	// Restore PII entities
	restoredText := text
	for maskedText, originalText := range result.MaskedToOriginal {
		restoredText = strings.ReplaceAll(restoredText, maskedText, originalText)
	}

	// NEW: Reverse pronoun substitutions.
	// The cluster offsets were computed on the request, so
	// reverseClusterPronouns has to locate the pronouns in the response text itself.
	for clusterID, genderMap := range result.GenderMappings {
		if genderMap.OriginalGender != genderMap.MaskedGender {
			// Reverse: masked → original gender
			restoredText = s.reverseClusterPronouns(
				restoredText,
				clusterID,
				genderMap.MaskedGender,
				genderMap.OriginalGender,
			)
		}
	}

	return restoredText
}
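One detail the masking step has to get right is offsets: replacing "Tom Miller" with "Sarah Smith" changes the string length, so pronoun offsets measured on the original text no longer line up with the masked text. A possible approach, sketched here under the assumption that all spans are non-overlapping byte offsets into the original text, is to collect name and pronoun replacements together and apply them from right to left. The `replacement` type and `applyReplacements` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// replacement describes one substitution at byte offsets into the original text.
// Hypothetical type for illustration.
type replacement struct {
	Start, End int
	NewText    string
}

// applyReplacements applies non-overlapping replacements from the end of the
// text backwards, so the offsets of earlier spans stay valid.
func applyReplacements(text string, reps []replacement) string {
	sorted := append([]replacement(nil), reps...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start > sorted[j].Start })
	for _, r := range sorted {
		if r.Start < 0 || r.End > len(text) || r.Start > r.End {
			continue // skip spans that do not fit the text
		}
		text = text[:r.Start] + r.NewText + text[r.End:]
	}
	return text
}

func main() {
	input := "Hi, his name is Tom Miller. Write a short biography about him."
	reps := []replacement{
		{4, 7, "her"},           // "his"
		{16, 26, "Sarah Smith"}, // "Tom Miller"
		{58, 61, "her"},         // "him"
	}
	fmt.Println(applyReplacements(input, reps))
	// Hi, her name is Sarah Smith. Write a short biography about her.
}
```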
Phase 5: Add Configuration and Testing
File: src/backend/config/config.go
Add configuration option to enable/disable pronoun substitution:
type Config struct {
	// ... existing fields ...
	EnablePronounSubstitution bool `json:"enable_pronoun_substitution"`
}
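As a small illustration of how this flag behaves when the config is read from JSON, here is a sketch using the standard `encoding/json` package. The `loadConfig` helper is hypothetical, and the sketch assumes the config is JSON (as the struct tag suggests); a missing key leaves the flag at `false`, so the feature would be off unless enabled explicitly.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors only the new field; the existing fields are omitted here.
type Config struct {
	EnablePronounSubstitution bool `json:"enable_pronoun_substitution"`
}

// loadConfig parses a JSON document into Config (hypothetical helper).
func loadConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	cfg, err := loadConfig([]byte(`{"enable_pronoun_substitution": true}`))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(cfg.EnablePronounSubstitution) // true
}
```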
File: src/backend/pii/detectors/model_detector_test.go
Add comprehensive tests:
func TestCorefPronounSubstitution(t *testing.T) {
	tests := []struct {
		name             string
		input            string
		expectedMasked   string
		expectedRestored string
	}{
		{
			name:             "male to female name change",
			input:            "Tom Miller went to his car. He drove home.",
			expectedMasked:   "Sarah Smith went to her car. She drove home.",
			expectedRestored: "Tom Miller went to his car. He drove home.",
		},
		{
			name:             "female to male name change",
			input:            "Sarah went to her office. She worked late.",
			expectedMasked:   "John went to his office. He worked late.",
			expectedRestored: "Sarah went to her office. She worked late.",
		},
		{
			name:             "reflexive pronouns",
			input:            "John introduced himself to the team.",
			expectedMasked:   "Mary introduced herself to the team.",
			expectedRestored: "John introduced himself to the team.",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			// Test implementation
		})
	}
}
Integration Points
1. Model Inference
The model already outputs co-reference predictions. We need to:
- Extract coref_logits from model output (already done in Python)
- Pass cluster information to Go backend via detector interface
- Map tokens to cluster IDs
2. PII Masking
When masking PII:
- Detect PII entities and their cluster IDs
- Identify pronouns in the same cluster
- Determine gender change (male→female, female→male, etc.)
- Replace pronouns with appropriate forms
3. PII Restoration
When restoring original PII:
- Restore masked entities (existing functionality)
- Reverse pronoun substitutions using stored gender mappings
- Ensure pronouns match original text
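A rough sketch of the reversal step on an LLM response, assuming a female→male reversal and using a word-boundary regular expression. It handles only the unambiguous forms ("she", "herself", "hers"); "her" is left out because it can map to either "him" or "his". The names `femaleToMale`, `pronounRe`, and `reversePronouns` are hypothetical, and this simplified version swaps every matching pronoun in the text rather than only those belonging to one cluster.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// femaleToMale lists only unambiguous forms; "her" is deliberately missing.
var femaleToMale = map[string]string{
	"she":     "he",
	"herself": "himself",
	"hers":    "his",
}

// pronounRe matches the listed pronouns as whole words, case-insensitively.
var pronounRe = regexp.MustCompile(`(?i)\b(she|herself|hers)\b`)

// reversePronouns swaps the matched pronouns and keeps a leading capital letter.
func reversePronouns(text string) string {
	return pronounRe.ReplaceAllStringFunc(text, func(m string) string {
		repl := femaleToMale[strings.ToLower(m)]
		if m[0] >= 'A' && m[0] <= 'Z' {
			return strings.ToUpper(repl[:1]) + repl[1:]
		}
		return repl
	})
}

func main() {
	fmt.Println(reversePronouns("Tom Miller is a software engineer. She is a co-founder..."))
	// Tom Miller is a software engineer. He is a co-founder...
}
```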
Example Scenarios
Scenario 1: Biography Request
Input: "His name is Tom Miller. Write about him."
Masked: "Her name is Sarah Smith. Write about her."
LLM Out: "Sarah Smith is an engineer. She graduated..."
Restored: "Tom Miller is an engineer. He graduated..."
Scenario 2: Multiple Entities
Input: "Tom met Sarah. He thanked her for the help."
Masked: "Lisa met John. She thanked him for the help."
LLM Out: "Lisa met John. She thanked him warmly."
Restored: "Tom met Sarah. He thanked her warmly."
Scenario 3: Reflexive Pronouns
Input: "Tom introduced himself to the CEO."
Masked: "Emma introduced herself to the CEO."
LLM Out: "Emma introduced herself professionally."
Restored: "Tom introduced himself professionally."
Success Criteria
- PronounMapper module created with gender mapping tables
Technical Challenges
1. Gender Detection
- Challenge: Determining gender from masked names
- Solution: Use name-based heuristics + fallback to neutral pronouns
2. Pronoun Ambiguity
- Challenge: Words like "her" can be possessive or object
- Solution: Context-aware mapping based on surrounding words
3. Multiple Entities
- Challenge: Handling multiple entities with different genders
- Solution: Track each cluster separately with independent gender mappings
4. Cross-sentence References
- Challenge: Pronouns may refer to entities in previous sentences
- Solution: Use co-reference clusters that span entire text
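For the pronoun ambiguity challenge, one very rough context heuristic is to look at the token that follows "her": if it is a content word, treat "her" as possessive ("her car" → "his car"); if it is punctuation, the end of the text, or a common function word, treat it as an object ("thanked her for" → "thanked him for"). The sketch below is an assumption about how such a rule could look, not a tested design; `functionWords` is a small, incomplete list, and cases such as "saw her yesterday" would be misclassified.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// functionWords is a small, incomplete list of words that tend to follow an
// object pronoun rather than a possessive one (hypothetical, for illustration).
var functionWords = map[string]bool{
	"for": true, "to": true, "and": true, "with": true, "in": true,
	"on": true, "at": true, "about": true, "the": true, "a": true,
	"an": true, "that": true, "because": true, "when": true,
}

// herToMale picks the male form of "her" by looking only at the next token:
// "his" when the next token looks like a content word, "him" otherwise.
func herToMale(next string) string {
	next = strings.ToLower(strings.TrimSpace(next))
	if next == "" {
		return "him" // end of text: object pronoun
	}
	first := []rune(next)[0]
	if !unicode.IsLetter(first) || functionWords[next] {
		return "him" // punctuation or function word follows
	}
	return "his"
}

func main() {
	fmt.Println(herToMale("car")) // his
	fmt.Println(herToMale("for")) // him
	fmt.Println(herToMale("."))   // him
}
```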
Future Enhancements
- Advanced Gender Detection: Use external name-gender databases
- Neutral Pronoun Support: Better handling of they/them pronouns
- Language Support: Extend to other languages beyond English
- LLM-based Detection: Use LLM to determine appropriate pronouns
- User Preferences: Allow users to specify gender preferences
References
- model/src/eval_model.py
- model/src/eval_model_detailed.py:298-308
Notes
This feature improves the quality of PII-protected LLM interactions by keeping pronouns grammatically consistent with the masked and restored names. The co-reference detection model is already trained and functional; we just need to use its output in the masking/restoration pipeline.
Complexity: Medium
Impact: High (better user experience, more natural responses)
Dependencies: Requires model co-reference output (already available)