14 changes: 13 additions & 1 deletion .claude/settings.local.json
@@ -33,7 +33,19 @@
 "Bash(npm run build:*)",
 "Bash(curl:*)",
 "mcp__sequential-thinking__sequentialthinking",
-"Bash(gh repo set-default:*)"
+"Bash(gh repo set-default:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(pkill:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(cd:*)",
+"Bash(git fetch:*)",
+"Bash(git merge:*)"
 ]
 },
 "enableAllProjectMcpServers": true,
365 changes: 365 additions & 0 deletions SPEC_PRP/PRPs/semantic-link-analysis-redesign.md
@@ -0,0 +1,365 @@
# PRP: Semantic Link Analysis System Redesign

## Executive Summary
Transform the current analysis system to enable bi-directional semantic link discovery between blog posts using Claude AI. The system will analyze individual posts on-demand, finding both incoming link opportunities (from other posts) and outgoing link opportunities (to other posts), with queue-based batch processing, preview capabilities, and CSV export functionality.

## Current State Assessment

### State Documentation

```yaml
current_state:
  files:
    - src/app/api/analysis/start/route.ts     # Batch analysis for all posts
    - src/app/api/analysis/status/route.ts    # Global analysis status
    - src/lib/job-processor.ts                # Parallel job processing
    - src/lib/semantic-analyzer.ts            # Claude API integration
    - src/app/components/LinkReviewPanel.tsx  # Link review UI

  behavior:
    - Analyzes all posts in database at once
    - No single-post analysis capability
    - Basic link review interface
    - Limited control over analysis process
    - No preview of link context

  issues:
    - Cannot analyze individual posts on demand
    - No bi-directional link discovery
    - Missing context preview for suggested links
    - No queue visibility during analysis
    - Limited to analyzing all posts at once

desired_state:
  files:
    - src/app/api/analysis/analyze-post/route.ts  # Single post analysis
    - src/app/api/analysis/queue/route.ts         # Queue management
    - src/lib/semantic-link-analyzer.ts           # Enhanced Claude integration
    - src/app/components/AnalysisTab.tsx          # New analysis tab
    - src/app/components/LinkPreview.tsx          # Link context preview

  behavior:
    - Analyze individual posts on-demand
    - Bi-directional link discovery (incoming & outgoing)
    - Queue-based batch processing with visibility
    - Preview link placement in context
    - Confidence scores and reasoning display
    - Export approved links as HTML in CSV

  benefits:
    - Targeted analysis of specific posts
    - Better link quality with context preview
    - Transparent queue progress
    - Efficient batch processing
    - Production-ready export format
```

## Hierarchical Objectives

### 1. High-Level: Enable On-Demand Bi-Directional Link Analysis
Transform the analysis system to support individual post analysis with both incoming and outgoing link discovery, queue management, and enhanced review capabilities.

### 2. Mid-Level Milestones

#### 2.1 Create Single-Post Analysis Infrastructure
- Add analyze button to individual posts
- Implement bi-directional link discovery
- Queue management with batch processing

#### 2.2 Enhance Link Review Experience
- Add Analysis tab to blog viewer
- Show link previews with context
- Display confidence scores and reasoning

#### 2.3 Implement Export Functionality
- Generate HTML links with proper attributes
- Export in original CSV format
- Include only approved links

### 3. Low-Level Tasks

## Task Specifications

### Task 1: Create Single Post Analysis Endpoint
```yaml
task_name: create_single_post_analysis_endpoint
action: CREATE
file: src/app/api/analysis/analyze-post/route.ts
changes: |
  - Create POST endpoint accepting postId
  - Queue analysis jobs for bi-directional links
  - Find places in OTHER posts to link TO this post
  - Find places in THIS post to link to OTHER posts
  - Return job IDs for tracking
validation:
  - command: "curl -X POST http://localhost:3000/api/analysis/analyze-post -d '{\"postId\": 1}'"
  - expect: "Returns job queue information"
```
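
The bi-directional queueing described in Task 1 could be planned roughly as follows. This is a minimal sketch, assuming one job per direction and per counterpart post; `planBidirectionalJobs`, `PlannedJob`, and the direction labels are hypothetical names, not part of the existing codebase.

```typescript
// Hypothetical job shape for illustration; the real AnalysisJob model may differ.
type Direction = "incoming" | "outgoing";

interface PlannedJob {
  direction: Direction;
  sourcePostId: number; // post whose text would receive the link
  targetPostId: number; // post the link would point to
}

// Build the list of jobs to enqueue for one post.
// "incoming": OTHER posts linking TO this post.
// "outgoing": THIS post linking to OTHER posts.
function planBidirectionalJobs(postId: number, allPostIds: number[]): PlannedJob[] {
  const others = allPostIds.filter((id) => id !== postId);
  const incoming = others.map(
    (id): PlannedJob => ({ direction: "incoming", sourcePostId: id, targetPostId: postId })
  );
  const outgoing = others.map(
    (id): PlannedJob => ({ direction: "outgoing", sourcePostId: postId, targetPostId: id })
  );
  return [...incoming, ...outgoing];
}
```

Under this assumption, a blog with N posts yields 2 × (N − 1) jobs per analyzed post, which is what the queue and rate-limiting sections have to absorb.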

### Task 2: Implement Enhanced Semantic Analyzer
```yaml
task_name: create_enhanced_semantic_analyzer
action: CREATE
file: src/lib/semantic-link-analyzer.ts
changes: |
  - Use Claude SDK (@anthropic-ai/sdk)
  - Implement bi-directional analysis logic
  - Extract link text (max 4 words)
  - Calculate confidence scores
  - Generate reasoning for each suggestion
  - Handle rate limiting (max 20 concurrent)
validation:
  - command: "npm run type-check"
  - expect: "No type errors"
```
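
For the link text extraction step, one possible approach is to enumerate every candidate phrase of 2 to 4 consecutive words and let the analyzer score them. The sketch below assumes plain whitespace tokenization of already-stripped text; `extractPhrases` is a hypothetical helper name.

```typescript
// Enumerate candidate link phrases of minWords..maxWords consecutive words.
// Assumes the input is plain text (HTML already stripped).
function extractPhrases(text: string, minWords = 2, maxWords = 4): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const phrases: string[] = [];
  for (let start = 0; start < words.length; start++) {
    for (let size = minWords; size <= maxWords && start + size <= words.length; size++) {
      phrases.push(words.slice(start, start + size).join(" "));
    }
  }
  return phrases;
}
```

The number of candidates grows linearly with post length (at most three per starting word), so a real implementation would likely pre-filter phrases before sending anything to Claude.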

### Task 3: Create Analysis Queue Management
```yaml
task_name: create_queue_management_api
action: CREATE
file: src/app/api/analysis/queue/route.ts
changes: |
  - GET endpoint for queue status
  - Show queued, processing, completed jobs
  - Real-time progress updates
  - Support batch processing
validation:
  - command: "curl http://localhost:3000/api/analysis/queue"
  - expect: "Returns queue status with job details"
```
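
The status payload returned by the queue endpoint might be derived from the job list along these lines. This is a sketch only: `QueueJob`, `QueueSummary`, and the extra `failed` status are assumptions, since the spec names only queued, processing, and completed jobs.

```typescript
// Hypothetical status values; "failed" is an assumption added for retried jobs.
type JobStatus = "queued" | "processing" | "completed" | "failed";

interface QueueJob {
  id: string;
  status: JobStatus;
}

interface QueueSummary {
  total: number;
  counts: Record<JobStatus, number>;
  percentDone: number; // share of jobs that are finished (completed or failed)
}

function summarizeQueue(jobs: QueueJob[]): QueueSummary {
  const counts: Record<JobStatus, number> = { queued: 0, processing: 0, completed: 0, failed: 0 };
  for (const job of jobs) {
    counts[job.status] += 1;
  }
  const finished = counts.completed + counts.failed;
  const percentDone = jobs.length === 0 ? 0 : Math.round((finished / jobs.length) * 100);
  return { total: jobs.length, counts, percentDone };
}
```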

### Task 4: Add Analysis Tab Component
```yaml
task_name: create_analysis_tab_component
action: CREATE
file: src/app/components/AnalysisTab.tsx
changes: |
  - Display incoming link suggestions
  - Display outgoing link suggestions
  - Show confidence scores and reasoning
  - Approve/reject functionality
  - Group by source/target post
validation:
  - command: "npm run dev"
  - expect: "Analysis tab displays link suggestions"
```

### Task 5: Implement Link Preview Component
```yaml
task_name: create_link_preview_component
action: CREATE
file: src/app/components/LinkPreview.tsx
changes: |
  - Show surrounding context (before/after)
  - Highlight proposed link text
  - Display as it would appear with HTML
  - Show target post title on hover
validation:
  - command: "npm run dev"
  - expect: "Link previews show context"
```
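
The preview markup itself can be illustrated without React. The sketch below assumes the context strings are plain text and builds an HTML string with the proposed link highlighted and the target title in the `title` attribute (shown on hover); `renderLinkPreview`, `PreviewInput`, and `targetUrl` are hypothetical names, and the real component would presumably render JSX instead.

```typescript
interface PreviewInput {
  contextBefore: string;
  linkText: string;
  contextAfter: string;
  targetTitle: string; // shown on hover
  targetUrl: string;   // hypothetical field; the spec only names the target post
}

// Escape text so it can be placed safely inside HTML content or attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Surrounding context with the proposed link wrapped in <mark> for highlighting.
function renderLinkPreview(p: PreviewInput): string {
  const anchor =
    `<a href="${escapeHtml(p.targetUrl)}" title="${escapeHtml(p.targetTitle)}">` +
    `${escapeHtml(p.linkText)}</a>`;
  return `${escapeHtml(p.contextBefore)}<mark>${anchor}</mark>${escapeHtml(p.contextAfter)}`;
}
```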

### Task 6: Update EnhancedBlogViewer
```yaml
task_name: update_blog_viewer_for_analysis
action: MODIFY
file: src/app/components/EnhancedBlogViewer.tsx
changes: |
  - Add "Analyze" button to header
  - Add "Analysis" tab to tab list
  - Integrate AnalysisTab component
  - Handle analysis state and progress
validation:
  - command: "npm run type-check"
  - expect: "No type errors"
```

### Task 7: Enhance Export with HTML Links
```yaml
task_name: enhance_csv_export
action: MODIFY
file: src/app/api/export/csv/route.ts
changes: |
  - Apply approved links as HTML <a> tags
  - Include proper rel attributes
  - Maintain original CSV format
  - Only include approved links
validation:
  - command: "npm run test-export"
  - expect: "CSV contains HTML links"
```

### Task 8: Update Claude Prompt Strategy
```yaml
task_name: update_analysis_prompt
action: CREATE
file: src/lib/prompts/semantic-link-prompt.ts
changes: |
  - Adapt prompt from example project
  - Focus on 2-4 word link text
  - Emphasize high-quality connections
  - Return structured JSON response
  - 70% confidence threshold
validation:
  - command: "npm run type-check"
  - expect: "Valid prompt structure"
```

### Task 9: Implement Queue Progress UI
```yaml
task_name: create_queue_progress_component
action: CREATE
file: src/app/components/QueueProgress.tsx
changes: |
  - Real-time queue status display
  - Show processing progress
  - List active analysis jobs
  - Cancel functionality
validation:
  - command: "npm run dev"
  - expect: "Queue progress displays correctly"
```

### Task 10: Add Database Schema Updates
```yaml
task_name: update_database_schema
action: MODIFY
file: prisma/schema.prisma
changes: |
  - Add analysisType to AnalysisJob
  - Add queuePosition field
  - Add batchId for grouping
  - Run migration
validation:
  - command: "npx prisma migrate dev"
  - expect: "Migration successful"
```

## Implementation Strategy

### Dependencies Order
1. Database schema updates first (Task 10)
2. Core analysis infrastructure (Tasks 1, 2, 3)
3. UI components (Tasks 4, 5, 6, 9)
4. Prompt optimization (Task 8)
5. Export enhancement (Task 7)

### Claude API Integration
```typescript
// Example structure for semantic analysis
interface AnalysisResult {
  sourcePostId: number;
  targetPostId: number;
  linkText: string;      // 2-4 words
  linkPosition: number;
  confidence: number;    // 0-100
  reasoning: string;
  contextBefore: string;
  contextAfter: string;
}
```

### Rate Limiting Strategy
- Maximum 20 concurrent Claude API calls
- Queue-based processing with backpressure
- Retry failed analyses up to 3 times
- Exponential backoff for rate limits
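
The strategy above could be sketched as a small worker pool with retries. This is an illustration under assumptions: `runWithLimit` and `withRetry` are hypothetical helpers, the base delay is an arbitrary choice, and for simplicity the backoff applies to every failure rather than only to rate-limit errors.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a task up to maxRetries times, doubling the delay after each failure.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500 // assumed value, not taken from the spec
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= maxRetries) throw error;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}

// Run tasks with at most `limit` in flight; results keep the input order.
async function runWithLimit<T>(tasks: Array<() => Promise<T>>, limit = 20): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    // Each worker pulls the next unclaimed task until none remain (backpressure).
    while (next < tasks.length) {
      const index = next++;
      results[index] = await withRetry(tasks[index]);
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

Because workers only claim a new task after finishing the previous one, the pool never exceeds the limit even when individual analyses are retried.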

## Risk Assessment

### Identified Risks
1. **API Costs**: Multiple analyses per post
   - Mitigation: Efficient prompts, caching results

2. **Performance**: Large-scale analysis load
   - Mitigation: Queue management, batch processing

3. **Link Quality**: False positives
   - Mitigation: 70% confidence threshold, preview

4. **User Experience**: Complex approval process
   - Mitigation: Intuitive UI with bulk actions

## User Interaction Points

### Analysis Workflow
1. User clicks "Analyze" on a blog post
2. System queues bi-directional analysis
3. Progress shown in real-time
4. Results appear in Analysis tab
5. User reviews with preview capability
6. Approve/reject individual links
7. Export includes approved links

### Bulk Operations
- Approve all high-confidence links (85%+)
- Filter by confidence level
- Sort by relevance score
- Group by source/target post
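
These bulk operations reduce to a few pure functions over the suggestion list. The sketch below is illustrative: the `Suggestion` shape and function names are hypothetical, and it treats the confidence score as the relevance score used for sorting.

```typescript
// Hypothetical suggestion shape for illustration.
interface Suggestion {
  id: number;
  sourcePostId: number;
  targetPostId: number;
  confidence: number; // 0-100
  status: "pending" | "approved" | "rejected";
}

// Approve every pending suggestion at or above the threshold (85% by default).
function approveHighConfidence(suggestions: Suggestion[], threshold = 85): Suggestion[] {
  return suggestions.map(
    (s): Suggestion =>
      s.status === "pending" && s.confidence >= threshold ? { ...s, status: "approved" } : s
  );
}

// Highest confidence first, without mutating the input.
function sortByConfidence(suggestions: Suggestion[]): Suggestion[] {
  return [...suggestions].sort((a, b) => b.confidence - a.confidence);
}

// Group suggestions by the post they point to.
function groupByTarget(suggestions: Suggestion[]): Map<number, Suggestion[]> {
  const groups = new Map<number, Suggestion[]>();
  for (const s of suggestions) {
    const group = groups.get(s.targetPostId) ?? [];
    group.push(s);
    groups.set(s.targetPostId, group);
  }
  return groups;
}
```

Already rejected suggestions are left untouched by the bulk approval, so a manual decision is never overridden.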

## Success Criteria
- [ ] Individual posts can be analyzed on-demand
- [ ] Bi-directional links discovered accurately
- [ ] Queue progress visible in real-time
- [ ] Link previews show context clearly
- [ ] Confidence scores guide decisions
- [ ] Export produces valid HTML links
- [ ] Rate limiting prevents API overuse

## Technical Specifications

### Prompt Structure (Adapted)
```typescript
const ANALYSIS_PROMPT = `
You are a conservative semantic link analyst...

CRITICAL: Only suggest a link if ALL criteria are met:
1. STRONG SEMANTIC RELEVANCE
2. USER VALUE at specific point
3. NATURAL CONTEXT flow
4. SPECIFIC CONNECTION
5. CLEAR USER INTENT
6. NO EXISTING LINK
7. LINK TEXT LENGTH: 2-4 words only

Respond with JSON:
{
  "shouldLink": boolean,
  "linkText": "2-4 word phrase",
  "confidence": 0-100,
  "reasoning": "explanation"
}
`;
```
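
Claude's reply still has to be validated before it becomes a suggestion. A minimal sketch, assuming the model returns the JSON object described in the prompt as a bare string; `parseLinkDecision` and `MIN_CONFIDENCE` are hypothetical names, and the function returns `null` both for malformed replies and for replies that fail the 70% threshold or the 2-4 word rule.

```typescript
interface LinkDecision {
  shouldLink: boolean;
  linkText: string;
  confidence: number; // 0-100
  reasoning: string;
}

const MIN_CONFIDENCE = 70; // threshold from the prompt strategy

function parseLinkDecision(raw: string): LinkDecision | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const { shouldLink, linkText, confidence, reasoning } = data as Record<string, unknown>;
  if (
    typeof shouldLink !== "boolean" ||
    typeof linkText !== "string" ||
    typeof confidence !== "number" ||
    typeof reasoning !== "string"
  ) {
    return null; // wrong shape
  }
  const wordCount = linkText.trim().split(/\s+/).filter((w) => w.length > 0).length;
  const accepted =
    shouldLink && confidence >= MIN_CONFIDENCE && wordCount >= 2 && wordCount <= 4;
  return accepted ? { shouldLink, linkText, confidence, reasoning } : null;
}
```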

### Export Format
```csv
Content,"<p>Learn about <a href=""/blog/target-slug"" rel=""related"">service management</a> best practices...</p>"
```
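
The row above can be reproduced with two small helpers. This is a sketch under assumptions: `applyLink` and `csvField` are hypothetical names, the link is placed at the first occurrence of the phrase rather than at a stored `linkPosition`, and the `/blog/` URL prefix and `rel="related"` are taken from the example row.

```typescript
interface ApprovedLink {
  linkText: string;
  targetSlug: string;
}

// Wrap the first occurrence of the link text in an <a> tag.
// A real implementation would likely use the stored link position
// and skip text that already sits inside a tag or an existing link.
function applyLink(html: string, link: ApprovedLink): string {
  const index = html.indexOf(link.linkText);
  if (index === -1) return html;
  const anchor = `<a href="/blog/${link.targetSlug}" rel="related">${link.linkText}</a>`;
  return html.slice(0, index) + anchor + html.slice(index + link.linkText.length);
}

// Quote a CSV field, doubling embedded double quotes.
function csvField(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}
```

Doubling the quotes is what turns `href="/blog/target-slug"` into `href=""/blog/target-slug""` inside the quoted CSV field.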

## Notes
- Leverage existing Prisma models and infrastructure
- Maintain compatibility with current link review system
- Focus on quality over quantity in suggestions
- Ensure the single-link-per-target-post rule is enforced
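
One reading of the single-link rule is that each source post may link to a given target post at most once. Under that assumption, enforcement could keep only the highest-confidence candidate per source/target pair; `keepBestPerTarget` and `Candidate` are hypothetical names.

```typescript
interface Candidate {
  sourcePostId: number;
  targetPostId: number;
  linkText: string;
  confidence: number; // 0-100
}

// Keep one candidate per (source, target) pair, preferring higher confidence.
function keepBestPerTarget(candidates: Candidate[]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const candidate of candidates) {
    const key = `${candidate.sourcePostId}->${candidate.targetPostId}`;
    const current = best.get(key);
    if (!current || candidate.confidence > current.confidence) {
      best.set(key, candidate);
    }
  }
  return Array.from(best.values());
}
```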

## Implementation Updates (Not in Original PRP)

### Additional Changes Made During Implementation

1. **Prisma Client Regeneration**: After database schema changes, run `npx prisma generate` and restart the development server for the changes to take effect.

2. **Claude API Integration**:
   - Attempted to use the `@anthropic-ai/claude-code` package for authentication-free API access
   - The package's `query` function returns an async generator of `SDKMessage` objects
   - The `SDKMessage` structure differs from the standard Anthropic SDK response format
   - Message content extraction requires checking multiple properties (`text`, `content`) with type guards

3. **Actual Implementation Differences**:
   - Created `semantic-link-analyzer.ts` as a new file rather than modifying the existing `semantic-analyzer.ts`
   - Implemented phrase extraction to analyze multiple 2-4 word segments per post
   - Added proper type handling for the different `SDKMessage` types from the claude-code package

4. **Authentication Consideration**:
   - The `@anthropic-ai/claude-code` package is designed for CLI environments
   - Web applications may need an alternative authentication approach or environment setup
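
The type-guarded content extraction mentioned in item 2 might look something like the sketch below. It deliberately takes `unknown` instead of the package's `SDKMessage` type, because the exact message shape is not documented here; the only assumption is that text may live in a `text` property or in a `content` property that is a string or an array of nested blocks. `extractText` is a hypothetical helper.

```typescript
// Pull text out of a message-like value by checking `text` first, then `content`.
// The shapes handled here are assumptions, not the documented SDKMessage type.
function extractText(message: unknown): string {
  if (typeof message === "string") return message;
  if (typeof message !== "object" || message === null) return "";
  const { text, content } = message as Record<string, unknown>;
  if (typeof text === "string") return text;
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    // Nested blocks: extract each one recursively and concatenate.
    return content.map((block) => extractText(block)).join("");
  }
  return "";
}
```

Messages with neither property yield an empty string, so the caller can concatenate results from the async generator without special-casing non-text messages.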