[Feature] Event Correlation by ID (request_id, trace_id, user_id)

## Feature Description
Enable users to click any identifier (request_id, trace_id, user_id, order_id, etc.) in a log entry and instantly see all related logs across services, time ranges, and log sources. This creates a narrative timeline of events related to a specific transaction, user action, or request.

## Problem/Use Case
**Current problem:**
- Debugging distributed systems requires manually searching for request IDs across multiple log sources
- Users must copy-paste IDs and run multiple searches to piece together what happened
- No visual timeline showing the sequence of events across services
- Difficult to understand causal relationships between log entries
- Time-consuming to debug incidents involving multiple microservices

**Real-world scenario:**
```
1. User reports: "My payment failed at 14:32"
2. DevOps finds error log with request_id: req_abc123
3. Needs to search manually:
   - API gateway logs (initial request)
   - Auth service logs (token validation)
   - Payment service logs (charge attempt)
   - Database logs (transaction records)
   - Queue logs (webhook processing)
4. Must manually correlate timestamps and piece together the story
5. Takes 20-30 minutes to reconstruct what happened
```

**With event correlation:**
```
1. Click request_id: req_abc123
2. See complete timeline:
   14:32:01.234 [API Gateway] Request received
   14:32:01.456 [Auth] Token validated
   14:32:02.123 [Payment] Stripe API called
   14:32:03.789 [Payment] ERROR: Card declined ← root cause
   14:32:04.012 [Database] Transaction rolled back
   14:32:04.156 [Queue] Webhook retry scheduled
3. Problem identified in 30 seconds
```

## Proposed Solution

**Core feature: Auto-detect and link common identifiers**

**Phase 1: ID Detection & Linking**
- Auto-detect common ID patterns in logs:
  - UUID format (8-4-4-4-12)
  - Request IDs (req_*, request-*, etc.)
  - Trace IDs (trace_*, span_*, etc.)
  - User IDs (user_*, uid_*, etc.)
  - Transaction IDs (txn_*, order_*, etc.)
  - Correlation IDs (correlation_*, x-correlation-id)
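
The value-shape detection listed above could look roughly like the following sketch. The names `detectIdType`, `detectIds`, `PREFIX_TYPES` and `FIELD_NAMES` are illustrative assumptions, not an existing Logtide API, and the prefix table only mirrors the examples in the list.

```typescript
// Hypothetical prefix table: maps the ID prefixes listed above to an identifier type.
const PREFIX_TYPES: Array<[RegExp, string]> = [
  [/^(?:req_|request-)[\w-]+$/, "request_id"],
  [/^(?:trace_|span_)[\w-]+$/, "trace_id"],
  [/^(?:user_|uid_)[\w-]+$/, "user_id"],
  [/^(?:txn_|order_)[\w-]+$/, "transaction_id"],
  [/^correlation_[\w-]+$/, "correlation_id"],
];

// UUID in the canonical 8-4-4-4-12 hexadecimal layout.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumed list of field names that share a prefix with ID values (e.g. "user_id")
// and must not be reported as identifiers themselves.
const FIELD_NAMES = new Set([
  "request_id", "trace_id", "span_id", "user_id",
  "transaction_id", "order_id", "correlation_id",
]);

// Classify a single token by its shape; returns undefined when nothing matches.
function detectIdType(token: string): string | undefined {
  if (UUID_RE.test(token)) return "uuid";
  for (const [pattern, type] of PREFIX_TYPES) {
    if (pattern.test(token)) return type;
  }
  return undefined;
}

// Scan a raw log line and collect every token that looks like an identifier.
function detectIds(line: string): Array<{ type: string; value: string }> {
  const found: Array<{ type: string; value: string }> = [];
  for (const token of line.split(/[^\w-]+/)) {
    if (FIELD_NAMES.has(token)) continue;
    const type = detectIdType(token);
    if (type) found.push({ type, value: token });
  }
  return found;
}
```

Shape-based detection is only a default. It will miss IDs without a recognizable prefix, which is why the proposal pairs it with user-configurable patterns.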

**Phase 2: UI/UX**
```
Log entry view:
┌─────────────────────────────────────────────────────┐
│ 2025-01-15 14:32:03 ERROR Payment service          │
│ Card declined for user_789                         │
│                                                     │
│ request_id: req_abc123  ← clickable, highlighted   │
│ user_id: user_789      ← clickable, highlighted    │
│ transaction_id: txn_xyz ← clickable, highlighted   │
└─────────────────────────────────────────────────────┘

On click → Opens correlation view:
┌─────────────────────────────────────────────────────┐
│ Timeline for request_id: req_abc123                 │
│                                                     │
│ ▼ 14:32:01.234 [API Gateway]                       │
│   POST /api/payment received                        │
│                                                     │
│ ▼ 14:32:01.456 [Auth Service]                      │
│   Token validated for user_789                      │
│                                                     │
│ ▼ 14:32:02.123 [Payment Service]                   │
│   Stripe API called                                 │
│                                                     │
│ ⚠ 14:32:03.789 [Payment Service] ERROR             │
│   Card declined: insufficient_funds                 │
│                                                     │
│ ▼ 14:32:04.012 [Database]                          │
│   Transaction rolled back                           │
└─────────────────────────────────────────────────────┘
```

**Phase 3: Advanced Correlation**
- Correlation across multiple IDs (e.g., same user_id + different request_ids)
- Waterfall view showing service dependencies
- Automatic time range expansion (search ±5 minutes from clicked log)
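
As a rough illustration of the time range expansion, the search window around a clicked log could be computed as below. The function name `correlationWindow` and the clamping behaviour are assumptions. The ±5 minute default and ±1 hour cap are the values suggested under performance considerations in this proposal.

```typescript
const DEFAULT_HALF_WIDTH_MS = 5 * 60 * 1000; // ±5 minutes
const MAX_HALF_WIDTH_MS = 60 * 60 * 1000; // ±1 hour

interface TimeWindow {
  from: Date;
  to: Date;
}

// Build a symmetric search window around the clicked log's timestamp.
// Requests wider than the cap are clamped rather than rejected.
function correlationWindow(
  clickedAt: Date,
  halfWidthMs: number = DEFAULT_HALF_WIDTH_MS,
): TimeWindow {
  const clamped = Math.min(Math.max(halfWidthMs, 0), MAX_HALF_WIDTH_MS);
  return {
    from: new Date(clickedAt.getTime() - clamped),
    to: new Date(clickedAt.getTime() + clamped),
  };
}
```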

## Alternatives Considered

1. **Manual search only** - Current state, too time-consuming
2. **OpenTelemetry traces required** - Too heavyweight, not all apps use OTLP
3. **Pre-defined correlation rules** - Too rigid, doesn't handle custom IDs
4. **Graph database for relationships** - Over-engineered, adds complexity
5. **Regex-based correlation** - Part of solution, but needs smart defaults

**Chosen approach:** Smart auto-detection with user-configurable patterns

## Implementation Details (Optional)

**Technical approach:**

**1. ID Extraction at Ingestion**
```typescript
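// During log ingestion, extract structured IDs
// (LogEntry is not defined in this proposal; the shape below is an assumption)
interface LogEntry {
  message: string;
  fields?: Record<string, unknown>; // structured fields: JSON logs, OTLP attributes
}

interface ExtractedIDs {
  request_id?: string;
  trace_id?: string;
  span_id?: string;
  user_id?: string;
  transaction_id?: string;
  custom_ids: Record<string, string>;
}

type KnownIDType = Exclude<keyof ExtractedIDs, "custom_ids">;

const patterns: Partial<Record<KnownIDType, RegExp>> = {
  request_id: /(?:request_id|req_id|requestId)[:\s=]+([a-zA-Z0-9_-]+)/,
  trace_id: /(?:trace_id|traceId|x-trace-id)[:\s=]+([a-f0-9-]+)/,
  user_id: /(?:user_id|userId|uid)[:\s=]+([a-zA-Z0-9_-]+)/,
  // ... more patterns
};

function extractIDs(logEntry: LogEntry): ExtractedIDs {
  const extractedIDs: ExtractedIDs = { custom_ids: {} };

  for (const [type, pattern] of Object.entries(patterns) as [KnownIDType, RegExp][]) {
    // Prefer structured fields (JSON logs, OTLP attributes) over regex matches
    const field = logEntry.fields?.[type];
    const value = typeof field === "string" ? field : pattern.exec(logEntry.message)?.[1];
    if (value) extractedIDs[type] = value;
  }

  return extractedIDs;
}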
```

**2. Database Schema**
```sql
-- Store extracted IDs for fast correlation queries
CREATE TABLE log_identifiers (
  log_id UUID REFERENCES logs(id),
  identifier_type VARCHAR(50), -- 'request_id', 'user_id', etc.
  identifier_value VARCHAR(255),
  timestamp TIMESTAMPTZ,

  PRIMARY KEY (log_id, identifier_type)
);

-- PostgreSQL has no inline INDEX clause, so the lookup index is created separately
CREATE INDEX idx_identifier_lookup
  ON log_identifiers (identifier_type, identifier_value, timestamp);

-- Query for correlation:
SELECT l.* 
FROM logs l
JOIN log_identifiers li ON l.id = li.log_id
WHERE li.identifier_type = 'request_id' 
  AND li.identifier_value = 'req_abc123'
ORDER BY l.timestamp;
```
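
The rows returned by a correlation query like the one above could be turned into the timeline lines shown earlier in this proposal roughly as follows. The `CorrelatedLog` shape (`service`, `level`, `message`) is an assumption, since the `logs` table schema is not specified here, and times are rendered in UTC for simplicity.

```typescript
// Assumed row shape; the real `logs` columns may differ.
interface CorrelatedLog {
  timestamp: string; // ISO 8601
  service: string;
  level: string;
  message: string;
}

// Sort rows chronologically and render one "HH:MM:SS.mmm [Service] message" line each.
function formatTimeline(rows: CorrelatedLog[]): string[] {
  return [...rows]
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp))
    .map((row) => {
      // toISOString() is always UTC; characters 11..22 are HH:MM:SS.mmm
      const time = new Date(row.timestamp).toISOString().slice(11, 23);
      const prefix = row.level === "ERROR" ? "ERROR: " : "";
      return `${time} [${row.service}] ${prefix}${row.message}`;
    });
}
```

Sorting again on the client is cheap and keeps the timeline correct even if results from several log sources are merged after the query.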

**3. UI Implementation**
```typescript
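// Escape untrusted log text before inserting it into HTML
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);
}

// Make IDs clickable in log viewer
function renderLogMessage(message: string, extractedIDs: ExtractedIDs): string {
  // Flatten known and custom IDs into a value-to-type lookup
  const { custom_ids, ...known } = extractedIDs;
  const typeByValue = new Map<string, string>();
  for (const [type, value] of Object.entries({ ...known, ...custom_ids })) {
    if (typeof value === "string" && value !== "") typeByValue.set(value, type);
  }
  if (typeByValue.size === 0) return escapeHtml(message);

  // Match all values in one pass (longest first) so inserted markup is never re-matched
  const values = [...typeByValue.keys()].sort((a, b) => b.length - a.length);
  const escapeRegExp = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const matcher = new RegExp(`(${values.map(escapeRegExp).join("|")})`);

  // split() with a capture group puts matched IDs at odd indexes
  return message
    .split(matcher)
    .map((part, i) => {
      const safe = escapeHtml(part);
      if (i % 2 === 0) return safe;
      return `<a class="correlation-link" data-type="${escapeHtml(typeByValue.get(part) ?? "")}" data-value="${safe}">${safe}</a>`;
    })
    .join("");
}

// Handle click
function onCorrelationLinkClick(type: string, value: string) {
  // Open correlation view modal or sidebar
  showCorrelationTimeline(type, value);
}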
```

**4. Configuration UI**
```yaml
# User-configurable correlation patterns
# (single-quoted so that regex escapes such as \s are not parsed as YAML escapes)
correlation_patterns:
  - name: "Request ID"
    pattern: '(?:request_id|req)[:\s=]+([a-zA-Z0-9_-]+)'
    enabled: true
  
  - name: "Order ID"
    pattern: '(?:order_id|order)[:\s=]+([a-zA-Z0-9_-]+)'
    enabled: true
    
  - name: "Custom Job ID"
    pattern: 'job_([a-f0-9-]+)'
    enabled: true
```
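
One way the user-configured patterns could be loaded is sketched below. It assumes the YAML has already been parsed into plain objects (no YAML library is used here), and the names `compilePatterns` and `applyPatterns` are hypothetical. Invalid regular expressions are collected as errors instead of aborting ingestion.

```typescript
// Mirrors one entry of `correlation_patterns` after YAML parsing (assumed shape).
interface CorrelationPattern {
  name: string;
  pattern: string;
  enabled: boolean;
}

interface CompiledPattern {
  name: string;
  regex: RegExp;
}

// Compile enabled patterns; report invalid ones instead of throwing.
function compilePatterns(config: CorrelationPattern[]): {
  compiled: CompiledPattern[];
  errors: string[];
} {
  const compiled: CompiledPattern[] = [];
  const errors: string[] = [];
  for (const entry of config) {
    if (!entry.enabled) continue;
    try {
      compiled.push({ name: entry.name, regex: new RegExp(entry.pattern) });
    } catch (err) {
      errors.push(`${entry.name}: ${(err as Error).message}`);
    }
  }
  return { compiled, errors };
}

// Run every compiled pattern against a message; the first capture group is the ID value.
function applyPatterns(message: string, patterns: CompiledPattern[]): Record<string, string> {
  const ids: Record<string, string> = {};
  for (const { name, regex } of patterns) {
    const match = regex.exec(message);
    if (match?.[1]) ids[name] = match[1];
  }
  return ids;
}
```

Surfacing the `errors` list in the configuration UI would let users fix a broken pattern without affecting the patterns that do compile.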

**Performance considerations:**
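- Serve correlation lookups from the composite index on `log_identifiers` (identifier_type, identifier_value, timestamp) so they avoid full table scans
- Limit correlation queries to reasonable time windows (default ±5 minutes, max ±1 hour)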
- Cache common correlation queries
- Lazy-load timeline entries (load more as user scrolls)
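
The caching point could be approached with a small in-memory cache keyed by identifier type and value, as in the sketch below. `CorrelationCache` and its TTL-based expiry are assumptions rather than a settled design. A short TTL matters because new logs for the same ID may still be arriving.

```typescript
// Minimal TTL cache for correlation results, keyed by (identifier type, identifier value).
class CorrelationCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without real delays.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private key(type: string, value: string): string {
    return JSON.stringify([type, value]);
  }

  // Return the cached result, or undefined if missing or expired.
  get(type: string, value: string): T | undefined {
    const k = this.key(type, value);
    const hit = this.entries.get(k);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(k);
      return undefined;
    }
    return hit.value;
  }

  set(type: string, value: string, result: T): void {
    this.entries.set(this.key(type, value), {
      value: result,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```

This sketch never evicts entries that are not read again, so a production version would also need a size bound.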

## Priority
- [ ] Critical - Blocking my usage of LogTide
- [x] High - Would significantly improve my workflow
- [ ] Medium - Nice to have
- [ ] Low - Minor enhancement

**Rationale:** This is a **core differentiator** that transforms Logtide from "log search" to "incident investigation tool". It's the kind of feature that's genuinely hard to replicate and provides massive time savings during critical debugging sessions.

## Target Users
- [x] DevOps Engineers (primary: incident response)
- [x] Developers (primary: debugging distributed systems)
- [ ] Security/SIEM Users (secondary benefit)
- [x] System Administrators
- [ ] All Users

**Primary audience:** Teams running microservices, distributed systems, or any architecture where a single user action involves multiple services/components.

## Additional Context

**Why this is a moat:**
- Grep can't do this across services
- Basic log search tools don't provide timeline visualization
- Implementing it well requires a deep understanding of distributed tracing concepts
- The UX matters enormously (auto-detection vs manual configuration)

**Competitive analysis:**
- **Datadog APM:** Has this via distributed tracing, but requires APM agent installation ($$$)
- **Elastic APM:** Similar to Datadog, heavyweight setup
- **Grafana Loki:** No built-in correlation, requires LogQL wizardry
- **Splunk:** Has transaction correlation, but enterprise-only and complex
- **Logtide advantage:** Works with plain logs, no agents required

**User testimonial (hypothetical):**
> "Before Logtide correlation: 20 minutes to debug a payment failure across 5 services. After: 30 seconds. I click the request_id and see the entire story."

**Marketing message:**
> "Click any ID, see the whole story. Logtide automatically correlates logs across your entire system. No agents, no distributed tracing setup required."

**Future enhancements:**
- Service dependency graph (automatically discover which services talk to each other)
- Anomaly highlighting (automatically flag unusual patterns in correlated logs)
- Export timeline as shareable link for incident reports
- Integration with alerting (when alert fires, show correlated timeline)

**Example workflows:**

**Debugging production incident:**
```
1. Alert fires: "Payment service errors spiking"
2. Click alert → See recent error logs
3. Click request_id on any error
4. Timeline shows:
   - Frontend made request
   - API validated input
   - Payment service called Stripe
   - Stripe returned timeout
   - Database transaction rolled back
   - Queue scheduled retry
5. Root cause identified: Stripe API degradation
```

**Understanding user journey:**
```
1. Support ticket: "User says checkout is broken"
2. Search for user_id: user_12345
3. See all actions for that user:
   - Viewed product page
   - Added to cart
   - Started checkout
   - ERROR: Tax calculation failed
   - Abandoned cart
4. Fix tax calculation bug
```

## Contribution
- [ ] I would like to work on implementing this feature

[Feature] Event Correlation by ID (request_id, trace_id, user_id) #89
