[Feature Request] Add context window management and conversation summarization support

## Summary

When kagent runs against a model with a limited context window (e.g., `gpt-4o-mini` at 128k tokens), agents with many MCP tools can exceed the limit, which surfaces as "Session terminated" errors.

Currently, kagent does not expose ADK's context management features through the Agent CRD.

## Problem

For an agent configured with 40+ MCP tools, the context quickly exceeds the model's limit:
- Each tool definition adds tokens to every request
- Conversation history accumulates with each turn
- There is no way to truncate, summarize, or otherwise manage the context programmatically

Error observed:
```
context_length_exceeded: This model's maximum context length is 128000 tokens. 
However, your messages resulted in 155XXX tokens.
```
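
To make the pressure on the window concrete, here is a rough sketch of how tool definitions and history add up against the 128k limit. The 4-characters-per-token ratio is only a heuristic (not a real tokenizer), and the tool schemas and messages are hypothetical placeholders, not the actual tools from this deployment.

```python
import json

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def context_budget(tool_schemas, history, limit=128_000):
    """Estimate token usage by source and the headroom left under `limit`."""
    tool_tokens = sum(estimate_tokens(json.dumps(s)) for s in tool_schemas)
    history_tokens = sum(estimate_tokens(m) for m in history)
    used = tool_tokens + history_tokens
    return {
        "tool_tokens": tool_tokens,
        "history_tokens": history_tokens,
        "used": used,
        "remaining": limit - used,
        "exceeded": used > limit,
    }


# Hypothetical data: 40 tools with placeholder schemas and 10 past messages.
tools = [
    {"name": f"tool_{i}", "description": "x" * 400, "parameters": {}}
    for i in range(40)
]
history = ["y" * 2000 for _ in range(10)]
report = context_budget(tools, history)
print(report)
```

The tool portion is paid on every request regardless of conversation length, which is why reducing the tool count helps but does not stop history from eventually filling the window.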

## Proposed Solution

Expose [ADK's context management capabilities](https://google.github.io/adk-docs/context/) through the Agent CRD:

### Option 1: Context Compression (Compaction)
ADK has [Context Compression](https://google.github.io/adk-docs/context/compaction/) that summarizes older events:

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
spec:
  declarative:
    contextCompression:
      enabled: true
      compactionInterval: 5  # Compress every 5 events
      overlapSize: 1
      summarizer:
        type: llm  # Use LlmEventSummarizer
        model: gpt-4o-mini
```
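
As an illustration of how `compactionInterval` and `overlapSize` could interact, the sketch below computes which event ranges would be summarized. It assumes a sliding-window reading of the ADK compaction docs (compaction triggers every N completed events, and each window reaches back over the last `overlapSize` events of the previous one). The function name is hypothetical and ADK's actual implementation may differ.

```python
def compaction_windows(total_events, compaction_interval, overlap_size):
    """Return 1-based inclusive (start, end) event ranges to summarize.

    Assumed semantics: a compaction fires each time `compaction_interval`
    more events have completed, and each window also covers the last
    `overlap_size` events of the previous window.
    """
    windows = []
    last_end = 0
    for end in range(compaction_interval, total_events + 1, compaction_interval):
        start = max(1, last_end + 1 - overlap_size)
        windows.append((start, end))
        last_end = end
    return windows


# Values from the YAML above: compress every 5 events, overlap of 1.
print(compaction_windows(12, compaction_interval=5, overlap_size=1))
```

Under these assumptions, 12 events with the settings above yield two summaries (events 1 to 5, then 5 to 10), and events 11 and 12 stay uncompressed until the next interval completes.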

### Option 2: Context Caching
ADK has [Context Caching](https://google.github.io/adk-docs/context/caching/) for optimizing repeated requests:

```yaml
spec:
  declarative:
    contextCache:
      enabled: true
      minTokens: 1000
      ttlSeconds: 3600
      cacheIntervals: 10
```

### Option 3: Context Window Compression Config
From the [ADK API](https://google.github.io/adk-docs/api-reference/python/google-adk.html), there's `ContextWindowCompressionConfig`:

```yaml
spec:
  declarative:
    contextWindowCompression:
      enabled: true
      maxLength: 100000  # Max context length before compression
```

## Alternatives Considered

1. **Reduce tools**: Works but limits agent capabilities
2. **Use larger context models**: More expensive (gpt-4o vs gpt-4o-mini)
3. **Memory/RAG**: Currently not supported in ADK (v0.6 release notes confirm this)

## Additional Context

- ADK already implements these features; they only need to be exposed in the CRD
- This is critical for production deployments with complex agents
- ADK documentation: https://google.github.io/adk-docs/context/

## Environment

- kagent version: v0.7.6
- Kubernetes: OpenShift 4.17
- Model: gpt-4o-mini (128k context)
- Tools: 16 MCP tools (reduced from 42 to mitigate the issue)
