MCP CPU Spike
Summary
When making MCP calls through the responses API, the llamastack server process CPU usage spikes to 100% and remains there indefinitely, even after the request completes.
Environment
- Llama Stack version: main branch
- Python version: 3.12
Steps to Reproduce
- Start the llamastack server (e.g. `llama stack run <distro-config>`)
- Verify CPU usage is idle (0-1%)
- Make an MCP call via the responses API (see the sketch below)
- Monitor CPU usage with `top`
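
A minimal sketch of step 3, assuming an OpenAI-compatible client pointed at the local llamastack server; the base URL, model, and MCP server URL below are placeholders rather than the values from the original report:

```python
# Sketch only: base URL, model, and MCP server URL are assumed placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    input="List the tools you have available.",
    tools=[
        {
            "type": "mcp",
            "server_label": "example",
            "server_url": "http://localhost:8000/sse",
            "require_approval": "never",
        }
    ],
)
print(response.output_text)
```

After the call returns, watch the `llama stack run` process in `top`.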
Expected Behavior
CPU usage should return to idle levels (0-1%) after the MCP request completes.
Actual Behavior
CPU usage spikes to 100% and stays there indefinitely:
```
# Before MCP call - idle
3176764 derekh    20   0 4196036 442736 137316 S   0.0   1.4   0:06.55 llama stack run

# After MCP call - stuck at 100%
3176764 derekh    20   0 4422628 448496 137444 R  99.7   1.4   0:34.23 llama stack run
```
Root Cause
The issue appears to be caused by the MCP session caching mechanism (MCPSessionManager) that was added to optimize performance by avoiding redundant tools/list calls (the fix for #4452); see the sketch below for the general shape of that pattern.
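
For context, a minimal sketch of the general session-caching pattern, with names and structure assumed for illustration only (this is not the actual MCPSessionManager code): the cache keeps MCP client sessions and their transports open across requests, so their background tasks remain in the server's event loop after the request that created them has finished.

```python
# Illustrative sketch of a session cache; names and structure are assumed,
# not taken from the llamastack implementation.
from contextlib import AsyncExitStack

from mcp import ClientSession
from mcp.client.sse import sse_client


class CachedMCPSessions:
    def __init__(self) -> None:
        self._stack = AsyncExitStack()  # keeps transports and sessions alive
        self._sessions: dict[str, ClientSession] = {}

    async def get(self, server_url: str) -> ClientSession:
        # Reuse an existing session to avoid a redundant tools/list round trip.
        if server_url in self._sessions:
            return self._sessions[server_url]
        read, write = await self._stack.enter_async_context(sse_client(server_url))
        session = await self._stack.enter_async_context(ClientSession(read, write))
        await session.initialize()
        await session.list_tools()  # result can be cached for later requests
        self._sessions[server_url] = session
        return session

    async def aclose(self) -> None:
        # Until this is called, the cached transports (and their background
        # read loops) outlive the requests that created them.
        await self._stack.aclose()
        self._sessions.clear()
```

If any of those long-lived background tasks polls or retries without yielding, the event loop stays busy and the process sits at 100% CPU even though the request itself has completed, which matches the behavior above.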