Add Prompt Caching Support to Reduce API Costs
Overview
I implemented prompt caching in my fork, based on Anthropic's recent updates, and my API costs dropped from $30-50/day to $1-5/day.
Implementation Details
The key changes are based on this upstream commit:
anthropics/claude-quickstarts@be847c4
The main implementation can be found here:
https://github.com/anthropics/anthropic-quickstarts/blob/main/computer-use-demo/computer_use_demo/loop.py#L116
Benefits
- Cost Reduction: roughly 90-95% lower API costs in my usage
- Performance: Faster response times due to cached prompts
- Efficiency: Better handling of repetitive tasks and system prompts
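To show where a saving of this size can come from, here is a rough back-of-the-envelope sketch, not a billing calculator. It assumes the multipliers I understand Anthropic to document for ephemeral caching (cache writes at about 1.25x the base input price, cache reads at about 0.1x), and it assumes every call after the first hits the cache. The function name and the numbers in the usage note are hypothetical; check current pricing before relying on them.

```python
def estimated_prefix_cost(prefix_tokens, calls, base_price_per_mtok,
                          write_multiplier=1.25, read_multiplier=0.10):
    """Estimate input cost of a repeated prompt prefix, without and with caching.

    Assumptions (not taken from the issue): the first call writes the cache,
    every later call reads it, and the multipliers match current pricing.
    Returns (uncached_cost, cached_cost) in the same currency as the price.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    per_token = base_price_per_mtok / 1_000_000
    # Without caching, the full prefix is billed at the base rate on every call.
    uncached = prefix_tokens * calls * per_token
    # With caching, one call pays the write premium and the rest pay the read rate.
    cached = prefix_tokens * per_token * (
        write_multiplier + read_multiplier * (calls - 1)
    )
    return uncached, cached


def savings_fraction(uncached, cached):
    """Fraction of the uncached cost that caching removes."""
    return 1 - cached / uncached if uncached else 0.0
```

For example, `estimated_prefix_cost(10_000, 100, 3.0)` models a 10,000-token prefix sent 100 times at a hypothetical $3 per million input tokens; under the assumptions above the prefix cost falls by about 89%. Output tokens and the uncached part of each request are not included, so real savings depend on how much of each request is a stable prefix.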
Technical Changes Required
- Add the anthropic-beta: prompt-caching-2024-07-31 header
- Structure system prompts and tools with cache_control
- Implement cache tracking for performance monitoring
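The three changes above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from my fork or from the upstream commit: `with_cache_breakpoint`, `build_request`, and `cache_stats` are hypothetical helper names, and the sketch only builds plain dictionaries so it runs without the SDK. It assumes the resulting keyword arguments are passed to the Anthropic Python SDK (for example `client.messages.create(**request)`, which accepts `extra_headers`) and that the response's usage data is available as a mapping with the `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` fields.

```python
from copy import deepcopy

# Beta header value quoted in this issue.
PROMPT_CACHING_BETA = "prompt-caching-2024-07-31"


def with_cache_breakpoint(blocks):
    """Return a copy of `blocks` with an ephemeral cache marker on the last item.

    Marking the last block asks the API to cache everything up to and
    including it; the input list is left unmodified.
    """
    marked = deepcopy(blocks)
    if marked:
        marked[-1]["cache_control"] = {"type": "ephemeral"}
    return marked


def build_request(model, system_blocks, tools, messages, max_tokens=1024):
    """Assemble keyword arguments for a Messages API call with caching enabled."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": with_cache_breakpoint(system_blocks),
        "tools": with_cache_breakpoint(tools),
        "messages": messages,
        "extra_headers": {"anthropic-beta": PROMPT_CACHING_BETA},
    }


def cache_stats(usage):
    """Summarize cache activity from a response's usage mapping.

    Missing or None fields are treated as zero. `hit_ratio` is the share of
    all input tokens that were served from the cache.
    """
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = read + written + uncached
    return {
        "read": read,
        "written": written,
        "uncached": uncached,
        "hit_ratio": read / total if total else 0.0,
    }
```

Logging `cache_stats(...)` per request is one simple way to confirm the cache is being hit: a first call should show tokens under `written`, and later calls within the cache lifetime should show them under `read`. If the SDK returns usage as an object rather than a mapping, it would need converting first (I assume something like `response.usage.model_dump()`).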
Example Implementation
system_content = [
    {
        "type": "text",
        "text": "Base system instructions...",
    },
    {
        "type": "text",
        "text": "Large context or tool definitions...",
        "cache_control": {"type": "ephemeral"}
    }
]
Notes
- My branch is currently ahead of main, which makes a direct PR difficult
- The repo seems relatively inactive, so I am sharing this as a ticket for future reference
- Implementation follows Anthropic's official prompt caching guidelines