Feature Request: Enable Prompt Caching in PromptTemplate for ChatAnthropic #29747
DonghaeSuh started this conversation in Ideas
Replies: 1 comment

- Just what I needed! Let's code it so it works with Gemini too.
Feature request
When using `PromptTemplate` to load a string-based prompt template, the template must be converted into messages before being passed to a `ChatModel`. This feature introduces a simple `prompt_caching` argument that lets users specify how `{placeholders}` should define KV caching boundaries. During `invoke()`, the template is automatically segmented based on the provided placeholders, ensuring that Anthropic's caching breakpoints (`{"cache_control": {"type": "ephemeral"}}`) are inserted at the intended positions.

By applying this caching logic, we can make better use of `PromptTemplate | ChatModel` chains without manual prompt restructuring (splitting → converting to messages → adding `cache_control` arguments). This significantly improves usability in real-world applications where predefined `.txt` prompt files are dynamically loaded into `PromptTemplate`.
Motivation
Currently, when using `langchain_anthropic.ChatAnthropic` in a production service, we store multiple prompts as `.txt` files and dynamically load them into `PromptTemplate`. However, to leverage prompt caching (KV caching), we must manually split these whole string prompts into the messages format and explicitly insert `{"cache_control": {"type": "ephemeral"}}` at the intended positions.

Since the primary purpose of KV caching is to cache the static portion of a prompt, an automated way to determine the breakpoint based on `{placeholders}` would significantly reduce manual overhead. This feature would allow users to specify whether a `{placeholder}` should be included in the cached portion, simplifying prompt caching with `ChatAnthropic` without additional manual prompt restructuring.
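For reference, the manual restructuring described above looks roughly like the sketch below. This is only an illustration (the model name, file path, and prompt contents are placeholders); it relies on the existing support in `langchain_anthropic` for `cache_control` inside message content blocks:

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatAnthropic(model="claude-3-5-sonnet-latest")

# Large, static portion of the prompt loaded from a .txt file (illustrative path).
static_instructions = open("prompts/rag_system.txt").read()

messages = [
    SystemMessage(
        content=[
            {
                "type": "text",
                "text": static_instructions,
                # The caching breakpoint has to be inserted by hand today.
                "cache_control": {"type": "ephemeral"},
            }
        ]
    ),
    HumanMessage(content="What does the contract say about termination?"),
]

response = llm.invoke(messages)
```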
Proposal (If applicable)

Implementation Details
1. Pass a `prompt_caching` argument as an optional keyword argument (`kwargs`) when invoking `PromptTemplate`.
2. When the `prompt_caching` argument is provided, store the following three attributes inside the returned `StringPromptValue`:
   - `template`: the original prompt string with `{placeholders}` intact.
   - `input_dict`: the dictionary of `{"input variable": "value"}` pairs passed during invocation.
   - `prompt_caching`: a dictionary mapping `{placeholder}` names to `"front"` or `"back"`, or the boolean `True`.
     - `"front"`: excludes the placeholder from caching.
     - `"back"`: includes the placeholder in caching.
     - `True`: caches the entire prompt (useful for static prompts with no placeholders).

   The `prompt_caching` argument is designed to support up to 4 key-value pairs, matching the maximum number of breakpoints supported by Anthropic's prompt caching. For each breakpoint, the prompt is split into chunks and cached accordingly.
3. Modify the `to_messages()` method of `StringPromptValue` to automatically split the message based on the stored attributes, ensuring that `{"cache_control": {"type": "ephemeral"}}` is inserted at the appropriate breakpoints, as sketched below.
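A minimal sketch of what that splitting step could look like, assuming a standalone helper (the name `split_with_cache_control` and its exact signature are hypothetical, not an existing LangChain API):

```python
import re
from typing import Union

def split_with_cache_control(
    template: str,
    input_dict: dict,
    prompt_caching: Union[dict, bool],
) -> list:
    """Render `template` with `input_dict` and return Anthropic-style content
    blocks, closing a cached chunk at every breakpoint named in `prompt_caching`."""
    if prompt_caching is True:
        # Cache the fully rendered prompt as a single block.
        return [{
            "type": "text",
            "text": template.format(**input_dict),
            "cache_control": {"type": "ephemeral"},
        }]

    blocks, buffer = [], ""
    # Split the template while keeping the {placeholder} tokens.
    for part in re.split(r"(\{[^{}]+\})", template):
        match = re.fullmatch(r"\{([^{}]+)\}", part)
        if match is None:
            buffer += part
            continue
        name = match.group(1)
        value = str(input_dict.get(name, ""))
        if prompt_caching.get(name) == "back":
            # Breakpoint after the placeholder: its value is part of the cached chunk.
            blocks.append({
                "type": "text",
                "text": buffer + value,
                "cache_control": {"type": "ephemeral"},
            })
            buffer = ""
        elif prompt_caching.get(name) == "front":
            # Breakpoint before the placeholder: cache everything up to it.
            if buffer:
                blocks.append({
                    "type": "text",
                    "text": buffer,
                    "cache_control": {"type": "ephemeral"},
                })
            buffer = value
        else:
            # Placeholders not listed in prompt_caching do not create breakpoints.
            buffer += value
    if buffer:
        # Trailing, non-cached remainder of the prompt.
        blocks.append({"type": "text", "text": buffer})
    return blocks
```

`to_messages()` could then wrap the returned blocks in a single `HumanMessage`, which is what a `StringPromptValue` produces today.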
Example Usage
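The original snippet did not survive extraction, so the following is a reconstruction of the intended usage under the proposed API (file path, model name, and inputs are illustrative):

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import PromptTemplate

# Illustrative prompt file; in practice it holds long static instructions,
# a {documents} placeholder, and a trailing {question} placeholder.
template = PromptTemplate.from_file("prompts/rag_prompt.txt")

llm = ChatAnthropic(model="claude-3-5-sonnet-latest")

retrieved_docs = "...large block of retrieved context..."    # placeholder
user_question = "What does section 4 say about renewals?"    # placeholder

# Proposed argument: "back" places the caching breakpoint after {documents},
# so the rendered documents and everything before them form the cached chunk.
prompt_value = template.invoke(
    {"documents": retrieved_docs, "question": user_question},
    prompt_caching={"documents": "back"},
)

response = llm.invoke(prompt_value.to_messages())
```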
In this example, the `{documents}` section and everything before it will be KV cached (provided it exceeds 1024 or 2048 tokens, depending on the model: link), while the rest will not be cached.

I have also confirmed demand for this feature in #27340 and #26701.
I appreciate any feedback or suggestions to refine this proposal. If there are specific requirements or best practices to consider, please let me know!
I'm prepared to submit a pull request or open an issue for this feature—guidance on the preferred contribution process would be helpful.