Closed
Description
I have a question about the efficient memory sharing feature. Do different sequences that share the same system prompt, but append different user inputs, share the computation and memory for that system prompt?
For example, here are two input sequences:
- <|system|>You are a kind robot. <|user|>How's the weather today.
- <|system|>You are a kind robot. <|user|>Tell me a story.
Would these two input sequences share the computation and memory for the common prefix "<|system|>You are a kind robot. <|user|>"?
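
To make the question concrete, here is a minimal sketch of what I mean by the "shared" part. It is only an illustration, not how the library works internally: characters stand in for real tokens, and `common_prefix_len`, `shareable_blocks`, and the block size of 16 are hypothetical names and values I made up, assuming that sharing (if it happens) is done per fixed-size block of the cache.

```python
from typing import List


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Return the number of leading tokens that are identical in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shareable_blocks(a: List[str], b: List[str], block_size: int) -> int:
    """Count full blocks covered by the common prefix.

    Assumption: only completely filled blocks could be shared;
    a partially filled trailing block is not counted.
    """
    return common_prefix_len(a, b) // block_size


seq_a = "<|system|>You are a kind robot. <|user|>How's the weather today."
seq_b = "<|system|>You are a kind robot. <|user|>Tell me a story."

# Characters are used as stand-in tokens; a real tokenizer would differ.
tokens_a = list(seq_a)
tokens_b = list(seq_b)

prefix_len = common_prefix_len(tokens_a, tokens_b)
print("common prefix:", "".join(tokens_a[:prefix_len]))
print("full blocks in prefix:", shareable_blocks(tokens_a, tokens_b, block_size=16))
```

In this toy view, the question is whether the blocks covered by the common prefix are computed and stored once for both sequences, or once per sequence.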