
Conversation

@minzh23 minzh23 commented Oct 17, 2025

Summary

This PR adds Cache-to-Cache, a novel research project on direct semantic communication between Large Language Models via the KV-Cache.

What is Cache-to-Cache?

Cache-to-Cache introduces a new paradigm for inter-LLM communication that goes beyond traditional text-based approaches. Instead of forcing models to communicate through token sequences, Cache-to-Cache enables direct semantic transfer via KV-Cache manipulation.
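
As a rough illustration of the idea (not the authors' implementation), the snippet below captures the KV-cache from one Hugging Face model's prefill pass and hands it to a second model's forward call via `past_key_values`. The checkpoint names and prompts are placeholders, and this raw hand-off only works when both models share the same architecture and cache layout:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoints; direct cache reuse requires matching cache layouts
# (same number of layers, heads, and head dimension).
tok = AutoTokenizer.from_pretrained("gpt2")
sharer = AutoModelForCausalLM.from_pretrained("gpt2")
receiver = AutoModelForCausalLM.from_pretrained("gpt2")

context = tok("Background passage the sharer has already read.", return_tensors="pt")
question = tok(" Question: what does the passage say?", return_tensors="pt")

with torch.no_grad():
    # Prefill the sharer to obtain its KV-cache for the context.
    shared = sharer(**context, use_cache=True)
    # Hand the cache to the receiver instead of re-sending the context as text.
    full_mask = torch.ones(
        1, context.input_ids.shape[1] + question.input_ids.shape[1], dtype=torch.long
    )
    out = receiver(
        input_ids=question.input_ids,
        attention_mask=full_mask,
        past_key_values=shared.past_key_values,
        use_cache=True,
    )

print(out.logits.shape)  # [1, question_length, vocab_size]
```

What distinguishes Cache-to-Cache from a plain cache hand-off is that it learns to map caches between different models, as described in the key features below.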

Key Features

  • Direct Semantic Communication: LLMs exchange information through the KV-Cache instead of text, preserving richer semantic information
  • Eliminates Message-Generation Latency: Avoids the token-by-token decoding otherwise needed to produce an intermediate text message
  • Neural Cache Fusion: Uses a neural network to project and fuse KV-Caches between models (see the sketch after this list)
  • Superior Performance: Reports improvements over both the individual models and text-based communication baselines
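
A minimal sketch of what a per-layer fusion module could look like is shown below; the tensor shape `[batch, heads, seq_len, head_dim]`, the linear projections, and the scalar gate are illustrative assumptions rather than the projector actually used in the paper:

```python
import torch
import torch.nn as nn

class KVFuser(nn.Module):
    """Toy per-layer fuser: project the sharer's K/V into the receiver's
    head dimension and blend them with a learned gate. Shapes and the
    gating scheme are illustrative assumptions, not the paper's design."""

    def __init__(self, src_dim: int, dst_dim: int):
        super().__init__()
        self.proj_k = nn.Linear(src_dim, dst_dim)
        self.proj_v = nn.Linear(src_dim, dst_dim)
        self.gate = nn.Parameter(torch.zeros(1))  # starts as "keep receiver cache"

    def forward(self, src_kv, dst_kv):
        src_k, src_v = src_kv  # [batch, heads, seq_len, src_head_dim]
        dst_k, dst_v = dst_kv  # [batch, heads, seq_len, dst_head_dim]
        g = torch.sigmoid(self.gate)
        fused_k = (1 - g) * dst_k + g * self.proj_k(src_k)
        fused_v = (1 - g) * dst_v + g * self.proj_v(src_v)
        return fused_k, fused_v

# Example with made-up shapes: batch 2, 8 heads, 16 cached tokens.
src = (torch.randn(2, 8, 16, 64), torch.randn(2, 8, 16, 64))
dst = (torch.randn(2, 8, 16, 128), torch.randn(2, 8, 16, 128))
fused_k, fused_v = KVFuser(64, 128)(src, dst)
print(fused_k.shape)  # torch.Size([2, 8, 16, 128])
```

The fused cache would then be fed back into the receiving model's attention layers in place of (or alongside) its own cache, which is the step the learned projector is responsible for in the actual method.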

Category

Multi-agent, Communication

Changes Made

  • Added Cache-to-Cache entry in alphabetical order (between "bumpgen" and "Cal.ai")
  • Included comprehensive description of the research contribution
  • Added complete links: Project Page, Paper (arXiv), GitHub repository, and HuggingFace collection

Resources
