Wire protocol for AI agent communication with inspectable headers and semantic security.
M2M Frame (Application Layer - transported over HTTP/QUIC)
┌─────────┬──────────────────────────────┬────────────────────┬─────────┬───────┬─────────────┐
│ Prefix  │      Fixed Header (20B)      │   Routing Header   │ Payload │ CRC32 │   Payload   │
│ #M2M|1| │                              │     (variable)     │ Len 4B  │  4B   │ (compress)  │
├─────────┼───────┬────┬────┬──────┬─────┼────────────────────┼─────────┴───────┴─────────────┤
│         │HdrLen │Sch │Sec │Flags │Rsrv │ Model (len+str)    │                               │
│         │  2B   │ 1B │ 1B │  4B  │ 12B │ MsgCount (varint)  │    Brotli-compressed JSON     │
│         │       │    │    │      │     │ Roles (2b packed)  │        (100% fidelity)        │
│         │       │    │    │      │     │ ContentHint (var)  │                               │
│         │       │    │    │      │     │ MaxTokens (var)    │                               │
│         │       │    │    │      │     │ CostEst (f32)      │                               │
├─────────┴───────┴────┴────┴──────┴─────┴────────────────────┼───────────────────────────────┤
│ ▲ Readable without decompression                            │ ▲ Requires decode             │
└─────────────────────────────────────────────────────────────┴───────────────────────────────┘
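The diagram names the routing header fields but not their byte-level encodings. As a rough sketch only, the code below assumes `MsgCount` is an unsigned LEB128 varint and that `Roles` packs four 2-bit codes per byte, lowest bits first. Both encodings, and the function names `read_varint` and `unpack_roles`, are assumptions made for illustration rather than the crate's documented behavior.

```rust
/// Decode an unsigned LEB128 varint (assumed encoding for MsgCount).
/// Returns the decoded value and the number of bytes consumed.
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    // A u64 needs at most 10 LEB128 bytes.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // input ended (or ran past 10 bytes) before a terminating byte
}

/// Unpack `count` 2-bit role codes, four per byte, low bits first (assumed layout).
fn unpack_roles(packed: &[u8], count: usize) -> Option<Vec<u8>> {
    if packed.len() * 4 < count {
        return None; // not enough packed bytes for the requested count
    }
    Some(
        (0..count)
            .map(|i| (packed[i / 4] >> (2 * (i % 4))) & 0b11)
            .collect(),
    )
}

fn main() {
    // 0xAC 0x02 is the LEB128 encoding of 300.
    println!("{:?}", read_varint(&[0xAC, 0x02]));
    // One byte holding the role codes 0, 1, 2 (and an unused fourth slot).
    println!("{:?}", unpack_roles(&[0b0010_0100], 3));
}
```

What the 2-bit codes mean (for example system, user, assistant, tool) is not stated in this section, so the sketch returns the raw codes.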
Security Modes (headers always readable):
None: [headers][payload_len][crc32][payload]
HMAC: [headers][payload_len][crc32][payload][hmac_tag:32B]
AEAD: [headers][nonce:12B][encrypt(payload_len+crc32+payload)+tag:16B]
▲ headers remain readable ▲
Sch: 0x01=Request 0x02=Response 0x03=Stream 0x10=Error
Sec: 0x00=None 0x01=HMAC-SHA256 0x02=ChaCha20-Poly1305
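To make the fixed header concrete, here is a minimal parsing sketch built from the field sizes in the frame diagram and the `Sch`/`Sec` codes listed above. It assumes multi-byte integers are big-endian, which this section does not specify, and every type and function name (`FixedHeader`, `parse_fixed_header`, and so on) is hypothetical rather than part of the `m2m` crate.

```rust
const PREFIX: &[u8] = b"#M2M|1|";
const FIXED_HEADER_LEN: usize = 20;

#[derive(Debug, PartialEq)]
enum Schema { Request, Response, Stream, Error }

#[derive(Debug, PartialEq)]
enum Security { Unsecured, HmacSha256, ChaCha20Poly1305 }

#[derive(Debug, PartialEq)]
struct FixedHeader {
    header_len: u16,
    schema: Schema,
    security: Security,
    flags: u32,
}

/// Parse the prefix and the 20-byte fixed header; the payload is never touched.
fn parse_fixed_header(frame: &[u8]) -> Option<FixedHeader> {
    let h = frame.strip_prefix(PREFIX)?.get(..FIXED_HEADER_LEN)?;
    let schema = match h[2] {
        0x01 => Schema::Request,
        0x02 => Schema::Response,
        0x03 => Schema::Stream,
        0x10 => Schema::Error,
        _ => return None,
    };
    let security = match h[3] {
        0x00 => Security::Unsecured,
        0x01 => Security::HmacSha256,
        0x02 => Security::ChaCha20Poly1305,
        _ => return None,
    };
    Some(FixedHeader {
        // Byte order is an assumption (big-endian).
        header_len: u16::from_be_bytes([h[0], h[1]]),
        schema,
        security,
        flags: u32::from_be_bytes([h[4], h[5], h[6], h[7]]),
        // h[8..20] is the 12-byte reserved area and is ignored here.
    })
}

fn main() {
    let mut frame = PREFIX.to_vec();
    frame.extend_from_slice(&[0x00, 0x14, 0x01, 0x02, 0, 0, 0, 5]);
    frame.extend_from_slice(&[0u8; 12]); // reserved bytes
    println!("{:?}", parse_fixed_header(&frame));
}
```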
Cognitive security (threat detection) operates pre-transmission — see SecurityScanner.
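For the unencrypted layout (`None` mode above), the tail of a frame is `[payload_len][crc32][payload]`. The sketch below builds and verifies that tail under three assumptions not confirmed by this section: the checksum is the common CRC-32/IEEE variant, it covers the compressed payload bytes only, and both 4-byte integers are big-endian. `encode_tail` and `decode_tail` are hypothetical helper names.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Build `[payload_len:4B][crc32:4B][payload]` for an already-compressed payload.
fn encode_tail(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&crc32(payload).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Return the payload slice if the length and checksum both match.
fn decode_tail(tail: &[u8]) -> Option<&[u8]> {
    let head = tail.get(..8)?;
    let len = u32::from_be_bytes([head[0], head[1], head[2], head[3]]) as usize;
    let expected = u32::from_be_bytes([head[4], head[5], head[6], head[7]]);
    let payload = tail.get(8..8 + len)?;
    if crc32(payload) == expected { Some(payload) } else { None }
}

fn main() {
    let tail = encode_tail(b"compressed-bytes");
    println!("tail is {} bytes", tail.len());
    println!("verified: {}", decode_tail(&tail).is_some());
}
```

In HMAC mode a 32-byte tag would follow this tail, and in AEAD mode the same three fields are encrypted together after a 12-byte nonce, so a reader can only check the CRC after decryption.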
When AI agents communicate at scale, three problems emerge that traditional tools can't solve:
Traditional compression (gzip, brotli, zstd) reduces bytes but produces binary output requiring Base64 encoding. This increases token count:
Original JSON: 68 bytes → 42 tokens
Gzip + Base64: 52 bytes → 58 tokens (+38% tokens)
Binary data tokenizes poorly (~1 byte/token) compared to text (~4 chars/token). For agent-to-agent traffic where token count drives latency, the "compressed" message costs more tokens than the original even though it is fewer bytes.
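The arithmetic behind the figures above is easy to check. Standard padded Base64 emits 4 characters for every 3 input bytes, so it gives back part of what compression saved, and the token delta is a plain percentage change. The helper names below are illustrative only.

```rust
/// Length of standard padded Base64 output for `n` input bytes.
fn base64_len(n: usize) -> usize {
    4 * ((n + 2) / 3)
}

/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Base64 inflates any binary blob by roughly one third.
    println!("30 bytes -> {} Base64 chars", base64_len(30));
    // Token counts quoted above: 42 for the raw JSON, 58 after gzip + Base64.
    println!("token change: {:+.0}%", percent_change(42.0, 58.0));
}
```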
Compressed traffic is opaque. Load balancers, API gateways, and observability tools must decompress every payload to make routing decisions. At scale, this adds latency and complexity.
Network security (TLS, firewalls, WAFs) operates at the packet level. It can't understand what agents are saying to each other. Prompt injection, jailbreaks, and data exfiltration attempts pass through undetected.
M2M solves all three.
# Cargo.toml
[dependencies]
m2m-protocol = "0.4"
# With cryptographic security (AEAD, HMAC, key exchange)
m2m-protocol = { version = "0.4", features = ["crypto"] }

use m2m::{CodecEngine, Algorithm};
let engine = CodecEngine::new();
// Compress
let json = r#"{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}"#;
let compressed = engine.compress(json, Algorithm::M2M)?;
// Decompress (auto-detects algorithm)
let original = engine.decompress(&compressed.data)?;

cargo install m2m-protocol
m2m compress '{"model":"gpt-4o","messages":[...]}'
m2m decompress '#M2M|1|...'
m2m scan "Ignore all previous instructions"

┌─────────────────────────────────────────────────────────────────────┐
│                         M2M Protocol Stack                          │
├─────────────────────────────────────────────────────────────────────┤
│ Application     │ Your Agent Code                                   │
├─────────────────┼───────────────────────────────────────────────────┤
│ Security        │ SecurityScanner → Cognitive threat detection      │
├─────────────────┼───────────────────────────────────────────────────┤
│ Codec           │ CodecEngine → M2M / TokenNative / Brotli          │
├─────────────────┼───────────────────────────────────────────────────┤
│ Protocol        │ Session → HELLO/ACCEPT/DATA/CLOSE                 │
├─────────────────┼───────────────────────────────────────────────────┤
│ Transport       │ TCP (HTTP/1.1) │ QUIC (HTTP/3, 0-RTT)             │
└─────────────────┴───────────────────────────────────────────────────┘
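The Protocol row names four message types. One way to picture how they constrain a connection is a small state machine, sketched below. The states, the transition rules, and every identifier are assumptions made for illustration; this is not the `Session` type's actual implementation, and keep-alive messages are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Idle, HelloSent, Established, Closed }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Event { SendHello, RecvAccept, Data, Close }

/// Advance the session by one message, rejecting out-of-order traffic.
fn step(state: State, event: Event) -> Result<State, String> {
    match (state, event) {
        (State::Idle, Event::SendHello) => Ok(State::HelloSent),
        (State::HelloSent, Event::RecvAccept) => Ok(State::Established),
        (State::Established, Event::Data) => Ok(State::Established),
        (State::HelloSent, Event::Close) | (State::Established, Event::Close) => {
            Ok(State::Closed)
        }
        (s, e) => Err(format!("{:?} is not allowed in state {:?}", e, s)),
    }
}

fn main() {
    let mut state = State::Idle;
    for event in [Event::SendHello, Event::RecvAccept, Event::Data, Event::Close] {
        state = step(state, event).expect("valid transition");
        println!("{:?} -> {:?}", event, state);
    }
    // DATA before the handshake completes is rejected.
    println!("{:?}", step(State::Idle, Event::Data));
}
```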
| Primitive | Purpose | Usage |
|---|---|---|
| `CodecEngine` | Compress/decompress payloads | `engine.compress(json, Algorithm::M2M)` |
| `Session` | Stateful connection with capability negotiation | `Session::new(capabilities)` |
| `SecurityScanner` | Semantic threat detection | `scanner.scan(content)` |
| `Algorithm` | Compression algorithm selection | `M2M`, `TokenNative`, `Brotli` |
| `M2MFrame` | Wire format with routing headers | 20-byte fixed header + payload |
| `Capabilities` | Protocol negotiation | Algorithms, security, streaming |
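Capability negotiation is described here only at a high level. As a sketch of the general idea, the code below picks the first algorithm in the local preference order that the peer also supports and enables streaming only when both sides offer it. The types are stand-ins defined for this example and should not be read as the crate's `Capabilities` or `Algorithm` definitions; the selection rule itself is also an assumption.

```rust
// Stand-in types for illustration; not the m2m crate's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Algorithm { M2M, TokenNative, Brotli }

#[derive(Debug, Clone, PartialEq)]
struct Capabilities {
    algorithms: Vec<Algorithm>, // in order of preference
    streaming: bool,
}

#[derive(Debug, PartialEq)]
struct Agreed {
    algorithm: Algorithm,
    streaming: bool,
}

/// Intersect two capability sets; `None` means no common algorithm.
fn negotiate(local: &Capabilities, remote: &Capabilities) -> Option<Agreed> {
    let algorithm = local
        .algorithms
        .iter()
        .copied()
        .find(|a| remote.algorithms.contains(a))?;
    Some(Agreed {
        algorithm,
        streaming: local.streaming && remote.streaming,
    })
}

fn main() {
    let local = Capabilities {
        algorithms: vec![Algorithm::M2M, Algorithm::TokenNative, Algorithm::Brotli],
        streaming: true,
    };
    let remote = Capabilities {
        algorithms: vec![Algorithm::Brotli, Algorithm::TokenNative],
        streaming: false,
    };
    println!("{:?}", negotiate(&local, &remote));
}
```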
Stateless: Direct compress/decompress. No handshake, no state.
let engine = CodecEngine::new();
let compressed = engine.compress(json, Algorithm::M2M)?;

Session-based: HELLO/ACCEPT capability negotiation, PING/PONG keep-alive, graceful CLOSE.
let mut session = Session::new(Capabilities::default());
session.connect(&mut transport)?; // HELLO/ACCEPT
session.send(json)?; // DATA
session.close()?; // CLOSE

M2M's wire format exposes routing metadata without decompressing the payload:
#M2M|1|<header><payload>
          │
          └─ Model, provider, token count readable here
             Payload stays compressed
| Capability | Without M2M | With M2M |
|---|---|---|
| Route by model/provider | Decompress → Parse → Route | Read header → Route |
| Cost attribution | Parse every payload | Read token count from header |
| Traffic analytics | Full decompression pipeline | Header inspection only |
| Audit logging | Store raw or lose visibility | Compressed + inspectable |
Infrastructure-layer intelligence: Load balancers, API gateways, and observability tools can make routing decisions without parsing JSON or decompressing payloads.
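A gateway that routes on the model name could, in principle, look like the sketch below: skip the prefix and the 20-byte fixed header, read the model string from the start of the routing header, and choose an upstream without reading the payload. The 1-byte length prefix for the model string, the upstream pool names, and the function names are all assumptions for illustration.

```rust
const PREFIX: &[u8] = b"#M2M|1|";
const FIXED_HEADER_LEN: usize = 20;

/// Read the model name from the routing header without touching the payload.
/// Assumes the routing header starts with a 1-byte length followed by UTF-8 text.
fn peek_model(frame: &[u8]) -> Option<&str> {
    let routing = frame.strip_prefix(PREFIX)?.get(FIXED_HEADER_LEN..)?;
    let (&len, rest) = routing.split_first()?;
    std::str::from_utf8(rest.get(..len as usize)?).ok()
}

/// Hypothetical routing table keyed on the model name prefix.
fn pick_upstream(model: &str) -> &'static str {
    if model.starts_with("gpt-") {
        "openai-pool"
    } else if model.starts_with("claude-") {
        "anthropic-pool"
    } else {
        "default-pool"
    }
}

fn main() {
    let mut frame = PREFIX.to_vec();
    frame.extend_from_slice(&[0u8; FIXED_HEADER_LEN]); // fixed header, contents irrelevant here
    frame.push(6); // model length
    frame.extend_from_slice(b"gpt-4o");
    frame.extend_from_slice(&[0xDE, 0xAD, 0xBE, 0xEF]); // opaque remainder, never parsed

    let model = peek_model(&frame).expect("readable routing header");
    println!("{} -> {}", model, pick_upstream(model));
}
```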
Traditional security operates at the network layer. Cognitive Security operates at the semantic layer:
┌─────────────────────────────────────────────────────────────────┐
│                         SECURITY LAYERS                         │
├─────────────────────────────────────────────────────────────────┤
│ Network Security     │ TLS, firewalls, IP rules                 │
│ (can't see content)  │ "Is this connection allowed?"            │
├──────────────────────┼──────────────────────────────────────────┤
│ Cognitive Security   │ Semantic analysis, intent detection      │
│ (understands meaning)│ "Is this agent trying to jailbreak?"     │
└──────────────────────┴──────────────────────────────────────────┘
use m2m::SecurityScanner;
let scanner = SecurityScanner::new().with_blocking(0.8);
let result = scanner.scan("Ignore all previous instructions")?;
if !result.safe {
// Blocked at protocol level — never reaches downstream agent
println!("Threats: {:?}", result.threats);
}

| Threat | Detection Method |
|---|---|
|---|---|
| Prompt injection | Semantic pattern analysis |
| Jailbreak attempts | DAN/developer mode detection |
| Data exfiltration | Environment/path pattern matching |
| Malformed payloads | Encoding attack detection |
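To show the shape of a score-and-threshold scanner, here is a deliberately simple stand-in that uses case-insensitive substring matching. It is not how `SecurityScanner` works (the table above describes semantic analysis), and the patterns, weights, labels, and names are invented for this example. Only the blocking threshold of 0.8 mirrors the `with_blocking(0.8)` call shown earlier.

```rust
#[derive(Debug)]
struct ScanResult {
    safe: bool,
    score: f32,
    threats: Vec<&'static str>,
}

// (needle, threat label, weight): illustrative values only.
const PATTERNS: &[(&str, &str, f32)] = &[
    ("ignore all previous instructions", "prompt_injection", 0.9),
    ("developer mode", "jailbreak", 0.8),
    ("/etc/passwd", "data_exfiltration", 0.85),
];

/// Flag content whose highest matching weight reaches the blocking threshold.
fn scan(content: &str, block_threshold: f32) -> ScanResult {
    let lowered = content.to_lowercase();
    let mut score = 0.0f32;
    let mut threats = Vec::new();
    for &(needle, label, weight) in PATTERNS {
        if lowered.contains(needle) {
            threats.push(label);
            score = score.max(weight);
        }
    }
    ScanResult { safe: score < block_threshold, score, threats }
}

fn main() {
    let result = scan("Ignore all previous instructions", 0.8);
    if !result.safe {
        println!("blocked (score {:.2}): {:?}", result.score, result.threats);
    }
    println!("{:?}", scan("What is the weather today?", 0.8));
}
```

Substring lists like this are easy to evade with paraphrasing, which is the gap semantic detection is meant to close.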
Optional (--features crypto):
| Feature | Algorithm | Purpose |
|---|---|---|
| HMAC | SHA-256 | Message authentication |
| AEAD | ChaCha20-Poly1305 | Authenticated encryption |
| Key Exchange | X25519 | Ephemeral key agreement |
| Key Derivation | HKDF-SHA256 | Hierarchical key derivation |
Security as a protocol guarantee: Every M2M-speaking agent gets the same threat detection and crypto primitives. No per-agent implementation. No gaps.
| Algorithm | Wire Format | Compression | Best For |
|---|---|---|---|
| M2M (default) | `#M2M\|1\|<header><payload>` | 40-70% | LLM API JSON, routing-aware |
| TokenNative | `#TK\|<enc>\|<tokens>` | 30-50% | Token ID transmission |
| Brotli | `#M2M[v3.0]\|DATA:<b64>` | 60-80% | Large payloads (>1KB) |
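Read literally, the "Best For" column suggests a simple rule of thumb, sketched below. This is a hypothetical heuristic for illustration; it is neither what `compress_auto` does nor how Hydra selects an algorithm, and the `choose` function and its parameters are not part of the crate.

```rust
// Stand-in enum for illustration; not the m2m crate's `Algorithm` type.
#[derive(Debug, PartialEq)]
enum Algorithm { M2M, TokenNative, Brotli }

/// Pick an algorithm from payload size and kind, following the table's "Best For" column.
fn choose(payload_len: usize, is_token_id_stream: bool) -> Algorithm {
    if is_token_id_stream {
        Algorithm::TokenNative // payload is already a sequence of token IDs
    } else if payload_len > 1024 {
        Algorithm::Brotli // large payloads (>1KB)
    } else {
        Algorithm::M2M // default for LLM API JSON
    }
}

fn main() {
    println!("{:?}", choose(68, false));
    println!("{:?}", choose(8_192, false));
    println!("{:?}", choose(512, true));
}
```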
// Automatic (recommended)
let result = engine.compress_auto(json)?;
// Explicit
let result = engine.compress(json, Algorithm::M2M)?;
// ML-assisted (requires Hydra model)
let result = engine.compress_with_hydra(json)?;

| Approach | Wire Size (% of raw) | Tokens (change vs. raw) | Encode | Decode | Headers Readable |
|---|---|---|---|---|---|
| Raw JSON | 100% | 0% | 0 | 0 | Yes |
| Gzip + Base64 | ~52% | +38% | ~0.5ms | ~0.3ms | No |
| Brotli + Base64 | ~40% | +25% | ~2ms | ~0.5ms | No |
| Protobuf | ~50% | N/A | ~0.2ms | ~0.2ms | No |
| M2M | ~45% | -40% | 0.24ms | 0.15ms | Yes |
M2M optimizes for the metrics that matter in agent-to-agent communication:
- Sub-millisecond latency: Encode + decode < 0.5ms total
- Token reduction: Fewer tokens = faster LLM processing
- Routing without decompression: Headers always readable

Use M2M for:
- Agent-to-agent communication over HTTP/QUIC
- LLM API traffic where token count affects latency
- Systems requiring payload inspection at the infrastructure layer
- Multi-agent architectures needing standardized security

M2M is a poor fit for:
- Non-LLM traffic (use gzip/brotli directly)
- Systems already using efficient binary protocols end-to-end (gRPC, Cap'n Proto)
- Single-agent systems with no inter-agent communication
- Environments where payload inspection isn't needed

| Metric | Value |
|---|---|
| Compression | ~0.24ms |
| Decompression | ~0.15ms |
| Security scan | ~0.20ms |
| Throughput | 4,000+ req/sec |
Hydra is an ML classifier for intelligent algorithm selection. Native Rust inference — no ONNX/Python runtime required.
make model-download

Version 0.4.0: 268 tests passing.
| Feature | Status |
|---|---|
| M2M Wire Format v1 | Stable |
| Agentic Observability | Stable |
| Cognitive Security | Stable |
| HMAC/AEAD crypto | Stable |
| Hydra ML routing | Stable |
| QUIC/HTTP3 | Experimental |
Apache-2.0 — INFERNET