A reverse-engineered Rust library implementing OpenAI's Harmony response format for structured conversational AI interactions.

terraprompt/harmony-protocol

Harmony Protocol

A reverse-engineered Rust library implementing OpenAI's Harmony response format for structured conversational AI interactions.

⚠️ IMPORTANT: This Library Requires an OpenAI Model

This library does NOT include an AI model. It provides the conversation formatting and parsing layer that works with OpenAI's models that understand the Harmony format. You still need:

  • OpenAI API access or compatible model
  • A model that understands Harmony formatting (<|start|>, <|message|>, <|end|> tokens)
  • Integration code to send formatted tokens to the model and receive responses

What this library does: Formats conversations → [Your OpenAI Model] → Parses responses

Overview

This library provides a complete implementation of the Harmony response format used by OpenAI's open-weight model series (gpt-oss). It enables parsing and rendering of structured conversations with support for:

  • Multiple communication channels (analysis, commentary, final)
  • Tool calling and function integration
  • Reasoning effort control
  • Streaming token parsing
  • System and developer instructions

Key Features

🚀 High Performance

  • Rust-based core with minimal overhead
  • Thread-local regex optimization
  • Efficient tokenization with BPE encoding
  • Memory-efficient streaming parser

🔧 Flexible Architecture

  • Support for multiple encoding configurations
  • Extensible tool system with namespaces
  • Configurable channel routing
  • Role-based message validation

🌐 Multi-Platform Support

  • Native Rust library
  • Python bindings (PyO3) with full API compatibility
  • WebAssembly support with interactive demo
  • Cross-platform vocabulary download and caching

📊 Production Ready

  • Comprehensive test suite (13 tests passing)
  • Performance benchmarks for all operations
  • Graceful error handling and network failure recovery
  • 4 detailed examples with documentation
  • Thread-safe concurrent processing

Quick Start

Installation

Add to your Cargo.toml:

[dependencies]
harmony-protocol = { git = "https://github.com/terraprompt/harmony-protocol" }

Basic Usage

use harmony_protocol::{
    load_harmony_encoding, HarmonyEncodingName,
    chat::{Role, Message, Conversation, SystemContent}
};

fn main() -> anyhow::Result<()> {
    // Load the encoding
    let enc = load_harmony_encoding(HarmonyEncodingName::HarmonyGptOss)?;

    // Create a conversation
    let convo = Conversation::from_messages([
        Message::from_role_and_content(
            Role::System,
            SystemContent::new()
                .with_required_channels(["analysis", "commentary", "final"])
        ),
        Message::from_role_and_content(Role::User, "Hello, world!"),
    ]);

    // Render for completion (ready to send to OpenAI model)
    let input_tokens = enc.render_conversation_for_completion(&convo, Role::Assistant, None)?;
    println!("Generated {} tokens ready for OpenAI model", input_tokens.len());

    // TODO: Send input_tokens to your OpenAI model and get response_tokens
    // let response_tokens = your_openai_client.complete(input_tokens).await?;

    // Parse the model's response back to structured messages
    // let messages = enc.parse_messages_from_completion_tokens(response_tokens, Some(Role::Assistant))?;

    Ok(())
}

With Tool Support

use harmony_protocol::chat::{
    SystemContent, ToolDescription, ToolNamespaceConfig, Message, Role
};

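fn main() -> anyhow::Result<()> {
    // Describe a custom function tool: name, description, and a JSON Schema for its parameters
    let tools = vec![
        ToolDescription::new(
            "calculate",
            "Performs mathematical calculations",
            Some(serde_json::json!({
                "type": "object",
                "properties": {
                    "expression": {"type": "string"}
                },
                "required": ["expression"]
            }))
        )
    ];

    // Group the custom tools under the "functions" namespace
    let function_namespace = ToolNamespaceConfig::new("functions", None, tools);

    // Enable the built-in browser tool alongside the custom namespace
    let system_content = SystemContent::new()
        .with_browser_tool()
        .with_tools(function_namespace);

    // Wrap the tool configuration in a system message (unused here, hence the underscore)
    let _message = Message::from_role_and_content(Role::System, system_content);
    Ok(())
}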

Streaming Parser

use harmony_protocol::{StreamableParser, load_harmony_encoding, HarmonyEncodingName};
use harmony_protocol::chat::Role;

fn main() -> anyhow::Result<()> {
    let encoding = load_harmony_encoding(HarmonyEncodingName::HarmonyGptOss)?;
    let mut parser = StreamableParser::new(encoding.clone(), Some(Role::Assistant))?;

    // In practice, response_tokens would come from your OpenAI model's streaming API
    let response_tokens = vec![200006, 1234, 5678]; // These would be from OpenAI

    // Process tokens as they arrive from the model
    for token in response_tokens {
        parser.process(token)?;

        // Get content delta for real-time streaming UI updates
        if let Ok(Some(delta)) = parser.last_content_delta() {
            print!("{}", delta); // Show new content to user immediately
        }
    }

    // Get final structured messages after streaming is complete
    let messages = parser.into_messages();
    println!("\nParsed {} messages from model output", messages.len());
    Ok(())
}
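
The delta-based flow above can be illustrated without the library. The following is a minimal sketch, not the library's `StreamableParser`: it assumes each special token arrives as its own already-decoded text piece, and the `ToyStream` type and its fields are hypothetical names introduced only for this illustration.

```rust
/// Toy incremental parser over decoded text pieces (illustrative only).
#[derive(Default)]
struct ToyStream {
    in_content: bool,           // true between <|message|> and <|end|>
    current: String,            // content accumulated for the open message
    finished: Vec<String>,      // content of completed messages
    last_delta: Option<String>, // content added by the most recent piece
}

impl ToyStream {
    /// Feed one decoded piece; special tokens are assumed to arrive whole.
    fn process(&mut self, piece: &str) {
        self.last_delta = None;
        match piece {
            "<|message|>" => self.in_content = true,
            "<|end|>" if self.in_content => {
                self.in_content = false;
                self.finished.push(std::mem::take(&mut self.current));
            }
            _ if self.in_content => {
                self.current.push_str(piece);
                self.last_delta = Some(piece.to_string());
            }
            _ => {} // header pieces (role, channel) are ignored in this sketch
        }
    }
}
```

After each call to `process`, a caller would check `last_delta` to update the UI, then read `finished` once the stream ends. A real parser also has to track roles, channels, and token IDs rather than text.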

Message Format

The Harmony format structures conversations using special tokens:

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|>
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>I need to perform a simple arithmetic calculation.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 equals 4.<|end|>
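
To make the layout concrete, here is a minimal text-level sketch of how one message maps onto these delimiters. It is not the library's renderer (which produces token IDs); the `Msg` type and both functions are hypothetical, and the sketch assumes messages are concatenated with no separator between them.

```rust
/// One message in a Harmony-style conversation (illustrative type only).
struct Msg<'a> {
    role: &'a str,
    channel: Option<&'a str>,
    content: &'a str,
}

/// Render a single message as text: start, role, optional channel, message, content, end.
fn render_message(msg: &Msg<'_>) -> String {
    let mut out = String::from("<|start|>");
    out.push_str(msg.role);
    if let Some(channel) = msg.channel {
        out.push_str("<|channel|>");
        out.push_str(channel);
    }
    out.push_str("<|message|>");
    out.push_str(msg.content);
    out.push_str("<|end|>");
    out
}

/// Render a conversation by concatenating its messages (no separator assumed).
fn render_conversation(msgs: &[Msg<'_>]) -> String {
    msgs.iter().map(render_message).collect()
}
```

Calling `render_message` on a user message with no channel yields `<|start|>user<|message|>...<|end|>`, matching the user line in the example above.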

Channel System

The library supports multiple communication channels for organized model outputs:

  • analysis: Internal reasoning and analysis
  • commentary: Model explanations and meta-commentary
  • final: User-facing final responses

Channels can be declared as required in the system message, and the library automatically drops analysis messages once the corresponding final response is complete.
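
One way to picture analysis dropping is as a filter over channel-tagged messages. The sketch below is an assumption about the general idea, not the library's actual rule, which may be scoped more narrowly (for example, per turn); `ChannelMsg` and `drop_analysis_if_final` are hypothetical names.

```rust
/// A message tagged with its channel (illustrative type only).
#[derive(Debug, Clone, PartialEq)]
struct ChannelMsg {
    channel: String,
    content: String,
}

/// If a final message is present, keep everything except analysis messages;
/// otherwise return the messages unchanged.
fn drop_analysis_if_final(msgs: &[ChannelMsg]) -> Vec<ChannelMsg> {
    let has_final = msgs.iter().any(|m| m.channel == "final");
    if !has_final {
        return msgs.to_vec();
    }
    msgs.iter()
        .filter(|m| m.channel != "analysis")
        .cloned()
        .collect()
}
```

With this rule, reasoning stays in the history while a response is still being produced and is removed once the user-facing answer exists.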

Tool Integration

Built-in Tool Namespaces

  1. Browser Tools: Web browsing, search, and content extraction
  2. Python Tools: Code execution environment
  3. Function Tools: Custom function definitions

Custom Tools

use harmony_protocol::chat::ToolDescription;

fn main() {
    let custom_tool = ToolDescription::new(
        "weather",
        "Gets current weather for a location",
        Some(serde_json::json!({
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }))
    );

    println!("Created custom tool: {}", custom_tool.name);
}

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Chat Module   │    │ Encoding Module │    │ Registry Module │
│                 │    │                 │    │                 │
│ • Message       │◄──►│ • Rendering     │◄──►│ • Configurations│
│ • Conversation  │    │ • Parsing       │    │ • Token Mappings│
│ • Content Types │    │ • Streaming     │    │ • Vocab Loading │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         ▲                       ▲
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌─────────────────┐
│ Tiktoken Module │    │    Extensions   │
│                 │    │                 │
│ • BPE Encoding  │    │ • Public Vocabs │
│ • Tokenization  │    │ • Hash Verify   │
│ • Thread Safety │    │ • Remote Loading│
└─────────────────┘    └─────────────────┘

Special Tokens

Token ID Purpose
`<|start|>`
`<|message|>`
`<|end|>`
`<|channel|>`
`<|call|>`
`<|return|>`
`<|constrain|>`
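
As a rough illustration of how the delimiter tokens structure a message, the sketch below splits rendered text back into role, channel, and content. It works on plain text rather than token IDs, handles only `<|start|>`, `<|channel|>`, `<|message|>`, and `<|end|>`, and is not the library's parser; `Parsed` and `parse_text` are hypothetical names.

```rust
/// A message recovered from rendered text (illustrative type only).
#[derive(Debug, PartialEq)]
struct Parsed {
    role: String,
    channel: Option<String>,
    content: String,
}

/// Split rendered text into messages using the delimiter tokens.
fn parse_text(input: &str) -> Vec<Parsed> {
    let mut out = Vec::new();
    // Everything before the first <|start|> is ignored.
    for chunk in input.split("<|start|>").skip(1) {
        // The header (role and optional channel) precedes <|message|>.
        let (header, rest) = match chunk.split_once("<|message|>") {
            Some(parts) => parts,
            None => continue, // incomplete message: skip it
        };
        // The content runs up to <|end|>, or to the end of the chunk.
        let body = rest.split("<|end|>").next().unwrap_or(rest);
        let (role, channel) = match header.split_once("<|channel|>") {
            Some((r, c)) => (r, Some(c.to_string())),
            None => (header, None),
        };
        out.push(Parsed {
            role: role.trim().to_string(),
            channel,
            content: body.to_string(),
        });
    }
    out
}
```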

Configuration

Environment Variables

  • TIKTOKEN_ENCODINGS_BASE: Custom vocabulary file directory
  • TIKTOKEN_RS_CACHE_DIR: Custom cache directory
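
Directory overrides like these are usually resolved by reading the variable and falling back to a default when it is unset or empty. The sketch below shows that general pattern only; `resolve_dir` and the fallback path are hypothetical, and the library's own lookup logic and defaults may differ.

```rust
use std::env;
use std::path::PathBuf;

/// Return the directory named by `var`, or `fallback` if the variable
/// is unset, empty, or not valid Unicode.
fn resolve_dir(var: &str, fallback: &str) -> PathBuf {
    match env::var(var) {
        Ok(value) if !value.is_empty() => PathBuf::from(value),
        _ => PathBuf::from(fallback),
    }
}
```

For example, `resolve_dir("TIKTOKEN_RS_CACHE_DIR", "/tmp/harmony-cache")` returns the override when the variable is set and the placeholder path otherwise.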

Features

  • python-binding: Enable PyO3 Python bindings
  • wasm-binding: Enable WebAssembly support

Performance

  • Context Window: 1,048,576 tokens (1M)
  • Max Action Length: 524,288 tokens (512K)
  • Thread-Safe: Optimized for concurrent access
  • Memory Efficient: Token reuse and streaming parsing

Testing

# Run all tests (13 tests covering unit + integration)
cargo test

# Run performance benchmarks
cargo bench

# Run specific examples
cargo run --example basic_usage
cargo run --example tool_integration
cargo run --example streaming_parser
cargo run --example channel_management

The test suite includes comprehensive validation against canonical examples and edge cases.

Examples

The library includes 4 comprehensive examples:

  1. basic_usage.rs - Message creation and conversation rendering
  2. tool_integration.rs - Custom tools and function calling
  3. streaming_parser.rs - Real-time token processing
  4. channel_management.rs - Multi-channel workflows

See examples/README.md for detailed usage instructions.

Python Bindings

cd python
python setup.py build_rust
pip install -e .

import harmony_protocol as hr

# Same API as Rust, but in Python
encoding = hr.load_harmony_encoding(hr.HarmonyEncodingName.harmony_gpt_oss())
conversation = hr.Conversation.from_messages([
    hr.Message.from_role_and_content(hr.Role.user(), "Hello!")
])
tokens = encoding.render_conversation(conversation)

WebAssembly Demo

cd www
npm run build  # Requires wasm-pack
npm run serve  # Open http://localhost:8000

Performance

Run benchmarks to see performance characteristics:

cargo bench

Results show the library can handle:

  • Large conversations: 1000+ messages efficiently
  • Real-time streaming: Process tokens as they arrive from model
  • Concurrent access: Thread-safe for multiple conversations

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

Documentation

License

This project is licensed under the Apache License 2.0.

Disclaimer

This is a reverse-engineered implementation for educational and research purposes. It is not affiliated with or endorsed by OpenAI.
