Skip to content

0xDAEF0F/mcp-rust-docs-embed

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

84 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

mcp-rust-docs-embed

An MCP (Model Context Protocol) server that provides semantic search capabilities for Rust crate documentation through vector embeddings.

Overview

This MCP server enables AI assistants to search and understand Rust crate documentation by:

  • Generating JSON-formatted documentation using cargo doc
  • Creating vector embeddings of documentation content using OpenAI
  • Storing embeddings in Qdrant for efficient semantic search
  • Providing MCP tools for embedding and querying documentation

Features

  • Automatic Documentation Generation: Builds Rust documentation in JSON format using cargo's nightly toolchain
  • Semantic Search: Query documentation using natural language through vector embeddings
  • Version Management: Each crate version is stored separately for precise version-specific searches
  • Feature Support: Intelligently selects compatible features to avoid mutually exclusive conflicts
  • Async Operations: Long-running embedding tasks are handled asynchronously with status tracking

Prerequisites

  • Rust nightly toolchain (for JSON documentation output)
  • Qdrant vector database instance
  • OpenAI API key for embeddings

Installation

# Clone the repository
git clone https://github.com/yourusername/mcp-rust-docs-embed
cd mcp-rust-docs-embed

# Build the project
cargo build --release

Configuration

Create a .env file with the following variables:

# Required
QDRANT_URL=http://localhost:6334
OPENAI_API_KEY=your_openai_api_key

# Optional
QDRANT_API_KEY=your_qdrant_api_key
PORT=8080  # Default: 8080

Usage

Starting the MCP Server

cargo run

The server will start on http://127.0.0.1:8080/sse (or your configured port).

Using with Claude Code or MCP Clients

To use this server with Claude Code or other MCP clients, add it to your MCP configuration:

Production Server

{
  "rust-docs": {
    "type": "sse",
    "url": "https://mcp-rust-docs-embed-production.up.railway.app/sse"
  }
}

Local Development

For local development, replace the URL with your local server:

{
  "rust-docs": {
    "type": "sse",
    "url": "http://127.0.0.1:8080/sse"
  }
}

Available MCP Tools

1. embed_docs

Generate and embed documentation for a Rust crate.

Parameters:

  • crate_name (required): Name of the crate to document
  • version (optional): Version to embed (defaults to latest)
  • features (optional): List of features to enable

Example:

{
  "crate_name": "tokio",
  "version": "1.40.0",
  "features": ["full"]
}

2. query_embeddings

Search embedded documentation using natural language.

Parameters:

  • query (required): Natural language search query
  • crate_name (required): Crate to search in
  • version (optional): Version to search (defaults to latest)
  • limit (optional): Number of results (default: 10)

Example:

{
  "query": "how to create an async tcp server",
  "crate_name": "tokio",
  "limit": 5
}

3. check_embed_status

Check the status of an ongoing embedding operation.

Parameters:

  • operation_id (required): ID returned by embed_docs

4. list_embedded_crates

List all crates and versions that have been embedded.

Architecture

Documentation Pipeline

  1. Documentation Generation (docs_builder.rs)

    • Creates a temporary Cargo project
    • Adds target crate as dependency with intelligently selected features:
      • Uses "full" feature if available (typically includes all compatible features)
      • Falls back to "default" feature if "full" doesn't exist
      • Uses no features if neither exists (to avoid mutually exclusive conflicts)
    • Runs cargo +nightly doc --output-format=json
  2. JSON Processing (doc_loader.rs, json_types.rs)

    • Parses Rust's JSON documentation format
    • Extracts structured information about items, functions, types, etc.
  3. Embedding Generation (documentation.rs)

    • Chunks documentation content appropriately
    • Generates embeddings using OpenAI's text-embedding-3-small model
    • Batches embeddings for efficiency
  4. Storage (data_store.rs)

    • Stores embeddings in Qdrant collections
    • Collection naming: {crate_name}_v{version} (normalized)
    • Includes source content for retrieval

Query System

The QueryService (query.rs) handles:

  • Converting queries to embeddings
  • Searching Qdrant for similar documentation
  • Returning relevant documentation snippets with similarity scores

Development

Key Dependencies

  • rmcp - Rust MCP SDK
  • qdrant-client - Vector database client
  • cargo - Programmatic cargo operations
  • async-openai - OpenAI API client
  • axum - Web framework for SSE server
  • tokio - Async runtime

Limitations

  • Requires Rust nightly for JSON documentation output
  • OpenAI API costs for embedding generation
  • Storage requirements grow with number of embedded crates
  • May not document all features if a crate has mutually exclusive features (defaults to "full" or "default" feature)

About

sse mcp server that embeds rust docs (with chunking) and allows for querying them from any mcp client

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages