A lightweight, domain-agnostic hybrid search engine for markdown (.md) corpora, exposed via the Model Context Protocol (MCP). While it can index and serve any markdown corpus, it is deeply optimized for serving SDK documentation to AI coding agents. Beta.
- Hybrid search — full-text, phrase proximity, and vector similarity blended via Reciprocal Rank Fusion
- Distributed manifests — per-directory `.docs-mcp.json` files configure chunking strategy, metadata, and taxonomy independently per subtree
- Faceted taxonomy — metadata keys become enum-injected JSON Schema filters on the search tool
- Vector collapse — deduplicates near-identical cross-language results at search time
- Incremental builds — embedding cache fingerprints each chunk; only changed content is re-embedded
- Graceful degradation
  - Chunking — chunk sizes adapt to the configured embedding provider's context window; falls back to conservative defaults when no provider is set
  - Query — if the embedding API errors at runtime (downtime, expired credits, network issues), the server falls back to FTS-only search with a one-time warning
Docs MCP provides a local, in-memory search engine (powered by LanceDB) that runs inside a Node.js MCP server. Three core optimizations make it effective for structured documentation:
Metadata keys defined in .docs-mcp.json manifests become enum-injected JSON Schema parameters on the search_docs tool. The agent selects from a strict set of valid filter values (e.g. language: ["typescript", "python", "go"]). On zero results, the server returns structured hints (e.g. "0 results for 'typescript'. Matches found in: ['python']").
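The generated parameter surfaces to the agent roughly like this (an illustrative shape assuming a `language` key with those three values, not the server's exact output):

```json
{
  "name": "search_docs",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string" },
      "language": { "type": "string", "enum": ["typescript", "python", "go"] }
    },
    "required": ["query"]
  }
}
```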
SDK documentation for the same API operation across multiple languages produces near-identical embeddings. Vector collapse deduplicates these at search time, keeping only the highest-scoring variant per taxonomy field:
{ "taxonomy": { "language": { "vector_collapse": true } } }When the agent explicitly filters by language, collapse is automatically skipped — the filter already restricts to a single variant.
Search combines three ranking signals via Reciprocal Rank Fusion:
- Full-text search — multi-field matching on headings (boosted 3x) and content
- Phrase proximity — rewards results where query terms appear close together
- Vector similarity — semantic embedding distance (when an embedding provider is configured)
FTS dominates for exact class names and error codes. Vector similarity lifts conceptual and paraphrased queries. The blend is configurable via RRF weights.
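As a sketch of how the fusion works (illustrative only; `k = 60` is the conventional RRF constant, and the weights here are placeholders rather than the server's defaults):

```typescript
// Reciprocal Rank Fusion: each signal contributes weight / (k + rank),
// so agreement across signals lifts a chunk more than one strong rank.
type RankedList = string[]; // chunk IDs, best match first

function rrfFuse(
  signals: { ranking: RankedList; weight: number }[],
  k = 60, // damping constant; larger k flattens rank differences
): Map<string, number> {
  const scores = new Map<string, number>();
  for (const { ranking, weight } of signals) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return scores; // sort descending by score for the final result order
}

// Blend the three signals: FTS, phrase proximity, vector similarity.
const fused = rrfFuse([
  { ranking: ["auth-init", "auth-refresh"], weight: 1.0 }, // full-text
  { ranking: ["auth-refresh", "auth-init"], weight: 0.5 }, // phrase proximity
  { ranking: ["pagination", "auth-init"], weight: 1.0 },   // vector similarity
]);
```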
Ancestor headings (breadcrumbs like Auth SDK > AcmeAuthClientV2 > Initialization) are prepended to each chunk's embedding input and returned with search results. This enables the calling agent to explore the corpus structure, navigating from high-level concepts down to specific implementation details.
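For instance, a chunk under the initialization docs might be embedded (and returned) with input shaped like this (content invented for illustration):

```text
Auth SDK > AcmeAuthClientV2 > Initialization

Initialize the client with your API key before calling any other method...
```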
On a realistic ~300-operation API with hand-written guides (~28.8MB corpus, 5 eval categories), benchmarked with the docs-mcp-eval harness:
| Metric | none | openai | Takeaway |
|---|---|---|---|
| MRR@5 | 0.2141 | 0.2833 | Embeddings lift relevant-result ranking by 32% |
| NDCG@5 | 0.2536 | 0.3218 | Graded relevance improves 27% with embeddings |
| Facet Precision | 0.3750 | 0.4375 | Embeddings improve filter accuracy by 17% |
| Search p50 (ms) | 5.2 | 258.4 | FTS-only is ~50x faster at median |
| Search p95 (ms) | 6.5 | 11101.1 | Tail latency dominated by embedding API |
| Build Time (ms) | 6022 | 1569703 | Embedding uses batch API for large corpora |
| Peak RSS (MB) | 221.1 | 283.8 | Modest memory overhead |
| Index Size (corpus 28.8MB) | 104.9MB | 356.9MB | Vectors ~3.4x the FTS-only index |
| Embed Cost (est.) | $0 | $0.9825 | ~$1 one-time cost per corpus |
| Query Cost (est.) | $0 | $0.000003 | Negligible per-query cost |
MRR@5 (Mean Reciprocal Rank at 5) measures how high the first relevant result appears in the top 5. 1.0 = always ranked first; 0.0 = never appears in top 5.
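In symbols, averaged over the query set $Q$, where $\operatorname{rank}_q$ is the position of the first relevant result and queries with no relevant result in the top 5 contribute 0:

$$
\mathrm{MRR@5} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\operatorname{rank}_q}
$$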
| Category | none | openai | Takeaway |
|---|---|---|---|
| clarification | 0.3000 | 0.3000 | FTS matches embeddings |
| cross-service | 0.1667 | 0.3333 | Embeddings double MRR |
| exact-name | 0.3625 | 0.3792 | FTS nearly matches embeddings |
| natural-language | 0.0731 | 0.1692 | Embeddings lift 130% |
| workflow | 0.3333 | 0.4444 | Embeddings lift 33% |
We recommend starting with FTS-only search. While embeddings improve relevance for conceptual and paraphrased queries, they also introduce ~50x query latency and substantial build overhead. For agents that iterate through multiple searches, the faster cycle time of pure FTS has anecdotally proven more valuable than the per-query relevance lift — particularly with modern models capable of query refinement.
- No embeddings (`--embedding-provider none`): FTS-only search, zero cost, zero API keys. Already effective for exact-match and lexical queries.
- With embeddings (`--embedding-provider openai`): Hybrid search with better recall on conceptual and paraphrased queries. ~$1 one-time embedding cost per 28.8MB corpus.
- Runtime degradation: If the embedding API is unavailable at query time, the server automatically falls back to FTS-only with a one-time warning.
The indexer processes .md (Markdown) files. Files are discovered via the **/*.md glob pattern within the configured docs directory. YAML frontmatter is supported for per-file metadata and chunking overrides.
Documentation corpora use .docs-mcp.json manifests to control chunking and taxonomy. Manifests can be placed at any level of the directory tree:
```text
my-docs/
├── .docs-mcp.json          ← root manifest (applies to guides/)
├── guides/
│   ├── retries.md
│   └── pagination.md
└── sdks/
    ├── typescript/
    │   ├── .docs-mcp.json  ← deeper manifest (exclusive precedence)
    │   └── auth.md
    └── python/
        ├── .docs-mcp.json  ← deeper manifest (exclusive precedence)
        └── auth.md
```
Deeper manifests take exclusive precedence. A file at sdks/typescript/auth.md is governed only by sdks/typescript/.docs-mcp.json — the root manifest is ignored for that subtree.
Full schema: schemas/docs-mcp.schema.json
Individual files can also override their manifest via YAML frontmatter (mcp_chunking_hint, metadata keys). Frontmatter takes highest precedence. See the manifest contract for full resolution rules.
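A sketch of such frontmatter (hedged: `mcp_chunking_hint` is the documented key, but the value here assumes it mirrors the manifest's `chunk_by` options, and `language` stands in for whatever metadata keys your taxonomy defines):

```yaml
---
mcp_chunking_hint: h3   # assumed to accept the same values as chunk_by
language: python        # metadata override; merges into this file's chunks
---
```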
Structured as a Turborepo with four packages:
| Package | Role |
|---|---|
| `@speakeasy-api/docs-mcp-cli` | CLI for validation, manifest bootstrap (`fix`), and deterministic indexing (`build`) |
| `@speakeasy-api/docs-mcp-core` | Core retrieval primitives, AST parsing, chunking, and LanceDB queries |
| `@speakeasy-api/docs-mcp-server` | Lean runtime MCP server surface |
| `@speakeasy-api/docs-mcp-eval` | Standalone evaluation and benchmarking harness |
```text
+---------------------------+
|     Agent / MCP Host      |
+-------------+-------------+
              |
              | Dynamic Tool Schema (with Enums)
              v
+---------------------------+
|      @speakeasy-api/      |
|      docs-mcp-server      |
|   search_docs, get_doc    |
+-------------+-------------+
              |
              v
+---------------------------+
|      @speakeasy-api/      |
|       docs-mcp-core       |
|      LanceDB Engine       |
|     Memory-Mapped IO      |
+-------------+-------------+
              |
              v
     +-----------------+
     | .lancedb/ index |
     +-----------------+
```
The tools exposed to the agent are dynamically generated based on your corpus_description and indexed metadata.
| Tool | What it does |
|---|---|
| `search_docs` | Performs hybrid search. Tool names and descriptions are user-configurable. Parameters are dynamically generated with valid taxonomy injected as JSON Schema enums. Supports stateless cursor pagination. Returns fallback hints on zero results. |
| `get_doc` | Returns a specific chunk, plus `context: N` neighboring chunks for surrounding detail. |
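On the wire, a search is an MCP `tools/call` request along these lines (illustrative; the filter parameters depend on your taxonomy):

```json
{
  "method": "tools/call",
  "params": {
    "name": "search_docs",
    "arguments": {
      "query": "refresh an expired access token",
      "language": "typescript"
    }
  }
}
```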
No API keys required. Zero-config full-text search:
```dockerfile
# --- build stage ---
FROM node:22-slim AS build
RUN npm install -g @speakeasy-api/docs-mcp-cli
ARG DOCS_DIR=docs
COPY ${DOCS_DIR} /corpus
RUN docs-mcp build --docs-dir /corpus --out /index --embedding-provider none

# --- runtime stage ---
FROM node:22-slim
RUN npm install -g @speakeasy-api/docs-mcp-server
COPY --from=build /index /index
EXPOSE 20310
CMD ["docs-mcp-server", "--index-dir", "/index", "--transport", "http", "--port", "20310"]
```

```bash
docker build --build-arg DOCS_DIR=./docs -t docs-mcp .
docker run -p 20310:20310 docs-mcp
```

For hybrid FTS + semantic search, add an OpenAI embedding provider. This improves recall on conceptual and paraphrased queries at the cost of higher latency and ~$1 one-time embedding cost per 28.8MB corpus.
```dockerfile
# --- build stage ---
FROM node:22-slim AS build
RUN npm install -g @speakeasy-api/docs-mcp-cli
ARG DOCS_DIR=docs
COPY ${DOCS_DIR} /corpus
RUN --mount=type=secret,id=OPENAI_API_KEY \
    OPENAI_API_KEY=$(cat /run/secrets/OPENAI_API_KEY) \
    docs-mcp build --docs-dir /corpus --out /index --embedding-provider openai

# --- runtime stage ---
FROM node:22-slim
RUN npm install -g @speakeasy-api/docs-mcp-server
COPY --from=build /index /index
EXPOSE 20310
CMD ["docs-mcp-server", "--index-dir", "/index", "--transport", "http", "--port", "20310"]
```

```bash
docker build --secret id=OPENAI_API_KEY,env=OPENAI_API_KEY \
  --build-arg DOCS_DIR=./docs -t docs-mcp .
docker run -p 20310:20310 -e OPENAI_API_KEY docs-mcp
```

The build-stage secret is used to embed the corpus; the runtime `-e OPENAI_API_KEY` lets the server embed search queries.
To get a server and interactive playground running together:
```yaml
# docker-compose.yml
services:
  server:
    build:
      context: .
      # Uses the Dockerfile above (FTS-only or embeddings variant)
    ports:
      - "20310:20310"
    # Uncomment for embeddings:
    # environment:
    #   - OPENAI_API_KEY=${OPENAI_API_KEY}
  playground:
    image: node:22-slim
    command: >
      sh -c "npm install -g @speakeasy-api/docs-mcp-playground &&
             npx @speakeasy-api/docs-mcp-playground"
    ports:
      - "3001:3001"
    environment:
      - MCP_TARGET=http://server:20310
    depends_on:
      - server
```

```bash
docker compose up
```

Open http://localhost:3001 to explore the index interactively.
The MCP server supports two transport modes:
| Flag | Transport | Default | Use case |
|---|---|---|---|
| `--transport stdio` | Standard I/O | Yes (default) | Direct MCP client integration (e.g. Claude Desktop, Cursor) |
| `--transport http` | Streamable HTTP | No | Containerized deployments, playground, multi-client access |
When using HTTP transport, the server listens on port 20310 by default (configurable with --port).
stdio example (MCP client config):
```bash
npx @speakeasy-api/docs-mcp-server --index-dir ./dist/.lancedb
```

HTTP example:

```bash
npx @speakeasy-api/docs-mcp-server --index-dir ./dist/.lancedb --transport http --port 20310
```

1. Authoring (Local Dev)
The fix command scans all .md files in your docs directory, analyzes their heading structure (h1/h2/h3 frequency), and generates a .docs-mcp.json manifest with the best-fit chunking strategy per file. The most common strategy becomes the default; files that differ get pattern-based overrides.
```bash
npx @speakeasy-api/docs-mcp-cli fix --docs-dir ./docs
```

2. Indexing (CI Build Step)
Run the deterministic indexer against your corpus. The indexer reads manifests and frontmatter to chunk the docs, generates embeddings, and saves the local .lancedb directory. Cache the output directory across CI runs to make builds incremental — only changed chunks are re-embedded.
```yaml
- uses: actions/cache@v4
  with:
    path: ./dist/.lancedb
    # Unique key saves the updated cache after each build
    key: docs-mcp-${{ github.run_id }}
    # Prefix match loads the most recent prior cache
    restore-keys: docs-mcp-
- run: npx @speakeasy-api/docs-mcp-cli build --docs-dir ./docs --out ./dist/.lancedb
```

3. Runtime (MCP Server)
The .lancedb directory is packaged with the MCP server. FTS search is fully local. If the index was built with embeddings, the server calls the embedding API at query time to embed the search query.
```bash
npx @speakeasy-api/docs-mcp-server --index-dir ./dist/.lancedb
```

4. Playground (Optional)
Explore the index interactively in a browser. The playground connects to a running HTTP server and provides a search UI.
```bash
npx @speakeasy-api/docs-mcp-playground
```

Open http://localhost:3001. Requires a running HTTP server (step 3 with `--transport http`).
| Environment Variable | Description | Default |
|---|---|---|
| `MCP_TARGET` | HTTP endpoint of the MCP server | `http://localhost:20310` |
| `PORT` | Playground server port | `3001` |
| `SERVER_NAME` | Display name shown in the playground UI | `speakeasy-docs` |
| `PLAYGROUND_PASSWORD` | Password-protect the playground (hashed via SHA-256) | (none — open access) |
| `GRAM_API_KEY` | Enables chat mode when set | (none — chat disabled) |
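For example, to run the playground against a non-default server with password protection (illustrative values):

```bash
MCP_TARGET=http://docs-server.internal:20310 \
PLAYGROUND_PASSWORD=change-me \
npx @speakeasy-api/docs-mcp-playground
```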
Docs MCP includes a standalone evaluation harness for measuring search quality with transparent, repeatable benchmarks. See the Evaluation Framework for how to build an eval suite, run benchmarks across embedding providers, and interpret results.
```jsonc
{
  // Required. Schema version.
  "version": "1",

  // Chunking strategy applied to all files in this directory tree.
  "strategy": {
    "chunk_by": "h2",        // Split at ## headings. Options: h1, h2, h3, file
    "max_chunk_size": 8000,  // Oversized chunks split recursively at finer headings
    "min_chunk_size": 200    // Tiny trailing chunks merge into preceding chunk
  },

  // Key-value pairs attached to every chunk. Each key becomes a filterable
  // enum parameter on the search_docs tool.
  "metadata": {
    "language": "typescript",
    "scope": "sdk-specific"
  },

  // Per-field search behavior. vector_collapse deduplicates cross-language
  // variants at search time (only active when no filter is set for that field).
  "taxonomy": {
    "language": { "vector_collapse": true }
  },

  // File-pattern overrides. Evaluated top-to-bottom; last match wins.
  // Override metadata merges with root (override keys win).
  // Override strategy replaces root strategy entirely.
  "overrides": [
    { "pattern": "models/**/*.md", "strategy": { "chunk_by": "file" } }
  ]
}
```