# Add docs: design principles, FAQ, roadmap, architecture #134

Merged · 12 commits · Jun 27, 2025
16 changes: 15 additions & 1 deletion docs/README.md
@@ -1,5 +1,19 @@
# Official Registry Documentation

## Project Documentation

[`design_principles.md`](./design_principles.md) - Core constraints and principles guiding the registry design

[`faq.md`](./faq.md) - Frequently asked questions about the MCP Registry

[`roadmap.md`](./roadmap.md) - High-level roadmap for the MCP Registry development

[`MCP Developers Summit 2025 - Registry Talk Slides.pdf`](./MCP%20Developers%20Summit%202025%20-%20Registry%20Talk%20Slides.pdf) - Slides from a talk given at the MCP Developers Summit on May 23, 2025, with an up-to-date vision of how we are thinking about the official registry.

## API & Technical Specifications

[`openapi.yaml`](./openapi.yaml) - OpenAPI specification for the official registry API

[`api_examples.md`](./api_examples.md) - Examples of the data shapes returned by the official registry API

[`architecture.md`](./architecture.md) - Technical architecture, deployment strategies, and data flows
207 changes: 207 additions & 0 deletions docs/architecture.md
@@ -0,0 +1,207 @@
# MCP Registry Architecture

This document describes the technical architecture of the MCP Registry, including system components, deployment strategies, and data flows.

## System Overview

The MCP Registry is designed as a lightweight metadata service that bridges MCP server creators with consumers (MCP clients and aggregators).

```mermaid
graph TB
subgraph "Server Maintainers"
CLI[CLI Tool]
end

subgraph "MCP Registry"
API[REST API<br/>Go]
DB[(MongoDB or PostgreSQL)]
CDN[CDN Cache]
end

subgraph "Intermediaries"
MKT[Marketplaces]
AGG[Aggregators]
end

subgraph "End Consumers"
MC[MCP Client Host Apps<br/>e.g. Claude Desktop]
end

subgraph "External Services"
NPM[npm Registry]
PYPI[PyPI Registry]
DOCKER[Docker Hub]
DNS[DNS Services]
GH[GitHub OAuth]
end

CLI --> |Publish| API
API --> DB
API --> CDN
CDN --> |Daily ETL| MKT
CDN --> |Daily ETL| AGG
MKT --> MC
AGG --> MC
API -.-> |Auth| GH
API -.-> |Verify| DNS
API -.-> |Reference| NPM
API -.-> |Reference| PYPI
API -.-> |Reference| DOCKER
```

## Core Components

### REST API (Go)

The main application server implemented in Go, providing:
- Public read endpoints for server discovery
- Authenticated write endpoints for server publication
- GitHub OAuth integration (extensible to other providers)
- DNS verification system (optional for custom namespaces)

### Database (MongoDB or PostgreSQL)

Primary data store for:
- Versioned server metadata (server.json contents)
- User authentication state
- DNS verification records

### CDN Layer

Critical for scalability:
- Caches all public read endpoints
- Reduces load on origin servers
- Enables global distribution
- Designed for daily consumer polling patterns

### CLI Tool

Developer interface for:
- Server publication workflow
- GitHub OAuth flow
- DNS verification

## Deployment Architecture

### Kubernetes Deployment (Helm)

The registry is designed to run on Kubernetes using Helm charts:

```mermaid
graph TB
subgraph "Kubernetes Cluster"
subgraph "Namespace: mcp-registry"
subgraph "Registry Service"
LB[Load Balancer<br/>:80]
RS[Registry Service<br/>:8080]
RP1[Registry Pod 1]
RP2[Registry Pod 2]
RP3[Registry Pod N]
end

subgraph "Database Service"
DBS[DB Service<br/>:27017]
SS[StatefulSet]
PV[Persistent Volume]
end

subgraph "Secrets"
GHS[GitHub OAuth Secret]
end
end
end

LB --> RS
RS --> RP1
RS --> RP2
RS --> RP3
RP1 --> DBS
RP2 --> DBS
RP3 --> DBS
DBS --> SS
SS --> PV
RP1 -.-> GHS
RP2 -.-> GHS
RP3 -.-> GHS
```

## Data Flow Patterns

### 1. Server Publication Flow

```mermaid
sequenceDiagram
participant Dev as Developer
participant CLI as CLI Tool
participant API as Registry API
participant DB as Database
participant GH as GitHub
participant DNS as DNS Provider

Dev->>CLI: mcp publish server.json
CLI->>CLI: Validate server.json
CLI->>GH: OAuth flow
GH-->>CLI: Access token
CLI->>API: POST /servers
API->>GH: Verify token
API->>DNS: Verify domain (if applicable)
API->>DB: Store metadata
API-->>CLI: Success
CLI-->>Dev: Published!
```
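
The CLI-side validation step above can be sketched as follows. The `serverJSON` fields and error messages are illustrative; the authoritative schema lives in `openapi.yaml`:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// serverJSON mirrors a few illustrative fields of server.json; the real
// schema is defined by the registry's OpenAPI specification.
type serverJSON struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// validate performs the kind of local checks the CLI would run before
// starting the OAuth flow, so obviously malformed files never reach the API.
func validate(raw []byte) error {
	var s serverJSON
	if err := json.Unmarshal(raw, &s); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}
	if !strings.Contains(s.Name, "/") {
		return errors.New("name must be namespaced, e.g. io.github.user/server")
	}
	if s.Version == "" {
		return errors.New("version is required")
	}
	return nil
}

func main() {
	fmt.Println(validate([]byte(`{"name":"io.github.alice/weather","version":"1.0.0"}`)))
	fmt.Println(validate([]byte(`{"name":"weather"}`)))
}
```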

### 2. Consumer Discovery Flow

```mermaid
sequenceDiagram
participant Client as MCP Client Host App
participant INT as Intermediary<br/>(Marketplace/Aggregator)
participant CDN as CDN Cache
participant API as Registry API
participant DB as Database

Note over INT,CDN: Daily ETL Process
INT->>CDN: GET /servers
alt Cache Hit
CDN-->>INT: Cached response
else Cache Miss
CDN->>API: GET /servers
API->>DB: Query servers
DB-->>API: Server list
API-->>CDN: Response + cache headers
CDN-->>INT: Response
end
INT->>INT: Process & enhance data
INT->>INT: Store in local cache

Note over Client,INT: Real-time Client Access
Client->>INT: Request server list
INT-->>Client: Curated/enhanced data
```
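
The daily ETL poll benefits from conditional requests. A self-contained sketch using an in-process fake CDN — ETag revalidation is an assumption about the cache layer here, not a documented guarantee:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newFakeCDN stands in for the CDN edge: it returns 304 Not Modified
// when the client's ETag matches the current one.
func newFakeCDN(etag, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("ETag", etag)
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		io.WriteString(w, body)
	}))
}

// poll performs one daily ETL fetch; pass the previously seen ETag to
// make the request conditional.
func poll(url, prevETag string) (status int, etag string, err error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	if prevETag != "" {
		req.Header.Set("If-None-Match", prevETag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, resp.Header.Get("ETag"), nil
}

func main() {
	cdn := newFakeCDN(`"v42"`, `[{"name":"io.example/weather"}]`)
	defer cdn.Close()

	// Day 1: full download, remember the ETag.
	status, etag, _ := poll(cdn.URL+"/servers", "")
	fmt.Println(status)

	// Day 2: nothing changed, so the CDN answers with a cheap 304.
	status, _, _ = poll(cdn.URL+"/servers", etag)
	fmt.Println(status)
}
```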

### 3. DNS Verification Flow

```mermaid
sequenceDiagram
participant User as User
participant CLI as CLI Tool
participant API as Registry API
participant DNS as DNS Provider
participant DB as Database

User->>CLI: mcp verify-domain example.com
CLI->>API: POST /verify-domain
API->>API: Generate verification token
API->>DB: Store pending verification
API-->>CLI: TXT record: mcp-verify=abc123
CLI-->>User: Add TXT record to DNS
User->>DNS: Configure TXT record
User->>CLI: Confirm added
CLI->>API: POST /verify-domain/check
API->>DNS: Query TXT records
DNS-->>API: TXT records
API->>API: Validate token
API->>DB: Store verification
API-->>CLI: Domain verified
CLI-->>User: Success!
```
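
The token check at the end of this flow is straightforward to sketch in Go. The `mcp-verify=` TXT prefix follows the example in the diagram; the lookup function is injected so the check stays testable (production code would pass `net.LookupTXT`):

```go
package main

import (
	"fmt"
	"strings"
)

// verifyDomain scans the domain's TXT records for the expected
// mcp-verify token. The resolver is injected so the check itself is
// pure; real code would pass net.LookupTXT here.
func verifyDomain(lookupTXT func(domain string) ([]string, error), domain, token string) (bool, error) {
	records, err := lookupTXT(domain)
	if err != nil {
		return false, err
	}
	want := "mcp-verify=" + token
	for _, rec := range records {
		if strings.TrimSpace(rec) == want {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Fake resolver standing in for the real DNS provider.
	fake := func(domain string) ([]string, error) {
		return []string{"v=spf1 -all", "mcp-verify=abc123"}, nil
	}
	ok, _ := verifyDomain(fake, "example.com", "abc123")
	fmt.Println(ok)
}
```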
39 changes: 39 additions & 0 deletions docs/design_principles.md
@@ -0,0 +1,39 @@
# MCP Registry Design Principles

These are the core constraints that guide the design of the MCP Registry. They are not exhaustive, but they are the most important principles that we will use to evaluate design decisions.

## 1. Single Source of Truth

The registry serves as the authoritative metadata repository for publicly-available MCP servers, both locally-run and remote, open source and closed source. Server creators publish once, and all consumers (MCP clients, aggregators, etc.) reference the same canonical data.

## 2. Minimal Operational Burden

- Design for low maintenance and operational overhead
- Delegate complexity to existing services where possible (GitHub for auth, npm/PyPI for packages)
- Avoid features that require constant human intervention or moderation
- Build for reasonable downtime tolerance (24h acceptable) by having consumers cache data for their end-users
**Collaborator:**

I think it's good to encourage caching of data but I'm hoping we can have a better SLA than 24 hours of downtime since publishers won't be able to publish new servers during that time.

**Contributor Author:**

My biggest concern about sub-24h is that anything under 24h necessitates an on-call schedule. If we frame it as 24h, then we can rely on volunteers coming online once a day.

I know it's not great for productivity if publication is offline for 24h, but maybe we start with a 24h SLA and once we get a sense for adoption + other maintenance needs, we could then seek to create a permanent operational position that is funded/sponsored by someone?


## 3. Vendor Neutrality

**Comment:**

No interest in blocking anyone from building and publishing a server - but say we released an in-service MCP server for OneDrive, how would that be differentiated from the 100 other implementations that have OneDrive in the name/description/access OD files? Substitute OneDrive for GDrive or any other service. This is the app store problem of a bunch of very similar things being very confusing for users.

Say I am a consumer, I know nothing of tech, and I search for "OneDrive" in MCP registry and get 100 results - how would I make a choice?

**Contributor:**

I think that amounts to a curation problem, which would be the purview of aggregators rather than the registry itself. We do plan to track download counts (#95), so perhaps aggregators could sort by those numbers. Aggregators can also implement their own ratings systems to recommend the most popular (and presumably official) servers. And I suppose paid placement is another option. 💀

**Comment:**

(I am not on the registry team, just focused on it because I'm lighting up NuGet based MCP servers)
Agreed that this mostly goes to the clients that read the data and "project" the data into whatever UX they want.
Additionally, the server name is a reverse DNS name so the official OneDrive MCP server could perhaps have a name like com.microsoft.onedrive/mcp or something like that. This would mean that the owner of microsoft.com is publishing this OneDrive MCP server. I am not sure when non-GitHub package names will come in. Maybe for the first release per #100.

**Comment:**

Then from the registry standpoint the plan is to externalize the risk.

It would be worth thinking through the implications of that approach, where without some indicator of official releases the consumer (think non-tech person - what's DNS? what's a github package?) is potentially placing their content at risk.

Anyone should be able to build/publish servers, while also having a way to mark official releases would be nice.

**Contributor:**

> Anyone should be able to build/publish servers, while also having a way to mark official releases would be nice.

How would the registry determine which packages are official and which are not? How would disputes be handled?

**Comment:**

npm has the organization model, which isn't flawless but groups things in a user-understandable way.

**Contributor Author:**

> Say I am a consumer, I know nothing of tech, and I search for "OneDrive" in MCP registry and get 100 results - how would I make a choice? ... (think non-tech person - what's DNS? what's a github package?) is potentially placing their content at risk.

I don't expect consumers to use the Registry API. They should only interface with MCP clients and their Registry mirrors, which they are responsible for augmenting and curating to the level of sophistication that matches their target persona. I don't think we can solve trust at the right level of granularity for all MCP uses through to the end-consumers, so the only scalable way to approach this is to federate out to the clients, who own the end-user UIs and can present them as they see fit (after augmenting them with their own notions of trust and curation/filtering).


- No preferential treatment for specific servers or organizations
- No built-in ranking, curation, or quality judgments
- Let consumers (MCP clients, aggregators) make their own curation decisions

## 4. Meets Industry Security Standards

- Leverage existing package registries (npm, PyPI, Docker Hub, etc.) for source code distribution, obviating the need to reinvent source code security
- Use mechanisms like DNS verification and OAuth to provide a base layer of authentication and trust
- Implement rate limiting, field validation, and blacklisting to prevent abuse

## 5. Reusable, Extensible Shapes; Not Infrastructure

- API shapes (OpenAPI, server.json) designed for reuse
- Enable private/internal registries using the same formats
- Don't mandate infrastructure reuse - focus on interface compatibility

## 6. Progressive Enhancement

- Start with MVP that provides immediate value
- Build foundation that supports future features
- Don't over-engineer for hypothetical needs
- Each milestone should be independently valuable