Add Docker image documentation and workflow #225
Merged
Add documentation for pre-built Docker images including usage examples and configuration guidance. Include GitHub Actions workflow for automated Docker image publishing.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
tadasant approved these changes on Aug 7, 2025
domdomegg added a commit that referenced this pull request on Aug 8, 2025
Adds a CI action to build docker images and publish them to GHCR on every commit. We can then use these when deploying the registry.

This turned out to be an upstream blocker of building the infra for deployment, and something like this is needed for either deployment approach we're exploring.

---

## Summary

- Add comprehensive documentation for pre-built Docker images in README
- Include usage examples and configuration guidance
- Add GitHub Actions workflow for automated Docker image publishing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
domdomegg added a commit that referenced this pull request on Aug 11, 2025
)

Original PR: #227

- Add Pulumi-based infrastructure as code for deploying MCP Registry to Kubernetes
- Support for both local development (minikube) and Azure Kubernetes Service (AKS)
- Complete deployment orchestration including:
  - cluster setup: e.g. you point this at an Azure account, and it can set up and manage the cluster for you. e.g. K8s version, number of nodes, type of nodes, ...
  - cloud agnostic K8s services: cert-manager, nginx-ingress
  - app services: MongoDB, and registry application (currently using nginx as a placeholder, blocked on #225 (as is #190), but should be a 1 line change)

## How is this different to #190

- Supports cluster setup and management. This enables:
  - Non-hosting maintainers managing many devops workflows (e.g. scaling up the cluster, or bumping K8s versions). Without this, we'd need to bug/page the organisation hosting the registry when we need these things changed.
  - Easily spinning up things like staging/temporary clusters, and letting contributors replicate the stack exactly on their own Azure accounts.
- Sets up cloud-agnostic services. For example, rather than using the Azure-managed ingresses and CA, we install nginx-ingress and cert-manager. This enables:
  - Running the entire infra stack locally (e.g. in minikube, k3s, orbstack, colima), making it much easier for contributors to test changes to infra stuff.
  - Moving between cloud providers much more easily, e.g. we could shift from Azure to GCP/AWS/other with minimal hassle.
- Everything stays written in Go, rather than Helm templates. This means we get things like type-checking etc. for free (which from my experience makes AI tools wayyy better at editing K8s stuff), and contributors don't need to learn a new language if they're already using Go.

## Testing

I've got this running well:

- locally in minikube
- on cloud in Azure (my personal Azure account)

<details><summary>Claude written architecture and security review</summary>
<p>

## Deployment Review & Assessment

### Current Architecture Strengths

**Pulumi IaC Approach**

- Well-structured infrastructure as code using Pulumi
- Multi-provider support (AKS, local) with clean abstraction
- Good separation of concerns in `pkg/` directory

**Security Fundamentals**

- Non-root container execution (`appuser` with UID 10001)
- Secrets properly managed via Kubernetes secrets
- TLS/SSL certificate management with cert-manager and Let's Encrypt

### Critical Issues & High-Priority Improvements

**1. Production Deployment Not Ready** 🚨

The registry deployment uses `nginx:alpine` placeholder image instead of the actual MCP registry:

- `deploy/pkg/k8s/registry.go:67` - TODO comments indicate incomplete setup
- Health probes are commented out
- Port mapping doesn't match actual application (80 vs 8080)

**Fix:** Build and publish actual registry container image to GHCR, update deployment

**2. Database Security Considerations** 🔒

- MongoDB deployed without authentication
- No backup/disaster recovery strategy
- Database credentials hardcoded

*Note: MongoDB is not exposed externally (ClusterIP service), so this is not a critical security risk but should be addressed for production.*

**3. Monitoring & Observability Gaps** 📊

- No Prometheus/Grafana monitoring stack
- No log aggregation (ELK/Loki)
- No application metrics/health dashboards
- No alerting configured

**4. High Availability & Reliability** ⚠️

- Single MongoDB instance (no replication)
- No persistent volume backup strategy
- Fixed 10Gi storage without growth planning
- Only 2 replicas for registry service
- No pod disruption budgets
- No horizontal pod autoscaling

### Recommended Improvements

**Immediate (High Priority)**

1. Complete Registry Deployment - Build proper container image pipeline, enable health checks
2. Secure MongoDB - Add authentication credentials, implement backup strategy

**Medium Priority**

3. Add Monitoring Stack - Prometheus, Grafana deployment
4. Security Hardening (Nice to Have) - RBAC policies, Network Policies, Pod Security Standards
5. CI/CD Pipeline Enhancement - Container image building/publishing, automated deployment

**Lower Priority**

6. High Availability - MongoDB replica set, HPA for registry pods
7. Operational Excellence - Kubernetes dashboard, cost optimization

### Configuration Issues

- Production config has test credentials: `deploy/Pulumi.prod.yaml:4-5`
- Missing environment-specific resource sizing
- Hardcoded domain names (`example.com`)

The deployment setup shows good architectural foundations but needs significant work before production readiness. The most critical issue is the placeholder nginx container - priority should be completing the actual registry application deployment before addressing the other improvements. Security measures like RBAC and Network Policies are nice to have but not strictly necessary given that MongoDB is not exposed externally.

🤖 Generated with [Claude Code](https://claude.ai/code)

</p>
</details>

## Metadata

Working towards #91

---------

Co-authored-by: Claude <noreply@anthropic.com>
domdomegg added a commit that referenced this pull request on Aug 12, 2025
Adds the Pulumi code to:

- Deploy the registry (and associated services e.g. mongodb) to Google Cloud Platform (GCP), on top of Google Kubernetes Engine (GKE)
- Sets up proper environments and secrets management
- Uses the real container image, now that it's published in #225. At the moment attached to latest, we might want to pin the version later (or perhaps always use `latest` in staging, and pin prod)
- Uses real domains (`staging.registry.modelcontextprotocol.io`) rather than examples (``)

## Motivation and Context

Setting up infrastructure to deploy it. I set something up in Azure in #227, although not super robust (e.g. no service accounts etc.).

Think we will use GCP as:

- the maintainers have experience with GCP, but none with Azure
- costs are quite low, and Anthropic is happy to cover them in the short term
- means we only have to maintain one login system (just Google Cloud Identity), not two (Google Workspace + Azure)

## How Has This Been Tested?

Deployed this to a staging and production cluster. Try it yourself at:

```bash
curl -H "Host: staging.registry.modelcontextprotocol.io" -k https://35.222.36.75/v0/ping
```

(will be sorting out domains very soon)

## Breaking Changes

NA - just adds support for GCP deployment

## Types of changes

<!-- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update

## Checklist

<!-- Go over all the following points, and put an `x` in all the boxes that apply. -->

- [x] I have read the [MCP Documentation](https://modelcontextprotocol.io)
- [x] My code follows the repository's style guidelines
- [ ] New and existing tests pass locally
- [x] I have added appropriate error handling
- [x] I have added or updated documentation as needed

## Additional context

<!-- Add any other context, implementation notes, or design decisions -->

Expected follow-ups:

- GitHub Action setup to deploy things to the cluster from GitHub, to avoid gatekeeping to just the people with the secrets.
Adds a CI action to build docker images and publish them to GHCR on every commit. We can then use these when deploying the registry.
This turned out to be an upstream blocker of building the infra for deployment, and something like this is needed for either deployment approach we're exploring.
Summary
🤖 Generated with Claude Code