Skip to content

lproux/Azure-AI-Gateway-Easy-Deploy

Repository files navigation

AI Gateway & Azure OpenAI Integration Repository

Comprehensive collection of Azure AI Gateway implementations, Azure OpenAI with API Management patterns, and AI development resources. This repository consolidates production-ready patterns for enterprise AI deployment on Azure.

Quick Start

Open in GitHub Codespaces Dev Container Python 3.12+ Azure CLI

Choose Your Development Environment

Cloud-Based (Recommended for Quick Start)

  • GitHub Codespaces: Click the badge above to launch a complete cloud environment with all dependencies pre-installed
  • No local setup required
  • Ready in 3-5 minutes

Local Development

  • VS Code Dev Container: Clone repo β†’ Open in VS Code β†’ Reopen in Container
  • Requires Docker Desktop
  • Consistent, isolated environment

Manual Setup

  • Install Python 3.12+, Azure CLI, and dependencies
  • Full control, works offline

See TESTING.md for detailed setup instructions.

Repository Overview

This repository contains three major implementation areas, each providing different approaches to building enterprise-grade AI solutions on Azure:

1. AI-Gateway - Azure AI Gateway Labs & Samples

πŸ“‚ Location: AI-Gateway/

Official Azure samples demonstrating Microsoft Copilot plugin interoperability and comprehensive AI Gateway patterns using Azure API Management.

Original Repository: Azure-Samples/AI-Gateway

What's Inside:

  • πŸ§ͺ 30+ Hands-on Labs covering AI agents, MCP integration, function calling, and production patterns
  • 🌟 Master Lab - Deploy once, explore 7 comprehensive labs in a single notebook experience
  • πŸ” Security - OAuth 2.0, JWT validation, managed identities
  • ⚑ Performance - Semantic caching, load balancing, multi-region deployments
  • πŸ’° Cost Management - Token limiting, FinOps framework, chargeback models
  • πŸ€– AI Agents - OpenAI Agents, Model Context Protocol (MCP), Azure AI Agent Service
  • πŸ“Š Observability - Built-in logging, token metrics, compliance monitoring

2. AzureOpenAI-with-APIM - Enterprise-Grade APIM Integration

πŸ“‚ Location: AzureOpenAI-with-APIM/

Production-ready reference implementation for managing Azure OpenAI through API Management, focusing on enterprise governance, cost control, and operational excellence.

Original Repository: microsoft/AzureOpenAI-with-APIM

What's Inside:

  • πŸš€ One-Click Deployment - Deploy APIM, Key Vault, and Log Analytics with auto-configuration
  • πŸ”„ Resiliency - Multi-region retry policies, automatic failover
  • πŸ“ˆ Scalability - Load balancing across multiple Azure OpenAI endpoints
  • πŸ’Ž Performance - Provisioned Throughput Units (PTU), priority-based routing
  • πŸ’΅ Cost Management - Token rate limiting, chargeback models, Power BI reporting
  • πŸ”’ Security - Managed identities, private endpoints, zero-trust architecture
  • πŸ“Š Monitoring - Log Analytics integration, KQL queries, usage tracking

3. ai-for-developers-main - AI Development Resources

πŸ“‚ Location: ai-for-developers-main/

Documentation, best practices, and guidance for building secure AI applications with MCP and GitHub Copilot integration.


πŸš€ Getting Started

Quick Deploy Options

Option 1: Master Lab (Recommended for Learning)

Deploy a comprehensive environment with 7 labs in a single Jupyter notebook:

cd AI-Gateway/labs/master-lab

# Deploy with Azure Developer CLI
az login
azd up

What gets deployed:

  • API Management (StandardV2)
  • 3 AI Foundry Hubs (multi-region)
  • 7 AI Models (GPT-4o, GPT-4, DALL-E-3, embeddings)
  • Redis Enterprise (semantic caching)
  • Azure AI Search (vector search)
  • Cosmos DB (message storage)
  • 7 MCP servers (Container Apps)
  • Log Analytics + Application Insights

Time: 35-40 minutes Cost: ~$890-1,190/month (varies by usage)

πŸ“– Master Lab Documentation

Option 2: APIM-Focused Deployment (Production-Ready)

Deploy enterprise-grade APIM for Azure OpenAI management:

cd AzureOpenAI-with-APIM

# One-button deploy via Azure Portal
# Click: Deploy to Azure button in README

# Or via Azure CLI
az login
az group create --name RG-APIM-OpenAI --location eastus
az deployment group create \
  --resource-group RG-APIM-OpenAI \
  --template-file public-apim.bicep

What gets deployed:

  • API Management
  • Key Vault
  • Log Analytics
  • Auto-configuration for Azure OpenAI
  • Token monitoring policies
  • Cost management policies

Time: 45 minutes Cost: ~$175/month base + usage

πŸ“– APIM Deployment Guide

Option 3: Individual Labs (Modular Learning)

Explore specific capabilities with targeted labs:

cd AI-Gateway/labs

# Choose your lab:
# - access-controlling/      # OAuth 2.0 & JWT validation
# - backend-pool-load-balancing/  # Multi-region load balancing
# - semantic-caching/        # Redis-backed intelligent caching
# - model-context-protocol/  # MCP integration
# - openai-agents/          # AI agent orchestration
# - finops-framework/        # Cost management
# - built-in-logging/        # Token metrics & monitoring

# Each lab has its own deployment
cd <lab-name>
az login
az deployment group create \
  --resource-group <your-rg> \
  --template-file main.bicep

πŸ“– Individual Labs Index


🎯 Alternative Deployment Paths

Fast-Track Options for Different Use Cases

Option 4: Easy Deploy Notebook ⚑ (Fastest - Supports "Run All")

Perfect for: Quick demonstrations, proof-of-concepts, GitHub Codespaces

The streamlined version of the Master Lab designed for zero-touch "Run All" execution:

# In GitHub Codespaces or VS Code:
cd AI-Gateway/labs/master-lab

# 1. Login to Azure (required once)
az login --use-device-code

# 2. Open notebook and click "Run All"
# Open: master-ai-gateway-easy-deploy.ipynb
# Just click "Run All" - fully automated!

Key Features:

  • "Run All" Ready - No manual cell execution needed
  • 34 cells (vs 152 in full Master Lab, 78% reduction)
  • Auto-RBAC Setup - Assigns Cosmos DB and other permissions automatically
  • Auto-retrieves APIM Keys - No manual key copying required
  • Same infrastructure as Master Lab
  • Time: 35-40 minutes deployment
  • Documentation: Easy Deploy Guide | Quick Start

Option 5: Quick Start Modular Labs 🧩 (Learn by Topic)

Perfect for: Learning specific features, running independent labs, minimal setup

Run individual 10-minute labs after one-time infrastructure setup:

cd AI-Gateway/labs/master-lab/quick_start

# One-time setup (run once)
# Open: 00-quick-init.ipynb

# Then run any lab independently:
# 01-access-control.ipynb       (~10 min)
# 02-semantic-caching.ipynb     (~10 min)
# 03-message-storing.ipynb      (~10 min)
# 04-vector-searching.ipynb     (~10 min)
# 05-model-routing.ipynb        (~10 min)
# 06-built-in-logging.ipynb     (~10 min)
# 07-finops-framework.ipynb     (~10 min)

Key Features:

  • Shared initialization (shared_init.py) - no code duplication
  • Independent labs - run any lab in any order
  • Quick iterations - ~10 minutes per lab
  • Same infrastructure - uses Master Lab deployment
  • Documentation: Quick Start Guide

Comparison:

Feature Master Lab Easy Deploy Quick Start Modular APIM-Focused Individual Labs
Setup Complexity High Low Minimal Medium Low
Cells/Steps 152 34 10-15 per lab N/A Varies
Run All Support Manual Yes Yes N/A Varies
Best For Comprehensive learning Quick setup Topic-specific Production Single feature
Time Investment 3-4 hours 1 hour 10 min/lab 2 hours 30-60 min
Flexibility All features All features Pick & choose Production focus Focused
Infrastructure Full stack Full stack Full stack APIM-centric Minimal

πŸ” Authentication Options

Choose the authentication method that fits your use case:

Option 1: Azure CLI Authentication (Recommended for Development)

Simple and secure - uses your Azure account directly:

# Login to Azure
az login --use-device-code  # For Codespaces/remote environments
az login                    # For local development with browser

# Run notebooks - they use DefaultAzureCredential automatically

Pros: No secrets to manage, automatic token refresh, works everywhere Best for: Development, learning, Codespaces, local testing

Option 2: APIM Subscription Keys (Simple API Access)

Use API keys for straightforward API access:

# Keys are auto-retrieved by easy-deploy notebook
# Or manually get from Azure Portal:
# APIM > Subscriptions > master > Show/Hide keys

# Set in environment or .env file:
APIM_SUBSCRIPTION_KEY=your-key-here

Pros: Simple to use, no Azure AD required for API calls Best for: Quick testing, external integrations, CI/CD pipelines

Option 3: Managed Identity (Production Recommended)

Zero-secret authentication for Azure-hosted applications:

from azure.identity import ManagedIdentityCredential, get_bearer_token_provider
from openai import AzureOpenAI

credential = ManagedIdentityCredential()
client = AzureOpenAI(
    azure_endpoint=endpoint,
    azure_ad_token_provider=get_bearer_token_provider(
        credential,
        "https://cognitiveservices.azure.com/.default"
    ),
    api_version="2024-10-21"
)

Pros: No secrets, automatic rotation, audit trail Best for: Production deployments, Container Apps, Azure Functions

Authentication Comparison

Method Security Setup Effort Use Case
Azure CLI (az login) High Low Development, Codespaces
APIM Subscription Keys Medium Low Quick testing, external clients
Managed Identity Highest Medium Production workloads
Service Principal High Medium CI/CD, automation

πŸ“š Key Features by Area

AI-Gateway Labs

AI Agents & MCP

  • Model Context Protocol integration
  • OpenAI Agents SDK
  • Azure AI Agent Service
  • Function calling patterns
  • Multi-tool orchestration

Production Patterns

  • Load balancing (multi-region)
  • Semantic caching (5-20x faster)
  • Token rate limiting
  • Content safety
  • Built-in logging

Security & Compliance

  • OAuth 2.0 authentication
  • JWT token validation
  • Managed identities
  • Message storage (Cosmos DB)
  • Access control policies

AzureOpenAI-with-APIM Features

Enterprise Governance

  • Subscription-based access control
  • Priority-based routing
  • Circuit breaker patterns
  • Private endpoint support
  • Managed identity authentication

Cost & Operations

  • Token usage tracking per subscription
  • Chargeback models & reporting
  • Power BI dashboard integration
  • Log Analytics queries (KQL)
  • Alert automation with Logic Apps

πŸš€ Advanced Features by Category

Cutting-Edge AI Capabilities

Real-Time APIs πŸŽ™οΈ

  • Realtime Audio - Real-time audio streaming with Azure OpenAI
  • Realtime MCP Agents - Combined audio + tool calling
  • WebSocket support for streaming
  • Low-latency voice interactions

AI Agent Services πŸ€–

Multi-Cloud & Model Diversity

Azure OpenAI

  • GPT-4o, GPT-4, GPT-3.5
  • DALL-E 3 (image generation)
  • Embeddings (text-embedding-ada-002)
  • Multi-region deployments

AI Foundry Models

Third-Party Models

Enterprise & Production Patterns

Zero-Trust Security πŸ”’

Production Deployment 🏭

DevOps & Automation

MCP Server Management:

Infrastructure as Code:

  • Bicep templates (primary)
  • Terraform variants (alternative)
  • Azure Developer CLI (azd) integration
  • CI/CD pipeline examples

πŸ—οΈ Architecture Patterns

Master Lab Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚    API Management (StandardV2)          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ β€’ Access Control (OAuth/JWT)      β”‚  β”‚
β”‚  β”‚ β€’ Semantic Caching (Redis)        β”‚  β”‚
β”‚  β”‚ β€’ Message Storing (Cosmos DB)     β”‚  β”‚
β”‚  β”‚ β€’ Load Balancing (Multi-region)   β”‚  β”‚
β”‚  β”‚ β€’ MCP Integration (7 servers)     β”‚  β”‚
β”‚  β”‚ β€’ Built-in Logging (App Insights) β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚          β”‚          β”‚
    β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β” β”Œβ”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
    β”‚ Foundry β”‚ β”‚ Foundry β”‚ β”‚ Foundry β”‚
    β”‚ UK Southβ”‚ β”‚ Sweden Cβ”‚ β”‚ West EU β”‚
    β”‚ 7 Modelsβ”‚ β”‚ 1 Model β”‚ β”‚ 1 Model β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

APIM Enterprise Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Client Applications & Services      β”‚
β”‚  (OAuth 2.0 / Managed Identity)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  Azure APIM Gateway β”‚
    β”‚  β€’ Token Limiting   β”‚
    β”‚  β€’ Load Balancing   β”‚
    β”‚  β€’ Retry Policies   β”‚
    β”‚  β€’ Logging & Metricsβ”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”
      β”‚                 β”‚
β”Œβ”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
β”‚ Azure AOAI β”‚   β”‚ Azure AOAI  β”‚
β”‚ Region 1   β”‚   β”‚ Region 2    β”‚
β”‚ (Primary)  β”‚   β”‚ (Failover)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“– Documentation Index

Primary Documentation

Area Documentation Description
Master Lab Master Lab README Comprehensive 7-in-1 lab experience
Individual Labs Labs Index 30+ modular labs
APIM Integration APIM Guide Enterprise APIM patterns
AI Gateway Concepts AI Gateway README Overview and concepts

πŸ“š Complete Labs Index (Categorized by Difficulty)

🟒 Beginner Labs (Start Here)

Perfect for getting started with Azure AI Gateway concepts:

Lab Focus Time Documentation
Request Forwarding Basic APIM routing 20 min Core concept
Backend Circuit Breaking Resilience patterns 25 min Error handling
Built-in Logging Observability basics 30 min Token tracking
Access Controlling OAuth 2.0 & JWT 35 min Security fundamentals
Image Generation DALL-E integration 25 min Vision APIs

🟑 Intermediate Labs (Production Patterns)

Build production-ready features and patterns:

Lab Focus Time Documentation
Semantic Caching Redis-based caching 40 min 50-80% cost reduction
Backend Pool Load Balancing Multi-region routing 45 min High availability
Advanced Load Balancing Priority routing 50 min PTU optimization
Model Routing Dynamic model selection 40 min Cost optimization
Response Streaming Streaming responses 30 min Real-time UX
Vector Searching RAG pattern 55 min AI Search integration
Message Storing Cosmos DB storage 45 min Compliance & audit
FinOps Framework Cost management 50 min Chargeback models
Private Connectivity Private endpoints 60 min Zero-trust networking

πŸ”΄ Advanced Labs (Cutting-Edge Features)

Explore latest AI capabilities and advanced patterns:

Lab Focus Time Documentation
Model Context Protocol MCP integration 60 min Tool-based AI
OpenAI Agents Agent orchestration 65 min Agentic AI patterns
AI Agent Service Azure AI service 55 min Managed agents
Realtime Audio Audio streaming 50 min Voice interactions
Realtime MCP Agents Combined real-time 70 min Advanced agents
Gemini MCP Agents Google Gemini 60 min Multi-vendor AI
AWS Bedrock Multi-cloud 65 min AWS integration
AI Foundry DeepSeek DeepSeek models 45 min Alternative models
AI Foundry SDK SDK patterns 50 min Direct integration
SLM Self-Hosting Edge deployment 75 min On-premises AI

πŸ› οΈ DevOps & Automation Labs

Infrastructure, deployment, and operational patterns:

Lab Focus Time Documentation
Zero-to-Production Complete deployment 90 min Production guide
Fragment-Based Policies Policy management 55 min Advanced APIM
MCP from API Auto-generate MCP 45 min Automation
MCP Registry (API Center) Centralized registry 50 min Governance
MCP Registry (GitHub) GitOps automation 60 min CI/CD integration
MCP Client Authorization OAuth flows 55 min Advanced security
Backend Pool (Terraform) IaC alternative 50 min Terraform
Secure Responses API Output validation 40 min Compliance

πŸ“¦ All-in-One Experiences

Comprehensive lab bundles:

Experience Includes Time Best For
Master Lab 7 core labs 3-4 hrs Comprehensive learning
Easy Deploy Streamlined setup 1 hr Quick setup
Quick Start Modular Independent labs 10 min/lab Topic learning

πŸ”§ Prerequisites

Azure Requirements

Development Tools

# Required
- Python 3.12 or later
- Azure CLI 2.50 or later
- VS Code with Jupyter extension

# Optional (for specific labs)
- Docker Desktop (for MCP server development)
- Node.js 20.x (for MCP servers)
- Azure Developer CLI (azd)

Development Environment Options

Choose the development environment that works best for your workflow:

🌐 GitHub Codespaces (Recommended for Quick Start)

Open this repository directly in a cloud-based development environment with all dependencies pre-installed:

# Click "Code" β†’ "Codespaces" β†’ "Create codespace on main"
# Or use the button:

Open in GitHub Codespaces

Pre-configured with:

  • Python 3.12+
  • Azure CLI
  • Azure Developer CLI (azd)
  • Jupyter kernel
  • All Python dependencies

Post-Launch Setup (Required):

After opening your Codespace, run the setup script to configure Azure authentication and Cosmos DB access:

# Run the automated setup script
./setup-codespace.sh

This script will:

  1. Install required Python packages
  2. Prompt for Azure login (use device code authentication)
  3. Detect your Codespace IP address
  4. Update Cosmos DB firewall (if already deployed)
  5. Add missing environment variables

Manual Setup Alternative:

# 1. Install Python dependencies
pip install --user python-dotenv azure-identity azure-mgmt-resource azure-cosmos openai requests

# 2. Login to Azure
az login --use-device-code

# 3. Set your subscription
az account set --subscription "YOUR_SUBSCRIPTION_ID"

# 4. Get your Codespace IP (for Cosmos DB firewall)
curl -s ifconfig.me

After Deployment: If you've already deployed resources, update Cosmos DB firewall:

CURRENT_IP=$(curl -s ifconfig.me)
az cosmosdb update \
  --name YOUR_COSMOS_ACCOUNT \
  --resource-group YOUR_RESOURCE_GROUP \
  --ip-range-filter "$CURRENT_IP,104.42.195.92,40.76.54.131,52.176.6.30,52.169.50.45,52.187.184.26,0.0.0.0" \
  --public-network-access Enabled

Documentation:

🐳 VS Code Dev Containers (Local Development)

Use Docker-based development containers for consistent local environments:

# Prerequisites: Docker Desktop, VS Code, Dev Containers extension
git clone https://github.com/lproux/Azure-AI-Gateway-Easy-Deploy.git
cd Azure-AI-Gateway-Easy-Deploy/AI-Gateway/labs/master-lab
code .
# VS Code will prompt to "Reopen in Container"

Benefits:

  • Isolated, reproducible environment
  • No local Python/tool installation needed
  • Same environment as Codespaces
  • Works offline

πŸ’» Local Setup (Full Control)

Install dependencies directly on your machine:

Quick Setup (Local Installation)

# Install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash  # Linux
# Or: https://learn.microsoft.com/cli/azure/install-azure-cli

# Install Azure Developer CLI (azd)
curl -fsSL https://aka.ms/install-azd.sh | bash  # Linux/macOS
# Or: https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd

# Login to Azure
az login
az account set --subscription <your-subscription-id>

# Clone this repository
git clone https://github.com/lproux/MCP-servers-internalMSFT-and-external.git
cd MCP-servers-internalMSFT-and-external

πŸ’° Cost Estimation

Master Lab (Comprehensive)

Service Monthly Cost Notes
API Management StandardV2 ~$175 Core gateway
AI Foundry (3 regions) ~$0 base Usage-based
AI Model Usage ~$500-800 Varies by usage
Redis Enterprise ~$20 Caching
Azure AI Search ~$75 Vector search
Cosmos DB ~$25 Message storage
Container Apps ~$30 MCP servers
Log Analytics ~$50 Monitoring
TOTAL ~$890-1,190 Per month

APIM-Focused Deployment

Service Monthly Cost Notes
API Management ~$175 Production-ready
Key Vault ~$0 Minimal
Log Analytics ~$20 Basic monitoring
TOTAL ~$195 Base + usage

Cost Optimization Tips:

  • Use semantic caching to reduce API calls by 50-80%
  • Start with gpt-4o-mini for development (15-20x cheaper)
  • Delete resources when not in use: az group delete --name <rg-name>
  • Enable auto-scaling for APIM to reduce idle costs

🎯 Learning Paths

Beginner Path: Start with Master Lab

  1. βœ… Deploy Master Lab (Guide)

    cd AI-Gateway/labs/master-lab
    azd up
  2. βœ… Explore Labs in Order:

    • Lab 08: Access Control
    • Lab 09: Semantic Caching
    • Lab 10: Message Storing
    • Lab 11: Vector Search
  3. βœ… Review Monitoring:

    • Lab 12: Built-in Logging
    • Explore Log Analytics queries
    • Review token usage reports

Intermediate Path: Production Patterns

  1. βœ… Deploy APIM Integration (Guide)

    cd AzureOpenAI-with-APIM
    # Use Deploy to Azure button or CLI
  2. βœ… Implement Resiliency:

    • Configure multi-region backends
    • Add retry policies
    • Test failover scenarios
  3. βœ… Add Cost Management:

    • Configure token rate limiting
    • Set up chargeback reporting
    • Create Power BI dashboards

Advanced Path: Custom Integration

  1. βœ… Explore Individual Labs:

    • Choose specific labs from Labs Index
    • Customize policies for your use case
    • Integrate with existing infrastructure
  2. βœ… Build Custom MCP Servers:

    • Review MCP Integration lab
    • Deploy custom data sources
    • Create multi-tool orchestration
  3. βœ… Implement Enterprise Patterns:

    • Private endpoints
    • Custom authentication
    • Advanced monitoring & alerting

πŸ” Security Best Practices

Authentication Methods (Ordered by Security)

  1. Managed Identity (Highest Security)

    • No secrets to manage
    • Automatic credential rotation
    • Native Azure integration
    • Use for: Production deployments on Azure
  2. Service Principal with Certificate

    • Certificate-based authentication
    • Auditable access
    • Use for: Automated pipelines, CI/CD
  3. Service Principal with Client Secret

    • Explicit credential management
    • Time-limited secrets
    • Use for: Development, testing
  4. API Keys / Subscription Keys

    • Simple but less secure
    • Manual key rotation
    • Use for: Initial testing only

Security Checklist

  • Enable managed identities for all service-to-service communication
  • Use private endpoints for production deployments
  • Implement JWT token validation for client authentication
  • Enable Azure AD OAuth 2.0 for user authentication
  • Configure Content Safety policies for input validation
  • Store secrets in Azure Key Vault
  • Enable diagnostic logging for audit trails
  • Implement rate limiting and throttling
  • Use Azure DDoS Protection for public endpoints
  • Enable Azure Defender for Cloud

πŸ› οΈ Common Operations

Deploy Complete Environment

# Option 1: Master Lab (azd)
cd AI-Gateway/labs/master-lab
az login
azd up --environment production

# Option 2: APIM-focused (Bicep)
cd AzureOpenAI-with-APIM
az login
az group create --name RG-APIM-OpenAI --location eastus
az deployment group create \
  --resource-group RG-APIM-OpenAI \
  --template-file public-apim.bicep \
  --parameters @parameters.json

# Option 3: Individual Lab
cd AI-Gateway/labs/semantic-caching
az login
az deployment group create \
  --resource-group my-rg \
  --template-file main.bicep

Monitor Deployments

# Check deployment status
az deployment group show \
  --name <deployment-name> \
  --resource-group <rg-name> \
  --query properties.provisioningState

# View all resources
az resource list \
  --resource-group <rg-name> \
  --output table

# Stream deployment logs
az deployment group list \
  --resource-group <rg-name> \
  --output table

Cleanup Resources

# Delete entire resource group (CAUTION: Irreversible!)
az group delete \
  --name <rg-name> \
  --yes \
  --no-wait

# Verify deletion
az group show --name <rg-name>
# Should return: (ResourceGroupNotFound)

# Delete specific resource
az resource delete \
  --resource-group <rg-name> \
  --name <resource-name> \
  --resource-type <type>

Query Logs & Metrics

# Token usage by subscription (KQL)
az monitor log-analytics query \
  --workspace <workspace-id> \
  --analytics-query "
    customMetrics
    | where name in ('Prompt Tokens', 'Completion Tokens')
    | summarize TotalTokens = sum(value) by tostring(customDimensions['Subscription ID'])
  "

# Recent API calls
az apim api operation list \
  --resource-group <rg-name> \
  --service-name <apim-name> \
  --api-id azure-openai-api

# View Application Insights metrics
az monitor app-insights metrics show \
  --app <app-insights-name> \
  --metric requests/count \
  --aggregation count

πŸ”— External Resources

Official Documentation

Original Source Repositories

AI & MCP Concepts

Community & Support


🀝 Contributing

We welcome contributions to all areas of this repository!

How to Contribute

  1. Fork the repository

  2. Choose your area:

  3. Make your changes:

    git checkout -b feature/your-feature
    # Make changes
    git commit -m "Add: Your feature description"
    git push origin feature/your-feature
  4. Create Pull Request with:

    • Clear description of changes
    • Testing performed
    • Documentation updates
    • Screenshots (if applicable)

Contribution Areas

  • ✨ New lab implementations
  • πŸ“ Documentation improvements
  • πŸ› Bug fixes
  • 🎨 Bicep/ARM template enhancements
  • πŸ“Š Monitoring & analytics examples
  • πŸ”’ Security pattern improvements
  • πŸ’° Cost optimization guides

πŸ“œ License

This repository consolidates content from multiple sources:


πŸ™ Acknowledgments

This repository builds upon the outstanding work of multiple teams and contributors from Microsoft and the Azure community. We extend our sincere gratitude to:

Original Repository Owners & Contributors

Azure-Samples/AI-Gateway

  • Created and maintained by the Microsoft Azure Samples Team
  • Special thanks to all contributors who developed the comprehensive lab experiences, MCP integration patterns, and production-ready templates
  • This repository forms the foundation of the AI Gateway patterns and the Master Lab experience

microsoft/AzureOpenAI-with-APIM

  • Created and maintained by the Microsoft Azure API Management Team
  • Special thanks to the contributors who built the enterprise-grade APIM integration patterns, cost management frameworks, and resiliency implementations
  • This repository provides the production-ready APIM reference architecture

Azure Product Teams

We thank the following Microsoft Azure teams whose products and documentation made this work possible:

  • Azure API Management Team - For the robust gateway service and comprehensive documentation
  • Azure OpenAI Service Team - For democratizing access to cutting-edge AI models
  • Azure AI Foundry Team - For the unified AI development platform
  • Azure AI Search Team - For powerful vector search capabilities
  • Azure Cache for Redis Team - For enabling high-performance semantic caching
  • Azure Cosmos DB Team - For globally distributed database services
  • Azure Container Apps Team - For simplifying MCP server deployments
  • Azure Monitor Team - For comprehensive observability tools

Community Contributors

This consolidated repository benefits from the collective knowledge and feedback of the Azure AI community. Thank you to everyone who:

  • Reported issues and provided feedback
  • Contributed code improvements and bug fixes
  • Shared deployment experiences and best practices
  • Created tutorials and educational content

Open Source Foundations

We acknowledge the broader ecosystem that makes this work possible:

  • Model Context Protocol (MCP) - For standardizing AI data source integration
  • OpenAI - For pioneering AI models and APIs
  • Open Source Community - For the countless tools, libraries, and frameworks that power modern AI development

Note: This repository is a consolidation and enhancement of existing open-source projects. All original work retains its respective licenses and attributions. We strive to properly credit all sources and welcome corrections or additions to these acknowledgments.


πŸŽ“ Support

Getting Help

  1. Documentation: Check the relevant README in each folder
  2. Azure Service Health: Azure Status
  3. GitHub Issues:
  4. Azure Support: Azure Support Portal

Troubleshooting Quick Reference

Issue Solution
Authentication failed Run az login --use-device-code (Codespaces) or az login
Quota exceeded Request increase in Azure Portal > Quotas
Deployment timeout APIM takes 30-45 min (normal), check az deployment group show
Module not found Reinstall: pip install --user -r requirements.txt then restart kernel
MCP server errors Check Container Apps logs: az containerapp logs show
Cosmos DB Forbidden Add Codespace IP to firewall: ./setup-codespace.sh
MCP tool not calling Add tool_choice="required" to force tool calling
LOG_ANALYTICS_CUSTOMER_ID missing Run ./setup-codespace.sh or see Codespaces Setup Guide

🌟 What Makes This Repository Special

βœ… Three Comprehensive Approaches - Master Lab, APIM-focused, Individual Labs βœ… Production-Ready - Battle-tested patterns used in enterprise deployments βœ… Well-Documented - Extensive README files, inline comments, architecture diagrams βœ… Cost-Conscious - Built-in cost tracking, optimization tips, transparent pricing βœ… Security-First - Managed identities, private endpoints, OAuth 2.0 βœ… Modular Design - Use what you need, when you need it βœ… Active Maintenance - Regularly updated with latest Azure features


πŸ“Š Repository Statistics

  • Total Labs: 30+ individual labs + Master Lab
  • Deployment Time: 35-40 minutes (Master Lab)
  • Lines of Documentation: ~15,000+
  • Azure Services: 15+ services integrated
  • Authentication Methods: 3 (Managed Identity, Service Principal, API Keys)
  • Multi-Region Support: βœ… Load balancing, failover, high availability

Quick Start Summary

# 1️⃣ Clone Repository
git clone https://github.com/lproux/MCP-servers-internalMSFT-and-external.git
cd MCP-servers-internalMSFT-and-external

# 2️⃣ Choose Your Path

# Master Lab (Recommended for Learning)
cd AI-Gateway/labs/master-lab
az login && azd up

# APIM-Focused (Production-Ready)
cd AzureOpenAI-with-APIM
# Use "Deploy to Azure" button OR:
az deployment group create --resource-group <rg> --template-file public-apim.bicep

# Individual Lab (Modular)
cd AI-Gateway/labs/semantic-caching
az deployment group create --resource-group <rg> --template-file main.bicep

# 3️⃣ Explore & Learn
# Open notebooks, review policies, test deployments

# 4️⃣ Cleanup (When Done)
az group delete --name <rg-name> --yes --no-wait

πŸš€ Ready to build enterprise-grade AI solutions on Azure!

Last Updated: 2025-12-02 Version: 2.2.0 Maintained by: LP Roux

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •