@ehrnst ehrnst commented Oct 28, 2025

Description

This PR is a small rework of #1183, which I could not get to work because it used azidentity.NewManagedIdentityCredential(managedIDOptions), which relies on the underlying AKS node's identity. I added azidentity.NewWorkloadIdentityCredential(workloadIDOptions), which allows pods to run with a specific Azure identity federated with the Kubernetes cluster.
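
For readers unfamiliar with workload identity federation, here is a rough sketch of the token exchange that a workload identity credential performs on the pod's behalf. It is not code from this PR: `federatedTokenRequest` is a hypothetical helper, the tenant, client, and token values are placeholders, and the real SDK adds caching, retries, and error handling. It assumes the standard OAuth 2.0 client-credentials grant with a JWT client assertion, where the assertion is the projected Kubernetes service account token.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// federatedTokenRequest builds the client-credentials request that exchanges a
// projected Kubernetes service account token for a Microsoft Entra access token.
// Hypothetical helper for illustration only; not part of this PR or the Azure SDK.
func federatedTokenRequest(authorityHost, tenantID, clientID, saToken, scope string) (*http.Request, error) {
	endpoint := strings.TrimRight(authorityHost, "/") + "/" + url.PathEscape(tenantID) + "/oauth2/v2.0/token"
	form := url.Values{
		"grant_type":            {"client_credentials"},
		"client_id":             {clientID},
		"scope":                 {scope},
		"client_assertion_type": {"urn:ietf:params:oauth:client-assertion-type:jwt-bearer"},
		"client_assertion":      {saToken}, // the service account token acts as the credential
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	// In a pod these values would typically come from AZURE_AUTHORITY_HOST,
	// AZURE_TENANT_ID, AZURE_CLIENT_ID and the file named by
	// AZURE_FEDERATED_TOKEN_FILE. The values below are placeholders.
	req, err := federatedTokenRequest(
		"https://login.microsoftonline.com/",
		"example-tenant",
		"example-client",
		"<projected-sa-token>",
		"https://cognitiveservices.azure.com/.default",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

The key difference from the managed identity path is that no node-level identity is involved: the pod's own service account token is what gets exchanged, so the resulting access token is tied to the identity federated with that service account.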

Related Issues/PRs (if applicable)

Related to PR #1183 (Azure Managed Identity support by @mattmancel)

Special notes for reviewers (if applicable)

  • CRD Changes: The CRD file was manually updated to match the API changes because of controller-gen access issues on Windows. You may want to regenerate the CRD to verify.
  • I tested this from every angle I could think of on a clean AKS cluster: installing AI Gateway with the latest Helm chart, then patching the deployments with my local build. Then I tested annotating the Helm install like this 👇 before making requests against the gateway.
  --version 0.0.0-2861c1b5b182b278841eec609d0f2b497b150ce9 \
  --set controller.serviceAccount.annotations."azure\.workload\.identity/client-id"="d245ed76-8227-" \
  --set controller.serviceAccount.annotations."azure\.workload\.identity/tenant-id"="e5806cb5-f8f4-" \
  --set controller.image.repository="ehrnst.azurecr.io/ai-gateway-controller" \
  --set controller.image.tag="azure-managed-identity-v3"

I don't know whether we should switch to explicitly creating a service account, or continue with the patch approach during or after install.

  • I noticed that the controller creates a secret (with the bearer token) that the extproc container uses during calls to Azure OpenAI. I did not inspect how the rotation works, but I did remove my identity and delete the secret to verify that authentication against Azure stopped working.

MattMencel and others added 6 commits September 11, 2025 22:18
This commit adds comprehensive support for Azure Managed Identity authentication
for AI Gateway backends through the BackendSecurityPolicy.

Changes include:
- New AzureManagedIdentityTokenProvider with system and user-assigned identity support
- Support for OIDC token exchange and Kubernetes secret-based authentication
- Comprehensive test coverage including integration tests
- Example configurations and CRD validation test data
- Updated API documentation and CRD schemas
- Added *.test pattern to .gitignore to exclude Go test binaries

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matt Mencel <matt@techminer.net>
Remove omitempty from ClientID field in BackendSecurityPolicyAzureCredentials
to maintain consistency with other ID fields and avoid breaking existing
integrations that expect the field to always be present in JSON output.

This addresses the GitHub Copilot review comment in PR envoyproxy#1183.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matt Mencel <matt@techminer.net>
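
As an illustration of what the omitempty change in the commit above affects, the following sketch compares the JSON produced with and without the tag option when the client ID is empty. The struct names and the JSON tag names (`clientID`, `tenantID`) are assumptions made for this example; only the `ClientID` field name comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmitEmpty drops ClientID from the output when it is the empty string.
// Tag names are assumed for illustration.
type withOmitEmpty struct {
	ClientID string `json:"clientID,omitempty"`
	TenantID string `json:"tenantID"`
}

// withoutOmitEmpty always emits ClientID, even when it is empty.
type withoutOmitEmpty struct {
	ClientID string `json:"clientID"`
	TenantID string `json:"tenantID"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(withOmitEmpty{TenantID: "t"}))    // {"tenantID":"t"}
	fmt.Println(marshal(withoutOmitEmpty{TenantID: "t"})) // {"clientID":"","tenantID":"t"}
}
```

Consumers that expect the key to be present regardless of its value only see it in the second form, which is the compatibility concern the commit message describes.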
@ehrnst ehrnst requested a review from a team as a code owner October 28, 2025 15:57
@ehrnst ehrnst changed the title Feature: Azure/AKS workload identity support Feat: Azure/AKS workload identity support Oct 28, 2025
@ehrnst ehrnst changed the title Feat: Azure/AKS workload identity support feat: Azure/AKS workload identity support Oct 28, 2025
@ehrnst ehrnst changed the title feat: Azure/AKS workload identity support feat: azure/aks workload identity support Oct 28, 2025
@mathetake mathetake left a comment

Thanks for the work! However, I have a suggestion that would drastically change the current code.

Right now, the implementation uses the SA mounted by the Envoy AI Gateway controller and distributes/copies that identity into the secret that is eventually mounted by extproc. I think that is bad practice from a security perspective: you have no way to tell which pod is actually accessing the service, since potentially tens of Envoy pods will access the Azure service with the same identity.

So I would suggest doing the same thing as #1394: use the Azure SDK to get the token dynamically inside the extproc, using the SA mounted by the Envoy pod, and insert the token in the header. That would mean writing more code in https://github.com/envoyproxy/ai-gateway/tree/main/internal/extproc/backendauth than in the controller package.
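
A minimal sketch of the shape this suggestion could take, assuming the extproc holds a token source and sets the header per request. Everything here is hypothetical: `fetchFunc`, `cachedToken`, `setAuthHeader`, and `runDemo` are illustrative names, not types from the backendauth package or from #1394, and the fetch function stands in for whatever Azure SDK credential call the real code would wrap.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// fetchFunc obtains a fresh access token and its expiry time. In a real extproc
// this would wrap an Azure SDK credential; here it is a hypothetical stand-in.
type fetchFunc func(ctx context.Context) (token string, expiresOn time.Time, err error)

// cachedToken keeps the last token in memory and refetches it shortly before expiry.
type cachedToken struct {
	mu        sync.Mutex
	fetch     fetchFunc
	skew      time.Duration // refresh this long before the token expires
	token     string
	expiresOn time.Time
}

// get returns the cached token while it is still valid, otherwise fetches a new one.
func (c *cachedToken) get(ctx context.Context) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.token != "" && time.Now().Add(c.skew).Before(c.expiresOn) {
		return c.token, nil
	}
	tok, exp, err := c.fetch(ctx)
	if err != nil {
		return "", fmt.Errorf("fetching token: %w", err)
	}
	c.token, c.expiresOn = tok, exp
	return tok, nil
}

// setAuthHeader inserts the bearer token into the outgoing request headers.
func (c *cachedToken) setAuthHeader(ctx context.Context, h http.Header) error {
	tok, err := c.get(ctx)
	if err != nil {
		return err
	}
	h.Set("Authorization", "Bearer "+tok)
	return nil
}

// runDemo applies the header three times with a fake fetcher and reports the
// resulting header value and how many times the fetcher was called.
func runDemo() (string, int) {
	calls := 0
	c := &cachedToken{
		skew: time.Minute,
		fetch: func(context.Context) (string, time.Time, error) {
			calls++
			return fmt.Sprintf("token-%d", calls), time.Now().Add(time.Hour), nil
		},
	}
	h := http.Header{}
	for i := 0; i < 3; i++ {
		if err := c.setAuthHeader(context.Background(), h); err != nil {
			panic(err)
		}
	}
	return h.Get("Authorization"), calls
}

func main() {
	header, calls := runDemo()
	fmt.Println(header, "fetches:", calls)
}
```

With this shape the token never leaves the process that obtained it, so no Kubernetes secret holding a bearer token is needed, and each Envoy pod authenticates with the identity bound to its own service account.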

@mathetake mathetake self-assigned this Oct 28, 2025
ehrnst commented Oct 28, 2025

> Thanks for the work! However, I have a suggestion that would drastically change the current code.
>
> Right now, the implementation uses the SA mounted by the Envoy AI Gateway controller and distributes/copies that identity into the secret that is eventually mounted by extproc. I think that is bad practice from a security perspective: you have no way to tell which pod is actually accessing the service, since potentially tens of Envoy pods will access the Azure service with the same identity.
>
> So I would suggest doing the same thing as #1394: use the Azure SDK to get the token dynamically inside the extproc, using the SA mounted by the Envoy pod, and insert the token in the header. That would mean writing more code in https://github.com/envoyproxy/ai-gateway/tree/main/internal/extproc/backendauth than in the controller package.

I agree; I was just trying to get the original #1183 to work. But I guess this will affect the current Azure authentication implementation as well?

I can see if I can make something work with the help of some colleagues and various agents.

@mathetake

> I agree; I was just trying to get the original #1183 to work. But I guess this will affect the current Azure authentication implementation as well?

No, we can make it work without affecting it.
