Fails to enable GitOps on AKS cluster #234

Open
ComeChao opened this issue Mar 18, 2022 · 9 comments
Labels: Azure (Azure related issues, AKS / Azure Arc), bug (Something isn't working)

@ComeChao

Expected behaviour

Expected GitOps to be enabled in the cluster.
Expected workflow similar to the one presented in Simplify GitOps with Flux and Visual Studio Code (https://www.youtube.com/watch?v=-07emkW8eiM) by Geert Baeke.

Actual behaviour

After installing the Weaveworks GitOps extension for VS Code, selecting a cluster, clicking Enable GitOps, and clicking the Enable button in the "Do you want to enable GitOps on the <..> cluster?" prompt, I get one error and an input box request:
a. error
image
b. textbox requesting the cluster resource group
image

Whether or not I enter the resource group, the only activity I see in the Output (GitOps) panel is:
image

Steps to reproduce

See above.

Versions

kubectl client version: 1.22.5
kubectl server version: 1.20.9
Flux version: 0.27.4
Git version: 2.35.1.windows.2
Azure version: 2.34.1
Extension version: 0.19.0
VSCode version: 1.65.2
Operating System (OS) and its version: Windows_NT x64 10.0.019042

image

@a1tan

a1tan commented Mar 23, 2022

I have the very same problem. It ends with the error below. I also tried with WSL and with Kubernetes versions 1.21.9 and 1.22.6; the result is the same.
image

@josefaworks josefaworks added the bug Something isn't working label Apr 1, 2022
@a1tan

a1tan commented Apr 12, 2022

I upgraded my extension to v0.19.1 and tried again. This time it showed the error below, which now contains the details of the problem.
image

After that I realized that the AKS cluster has to be created with MSI (managed identity) to run the microsoft.flux extension, which I guess is not the case for clusters created through the Azure Portal. The URL below says to enable AKS-ExtensionManager, and I did that.
https://docs.microsoft.com/en-us/azure/azure-arc/kubernetes/tutorial-use-gitops-flux2#for-azure-kubernetes-service-clusters

After some research I came across the URL below.
https://docs.microsoft.com/en-us/azure/aks/use-managed-identity

First, I tried to update my existing cluster with this command. The command succeeded, but the extension still didn't work.

az aks update -g <RGName> -n <AKSName> --enable-managed-identity

Then I created a brand new AKS cluster with the command below, as mentioned in the link above.

az aks create -g myResourceGroup -n myManagedCluster --enable-managed-identity

It finally worked with the newly created MSI cluster. In summary, it seems AKS-ExtensionManager has to be enabled, and then an MSI-enabled AKS cluster has to be created, before using the VS Code GitOps Tools.
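
To check whether an existing cluster already uses a managed identity before trying Enable GitOps, something like the sketch below may help. It assumes the Azure CLI is installed and logged in. The function name cluster_uses_msi is made up for illustration, and the check relies on identity.type reading SystemAssigned or UserAssigned for MSI clusters, which is worth verifying against your own az aks show output. Note that in the report above, an existing cluster that was updated to managed identity still did not work, so a passing check is a hint, not a guarantee.

```shell
#!/usr/bin/env bash
# Sketch only: report whether an AKS cluster has a managed identity.
# cluster_uses_msi is a hypothetical helper name, not part of any tool.
# Returns 0 for MSI, 1 for anything else, 2 if the az call itself fails.
cluster_uses_msi() {
  local resource_group="$1" cluster="$2" identity_type
  identity_type="$(az aks show --resource-group "$resource_group" \
    --name "$cluster" --query identity.type --output tsv)" || return 2
  case "$identity_type" in
    # Assumption: these are the values reported for managed identity clusters.
    SystemAssigned|UserAssigned) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (placeholder names taken from the commands above):
# if cluster_uses_msi myResourceGroup myManagedCluster; then
#   echo "managed identity is enabled"
# else
#   echo "no managed identity detected"
# fi
```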

@juozasg
Collaborator

juozasg commented Apr 14, 2022

So a fix for this would be to show a warning that on AKS, MSI must be enabled before Enable GitOps (which installs Flux into the cluster)?

@kingdonb
Collaborator

I'm looking for the place in VS Code where that output comes from. (I've been able to reproduce the issue on my own AKS cluster that was provisioned through the portal; I'm not sure whether it was created with or without managed identity enabled.)

It looks like I found the messages you are talking about after several successive attempts to enable Flux on my AKS cluster.

The az feature register --namespace Microsoft.ContainerService --name AKS-ExtensionManager command must succeed before later steps can pass. It has several prerequisites of its own. From what I can tell, the UI may already be providing the right hints for getting over these hurdles, if you know where to find those messages. But each step takes more than a few seconds to process on the Azure side, and the interfaces for waiting on those events are less than straightforward from a UX perspective: warnings are emitted in the terminal with links to docs that are all helpful, but the only methods they offer for checking progress return a huge blob of JSON.
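
For anyone who would rather wait on the registration than read the JSON by hand, here is a minimal sketch. It assumes the Azure CLI is installed and logged in. The function name wait_for_feature, the polling interval, and the attempt limit are made up for illustration; the only Azure calls involved are az feature show, plus az feature register and az provider register in the commented usage.

```shell
#!/usr/bin/env bash
# Sketch only: poll a feature flag until Azure reports it as Registered.
# wait_for_feature is a hypothetical helper name; interval and attempts
# are arbitrary defaults, not values recommended by Azure.
wait_for_feature() {
  local namespace="$1" feature="$2" interval="${3:-30}" attempts="${4:-40}"
  local state="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # --query with tsv output reduces the JSON blob to a single word.
    state="$(az feature show --namespace "$namespace" --name "$feature" \
      --query properties.state --output tsv)" || state=""
    echo "feature $feature: ${state:-unknown}"
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage (commented out so that sourcing this file has no side effects):
# az feature register --namespace Microsoft.ContainerService --name AKS-ExtensionManager
# wait_for_feature Microsoft.ContainerService AKS-ExtensionManager
# az provider register --namespace Microsoft.ContainerService
```

The function returns non-zero if the state never reaches Registered within the attempt limit, so it can gate the later steps in a script.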

The issue:

is related, because you should be able to install plain Flux without any extension on AKS clusters.

But assuming that you really did want the AKS Flux module, I think we can do better at hand-holding through these errors: letting you know whether you have found the right documentation, which issue is yours, how much further you have to go, or whether the cluster was created in the wrong mode and needs a different one. As for reporting progress along the way, I'm not sure where this error comes from, but I think we can always try to do better than this for users:

Screen Shot 2022-04-14 at 10 33 50 AM

This happens at the start of the input collection regarding "which cluster, subscription, and resource group is it". It might explain #218 if I could see the source of this error, but so far I haven't figured it out.

I did eventually get the AKS extension enabled by following the docs links and the prompts.

These were all relevant docs links:

https://docs.microsoft.com/en-us/azure/azure-arc/kubernetes/tutorial-use-gitops-flux2#for-azure-kubernetes-service-clusters

https://docs.microsoft.com/en-us/azure/azure-arc/kubernetes/quickstart-connect-cluster?tabs=azure-cli#register-providers-for-azure-arc-enabled-kubernetes

After following these steps, which were all from links I found in the Output error report after things had failed, I still had:

Message:  Request failed to https://management.azure.com/subscriptions/ZZZZ/resourceGroups/aks-kingdon/providers/Microsoft.ContainerService/managedclusters/aks-kingdon-az1/extensionaddons/flux?api-version=2021-03-01. Error code: Forbidden. Reason: Forbidden.{"error":{"code":"AuthorizationFailed","message":"The client 'XXXX' with object id 'YYYY' does not have authorization to perform action 'Microsoft.ContainerService/managedclusters/extensionaddons/read' over scope '/subscriptions/ZZZZ/resourceGroups/aks-kingdon/providers/Microsoft.ContainerService/managedclusters/aks-kingdon-az1/extensionaddons/flux' or the scope is invalid. If access was recently granted, please refresh your credentials."}}

I am working with my corp-it to get that resolved 😅 then I should be able to better reproduce and diagnose issues like this. Meanwhile, hopefully these docs will help someone else who runs into this first issue.

@kingdonb
Collaborator

kingdonb commented Apr 30, 2022

I think this is blocked for us.

We are blocked on at least #232 and #234. We need to have an AKS account with a path for both: functioning Azure Arc clusters and AKS clusters with the microsoft.flux extension. We currently have access to neither. I can't believe this circumstance is unique to our two unrelated Azure accounts; there must be some important account onboarding documentation or delegated permission structure that we're still missing.

I will try this again myself, on a new Azure Trial account that I'll create for myself before 0.19.3.

@kingdonb kingdonb added this to the 0.19.3 milestone Apr 30, 2022
@kingdonb kingdonb self-assigned this Apr 30, 2022
@kingdonb
Collaborator

kingdonb commented Jun 4, 2022

We did figure out what was missing in common from both accounts. There is an instruction in the section "Azure Specific Recommendations" which states:

In order to enable GitOps in a cluster you will likely need the --admin credentials.

The default kubeconfig emitted by the az CLI does not carry admin credentials unless you pass the --admin flag. It works OK if you remember that instruction. We will have to make sure this gets a nice call-out in the docs (bigger than h1 I guess, because both of us missed it).
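
As a rough sketch of that instruction: the first function below fetches the admin kubeconfig with az aks get-credentials --admin, and the second checks whether kubectl is currently pointed at it. Both function names are made up for illustration. The second one relies on the assumption that the az CLI names the admin context with an -admin suffix (for example myManagedCluster-admin), so verify that against your own kubectl config get-contexts output.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of az or kubectl.

# Merge the cluster's admin credentials into the local kubeconfig.
fetch_admin_kubeconfig() {
  local resource_group="$1" cluster="$2"
  az aks get-credentials --resource-group "$resource_group" \
    --name "$cluster" --admin
}

# Return 0 if the current kubectl context looks like an admin context.
# Assumption: admin contexts end in "-admin".
using_admin_context() {
  local context
  context="$(kubectl config current-context)" || return 2
  case "$context" in
    *-admin) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (placeholder names):
# fetch_admin_kubeconfig myResourceGroup myManagedCluster
# using_admin_context || echo "current context is not an admin context"
```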

@kingdonb
Collaborator

kingdonb commented Jun 8, 2022

Delaying this for 0.20.0 – I'll have to take a look at Azure issues again once the extension is published into the marketplace.

@kingdonb kingdonb modified the milestones: 0.19.3, 0.20 Jun 8, 2022
@kingdonb kingdonb modified the milestones: 0.20, 0.21 Jul 18, 2022
@kingdonb kingdonb added the Azure Azure related issues (AKS / Azure Arc) label Aug 11, 2022
@kingdonb kingdonb modified the milestones: 0.21, 0.21.x Aug 11, 2022
@kingdonb
Collaborator

We plan to address Azure bugs as soon as possible in the 0.21.x series.

@kingdonb kingdonb modified the milestones: 0.21.x, 0.25 Jul 31, 2023
@kingdonb
Collaborator

kingdonb commented Sep 8, 2023

Hello, we have gone a while without checking in. v0.25.1 delivers significant UX improvements, but we haven't re-tested on Azure AKS. Can you say whether it's working well for you, or if you are still interested in using the VSCode GitOps Tools extension for Flux?

@kingdonb kingdonb modified the milestones: 0.25.x, 0.26 Sep 8, 2023