
Multiple clusters - dynamic PVCs try to access wrong storage account (of other resource group) #2768

Closed
DonMartin76 opened this issue Apr 25, 2018 · 8 comments


@DonMartin76
Contributor

Is this a request for help?: Yes


Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


What version of acs-engine?: 0.16.0


Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm): Kubernetes 1.9.7

What happened: We are provisioning persistent volumes using the default storage class. This worked fairly well in 1.9.6, but there were some issues when pods were rescheduled and the volumes had to be reattached to a different node, so we upgraded to 1.9.7. How we reproduced:

  1. In resource group A, provision a cluster. There, create a PVC using dynamic provisioning (a sketch of such a PVC is shown right after this list). This implicitly creates a storage account in that resource group, seemingly called ds<some hex number>.
  2. In a second resource group B (in the same subscription), create another cluster (this works), and then try to provision a dynamic volume. This fails with the following error message (in kubectl describe pvc):
Failed to provision volume with StorageClass "default": azureDisk - account ds6c822a4d484211eXXXXXX does not exist while trying to create/ensure default container
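For reference, a minimal PVC of this shape is enough to trigger dynamic provisioning against the cluster's default storage class; the claim name and size below are made-up illustration values, not taken from the clusters above:

    # Hypothetical PVC used to reproduce the issue (name and size are arbitrary).
    # storageClassName "default" is the class acs-engine deploys; on a blob-based
    # (non-ManagedDisks) cluster it provisions a VHD in a ds<hex> storage account.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-dynamic-pvc
    spec:
      storageClassName: default
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi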

Now, the freaky part here is that the storage account referred to in the error from resource group B actually exists, but in resource group A.

What you expected to happen: The cluster in resource group B must create its own storage account for dynamic volumes.


We will revert to 1.9.6 to see whether that helps for now, but as stated, 1.9.6 has other issues :-(. Chances are "good" that the most recent 1.10.1 release also has this exact same problem.

@DonMartin76
Contributor Author

@andyzhangx pointed me to this: kubernetes/kubernetes#55837. That issue is fixed as of 1.10.0, he says. Keeping this around for reference?

@andyzhangx
Contributor

@DonMartin76 No, it's a different bug. This bug is fixed by kubernetes/kubernetes#56474; I'll also paste my analysis here.
Root cause:

This bug only exists on blob-based VMs in v1.8.x and v1.9.x; managed-disk VMs won't have this issue.

@andyzhangx
Contributor

This bug only exists on blob-based VMs in v1.8.x and v1.9.x, so if you specify ManagedDisks when creating the k8s cluster, it won't have this issue:

    "agentPoolProfiles": [
      {
        ...
        "storageProfile" : "ManagedDisks",
        ...
      }

@jackfrancis
Member

Thanks @andyzhangx. @DonMartin76 should we classify this as a known (won't fix) issue with 1.8 and 1.9 on non-ManagedDisks VMs?

@DonMartin76
Contributor Author

@jackfrancis I don't think there's an awful lot you can do about it right now, except wait for it to be fixed upstream (I think @andyzhangx has already flagged it for cherry-picking, at least to 1.9). It's documented here then, and the fix with the managed disks is also there. I would otherwise have said it can't really be a "won't fix", as VHD is still the default for acs-engine.

Will managed disks be the default sometime in the future? Are there drawbacks or benefits, other than this issue?

@jackfrancis
Member

@DonMartin76 It's a good question about ManagedDisks being the default. The add'l Azure spend is the only thing I'm aware of that might be objectionable to folks; it would be nice to take a poll about this (does GitHub do polls? 😝)

@DonMartin76
Contributor Author

How much is that additional spend? I see one managed disk per node (32 GB premium), and additionally the bigger 512 GB etcd disk(s), which in my eyes would be the main cost drivers (at around 50-60€ per month). And then there are the dynamically provisioned ones, which appear to be Standard HDD by default; why is that? I actually see they are technically faster than small premium disks, though.
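Not from this thread, but for completeness: if the default class's disk SKU is the concern, a custom StorageClass can pin it explicitly. A minimal sketch using the in-tree kubernetes.io/azure-disk provisioner; the class name and SKU are arbitrary choices:

    # Hypothetical StorageClass that provisions Premium managed disks instead of
    # whatever the cluster's built-in "default" class uses.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: premium-managed
    provisioner: kubernetes.io/azure-disk
    parameters:
      storageaccounttype: Premium_LRS
      kind: Managed

A PVC would then select it via storageClassName: premium-managed.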

@jackfrancis
Member

We're gonna do this here: #2799

@andyzhangx suggests that this is already the case; that if your VM supports managed disk, you'll get it by default. Let me know if you experience otherwise!
