
Multiple clusters - dynamic PVCs try to access wrong storage account (of other resource group) #2768

Closed
DonMartin76 opened this issue Apr 25, 2018 · 8 comments


@DonMartin76
Contributor

Is this a request for help?: Yes


Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


What version of acs-engine?: 0.16.0


Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm): Kubernetes 1.9.7

What happened: We are provisioning persistent volumes using the default storage class. This worked fairly well in 1.9.6, but there were some issues when pods were rescheduled and the volumes had to be reattached to a different node, so we upgraded to 1.9.7. How we reproduced:

  1. In resource group A, provision a cluster. There, create a PVC using dynamic provisioning (a sketch of such a PVC is shown right after this list). This implicitly creates a storage account in that resource group, seemingly called ds<some hex number>.
  2. In a second resource group B (in the same subscription), create another cluster (this works), and then try to provision a dynamic volume. This fails with the following error message (in kubectl describe pvc):
Failed to provision volume with StorageClass "default": azureDisk - account ds6c822a4d484211eXXXXXX does not exist while trying to create/ensure default container
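For reference, a minimal PVC of this shape is enough to trigger dynamic provisioning against the cluster's default storage class; the claim name and size below are made-up illustration values, not taken from the clusters above:

    # Hypothetical PVC used to reproduce the issue (name and size are arbitrary).
    # storageClassName "default" is the class acs-engine deploys; on a blob-based
    # (non-ManagedDisks) cluster it provisions a VHD in a ds<hex> storage account.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-dynamic-pvc
    spec:
      storageClassName: default
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi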

Now, the freaky part here is that the storage account referred to in the error from resource group B actually exists, but in resource group A.

What you expected to happen: The cluster in resource group B must create its own storage account for dynamic volumes.


We will revert to 1.9.6 to see whether that helps for now, but as stated, 1.9.6 has other issues :-(. Chances are "good" that the most recent 1.10.1 release also has this exact same problem.

@DonMartin76
Contributor Author

@andyzhangx pointed me to this: kubernetes/kubernetes#55837. That issue is fixed as of 1.10.0, he says. Keeping this around for reference?

@andyzhangx
Contributor

@DonMartin76 No, it's a different bug. This bug is fixed by kubernetes/kubernetes#56474; I'll also paste my analysis here.
Root cause:

This bug only exists on blob-based VMs in v1.8.x and v1.9.x; managed-disk VMs won't have this issue.

@andyzhangx
Contributor

This bug only exists on blob-based VMs in v1.8.x and v1.9.x, so if you specify ManagedDisks when creating the k8s cluster, it won't have this issue:

    "agentPoolProfiles": [
      {
        ...
        "storageProfile" : "ManagedDisks",
        ...
      }

@jackfrancis
Member

Thanks @andyzhangx. @DonMartin76 should we classify this as a known (won't fix) issue with 1.8 and 1.9 on non-ManagedDisks VMs?

@DonMartin76
Contributor Author

@jackfrancis I don't think there's an awful lot you can do about it right now, except wait for it to be fixed upstream (I think @andyzhangx has already flagged it for cherry-picking, at least to 1.9). It's documented here then, and the fix with the managed disks is also there. I would otherwise have said it can't really be a "won't fix", as VHD is still the default for acs-engine.

Will managed disks be the default sometime in the future? Are there drawbacks or benefits, other than this issue?

@jackfrancis
Member

@DonMartin76 It's a good question about ManagedDisks being the default. The add'l Azure spend is the only thing I'm aware of that might be objectionable to folks; it would be nice to take a poll about this (does GitHub do polls? 😝)

@DonMartin76
Contributor Author

How much is that additional spend? I see one managed disk per node (32 GB premium), and additionally the bigger 512 GB etcd disk(s), which in my eyes would be the main cost drivers (at around 50-60€ per month). And then there are the dynamically provisioned ones, which appear to be Standard HDD by default; why is that? I actually see they are technically faster than small premium disks, though.
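Not from this thread, but for completeness: if the default class's disk SKU is the concern, a custom StorageClass can pin it explicitly. A minimal sketch using the in-tree kubernetes.io/azure-disk provisioner; the class name and SKU are arbitrary choices:

    # Hypothetical StorageClass that provisions Premium managed disks instead of
    # whatever the cluster's built-in "default" class uses.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: premium-managed
    provisioner: kubernetes.io/azure-disk
    parameters:
      storageaccounttype: Premium_LRS
      kind: Managed

A PVC would then select it via storageClassName: premium-managed.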

@jackfrancis
Member

We're gonna do this here: #2799

@andyzhangx suggests that this is already the case; that if your VM supports managed disk, you'll get it by default. Let me know if you experience otherwise!
