device name change due to azure disk host cache setting #60344
Labels: kind/bug, sig/azure
andyzhangx changed the title from "fix device name change due to azure disk host cache setting" to "device name change due to azure disk host cache setting" on Feb 25, 2018.
k8s-github-robot pushed a commit that referenced this issue on Feb 25, 2018:
Automatic merge from submit-queue (batch tested with PRs 60346, 60135, 60289, 59643, 52640). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

fix device name change issue for azure disk

**What this PR does / why we need it**: fixes the device name change issue for Azure disks, caused by the default host cache setting changing from `None` to `ReadWrite` in v1.7, while the default host cache setting in the Azure portal is `None`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60344, #57444
Also fixes the following issues: Azure/acs-engine#1918, Azure/AKS#201

**Special notes for your reviewer**:
From v1.7 the default host cache setting changed from `None` to `ReadWrite`. This leads to device name changes after attaching multiple disks to an Azure VM, and finally leaves the disks inaccessible from the pod. For example: a StatefulSet with 8 replicas (each with an Azure disk) on one node will always fail; according to my observation, adding the 6th data disk always makes a device name change, and some pods cannot access their data disk after that. I have verified this fix on v1.8.4.

Without this PR on one node (device name changes):
```
azureuser@k8s-agentpool2-40588258-0:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdk
    ├── lun1 -> ../../../sdj
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun4 -> ../../../sdg
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

With this PR on one node (no device name change):
```
azureuser@k8s-agentpool2-40588258-1:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdc
    ├── lun1 -> ../../../sdd
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

In the following, `myvm-0` and `myvm-1` are crashing due to the device name change; after the controller manager replacement, the `myvm2-x` pods work well.
```
Every 2.0s: kubectl get po                        Sat Feb 24 04:16:26 2018

NAME      READY     STATUS             RESTARTS   AGE
myvm-0    0/1       CrashLoopBackOff   13         41m
myvm-1    0/1       CrashLoopBackOff   11         38m
myvm-2    1/1       Running            0          35m
myvm-3    1/1       Running            0          33m
myvm-4    1/1       Running            0          31m
myvm-5    1/1       Running            0          29m
myvm-6    1/1       Running            0          26m
myvm2-0   1/1       Running            0          17m
myvm2-1   1/1       Running            0          14m
myvm2-2   1/1       Running            0          12m
myvm2-3   1/1       Running            0          10m
myvm2-4   1/1       Running            0          8m
myvm2-5   1/1       Running            0          5m
myvm2-6   1/1       Running            0          3m
```

**Release note**:
```
fix device name change issue for azure disk
```

/assign @karataliu
/sig azure
@feiskyer could you mark it for the v1.10 milestone?
@brendandburns @khenidak @rootfs @jdumars FYI
Since it's a critical bug, I will cherry-pick this fix to v1.7-v1.9; note that v1.6 does not have this issue since its default cachingmode is `None`.
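Not part of the original PR, but for clusters that cannot pick up the fix right away, a minimal workaround sketch is to pin the caching mode explicitly in the StorageClass used for dynamic provisioning. The class name `azure-disk-no-cache` and the `storageaccounttype` value below are illustrative placeholders, and whether the `cachingmode` parameter is honored depends on the provisioner version in use:

```yaml
# Workaround sketch (assumption, not part of the PR): explicitly pin the
# Azure data disk host caching mode so the provisioner default does not apply.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-disk-no-cache        # hypothetical name
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS # illustrative SKU
  kind: Managed
  cachingmode: None                # avoid the ReadWrite default discussed above
```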
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
From v1.7, the default host cache setting changed from `None` to `ReadWrite`. This leads to device name changes after attaching multiple disks to an Azure VM, and finally leaves the disk inaccessible from the pod. The caching mode can also be set explicitly, as in the sketch below.
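For disks referenced statically (without dynamic provisioning), the setting in question surfaces as the `cachingMode` field of the `azureDisk` volume source. A minimal sketch, assuming an existing managed disk; the pod name, disk name, and disk URI are placeholders:

```yaml
# Sketch only: pod mounting an existing Azure managed disk with an explicit
# caching mode, instead of relying on the default that changed in v1.7.
apiVersion: v1
kind: Pod
metadata:
  name: azure-disk-cache-demo      # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    azureDisk:
      kind: Managed
      diskName: mydisk             # placeholder
      diskURI: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/disks/mydisk  # placeholder
      cachingMode: None
```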
What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
A StatefulSet with 8 replicas (each with an Azure disk) scheduled onto one node will always fail due to the device name change; a sketch of such a StatefulSet follows.
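A minimal sketch of such a StatefulSet (not the reporter's original manifest, which is not included here): names, image, and sizes are illustrative, the StorageClass is assumed to be backed by `kubernetes.io/azure-disk`, and on clusters older than v1.9 the `apiVersion` may need to be `apps/v1beta1` or `apps/v1beta2`. To hit the problem, the pods also need to land on the same node, e.g. via a nodeSelector.

```yaml
# Reproduction sketch: 8 replicas, each claiming its own Azure disk, so that
# enough attach operations happen on a single node to trigger the rename.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myvm                       # matches the pod names shown above
spec:
  serviceName: myvm
  replicas: 8
  selector:
    matchLabels:
      app: myvm
  template:
    metadata:
      labels:
        app: myvm
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sh", "-c", "while true; do date >> /mnt/data/out; sleep 10; done"]
        volumeMounts:
        - name: data
          mountPath: /mnt/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: default    # assumed azure-disk backed class
      resources:
        requests:
          storage: 1Gi
```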
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): v1.7 - v1.10
- Kernel (e.g. `uname -a`):

/sig azure
/assign