Skip to content

DLPX-67251 Device removal fails due to inconsistent device names #162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 16, 2019

Conversation

shartse
Copy link

@shartse shartse commented Nov 15, 2019

Problem
On ESX, we use by-link links to create pools and to import them on migrated systems. There are two links per device, each with the same base number but with a different prefix (wwn vs scsi). When creating a pool for the first time, we explicitly use the wwn link, if it's available.

However, it is not possible to specify this when importing a pool and since device links are created by udev asynchronously, sometimes a migrated pool can end up with a combination of wwn and scsi linked devices.

This causes issues when we try to manage the devices with zpool commands passed through the delphix application. The DE thinks the correct name of the pool is the wwn version (since it's now present), but that doesn’t actually exist in the pool and the operation fails.

Possible Solutions

  1. Expand zpool import to be able to specify precise files (and prefixes) to always pick the wwn links
  2. Wait until all links are created before importing (most likely, by calling udevadm settle)
  3. Modify the udev rules so that only one by-id link per device is created.
  4. Change DE so that it can identify devices within pools with greater flexibility.

I decided against 1 and 4 since they'd require somewhat larger scope changes in ZFS and the app-stack respectively. 2 would add extra latency to migration, especially since settle waits for all different udev events, not just the ones pertinent to the devices we care about.

I found that option 3 can be achieved by modifying the existing udev rules, so that's what I've gone with here. I've also opened an app-gate review here: http://reviews.delphix.com/r/54112 to codify the use of the scsi prefix ids by default.

In the future, I'd like to move towards a solution where we write our own udev rules that covers devices across all platforms we support to get a single "delphix-id" and hopefully reduce the complexity here and in the app-stack.

Testing
I manually tested that a migration to a VM with this change was successful and saw that the pool was created using all scsi links.

domain0                                   22.5G  44.7M  22.5G        -         -     0%     0%  1.00x    ONLINE  -
  scsi-36000c299e8f8f8e643410a9a5aaf3595  7.50G  23.8M  7.48G        -         -     0%  0.30%      -  ONLINE  
  scsi-36000c29ca33c1d0b3f1ee232d02652fd  7.50G  16.7M  7.48G        -         -     0%  0.21%      -  ONLINE  
  scsi-36000c29359be70663180556fbd93d05a  7.50G  4.22M  7.50G        -         -     0%  0.05%      -  ONLINE  

I also checked that configuring, adding and removing the devices all worked as expected on a migrated as well as a clean installed VM. I also tested the change on a GCP VM (the other platform we use by-id links on) and found no change in behavior.

Automated tests: git-ab-pre-push --test-upgrade-from 5.3.6.0 -p esx http://selfservice.jenkins.delphix.com/job/devops-gate/job/master/job/appliance-build-orchestrator-pre-push/2500/

@sebroy
Copy link
Contributor

sebroy commented Nov 15, 2019

bors delegate+

@bors
Copy link
Contributor

bors bot commented Nov 15, 2019

✌️ shartse can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

Copy link
Contributor

@pzakha pzakha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level question regarding this

2. Wait until all links are created before importing (most likely, by calling udevadm settle)

delphix-migration.service already depends on After=systemd-udev-settle.service.
From my understanding, the issue is that when we import by-id, we do so with:
zpool import -fN -d /dev/disk/by-id domain0
So there is no way to tell zfs to use scsi links instead of wwn links, is that right?

@shartse
Copy link
Author

shartse commented Nov 15, 2019

@pzakha - yeah, that's right. You can provide a directory to -d but not specific files or a prefix.

Hm, if we're already waiting for udev settle maybe my understanding of why we're getting a mix of links isn't quite right. This fix should prevent it though.

@shartse
Copy link
Author

shartse commented Nov 16, 2019

bors r+

bors bot added a commit that referenced this pull request Nov 16, 2019
162: DLPX-67251 Device removal fails due to inconsistent device names r=shartse a=shartse

**Problem**
On ESX, we use by-link links to create pools and to import them on migrated systems. There are two links per device, each with the same base number but with a different prefix (wwn vs scsi). When creating a pool for the first time, we explicitly use the wwn link, if it's available. 

However, it is not possible to specify this when importing a pool and since device links are created by udev asynchronously, sometimes a migrated pool can end up with a combination of wwn and scsi linked devices.

This causes issues when we try to manage the devices with zpool commands passed through the delphix application. The DE thinks the correct name of the pool is the wwn version (since it's now present), but that doesn’t actually exist in the pool and the operation fails. 

**Possible Solutions** 
1. Expand zpool import to be able to specify precise files (and prefixes) to always pick the wwn links
2. Wait until all links are created before importing (most likely, by calling `udevadm settle`)
3. Modify the udev rules so that only one by-id link per device is created. 
4. Change DE so that it can identify devices within pools with greater flexibility. 

I decided against 1 and 4 since they'd require somewhat larger scope changes in ZFS and the app-stack respectively. 2 would add extra latency to migration, especially since `settle` waits for all different udev events, not just the ones pertinent to the devices we care about. 

I found that option 3 can be achieved by modifying the existing udev rules, so that's what I've gone with here. I've also opened an app-gate review here: http://reviews.delphix.com/r/54112 to codify the use of the scsi prefix ids by default.  

In the future, I'd like to move towards a solution where we write our own udev rules that covers devices across all platforms we support to get a single "delphix-id" and hopefully reduce the complexity here and in the app-stack.

**Testing**
I manually tested that a migration to a VM with this change was successful and saw that the pool was created using all scsi links. 
```
domain0                                   22.5G  44.7M  22.5G        -         -     0%     0%  1.00x    ONLINE  -
  scsi-36000c299e8f8f8e643410a9a5aaf3595  7.50G  23.8M  7.48G        -         -     0%  0.30%      -  ONLINE  
  scsi-36000c29ca33c1d0b3f1ee232d02652fd  7.50G  16.7M  7.48G        -         -     0%  0.21%      -  ONLINE  
  scsi-36000c29359be70663180556fbd93d05a  7.50G  4.22M  7.50G        -         -     0%  0.05%      -  ONLINE  
```
I also checked that configuring, adding and removing the devices all worked as expected on a migrated as well as a clean installed VM. I also tested the change on a GCP VM (the other platform we use by-id links on) and found no change in behavior. 

Automated tests: `git-ab-pre-push --test-upgrade-from 5.3.6.0 -p esx` http://selfservice.jenkins.delphix.com/job/devops-gate/job/master/job/appliance-build-orchestrator-pre-push/2500/

Co-authored-by: sara hartse <sara.hartse@delphix.com>
@bors
Copy link
Contributor

bors bot commented Nov 16, 2019

Build succeeded

  • continuous-integration/travis-ci/push

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants