Use New BIO_RW_FAILFAST_* API #3

Closed
@behlendorf

Description

As of zfs-0.4.5 we no longer unconditionally use the BIO_RW_FAILFAST flag. In newer kernels, BIO_RW_FAILFAST was replaced with BIO_RW_FAILFAST_{DEV|TRANSPORT|DRIVER}. The API change is a step in the right direction, but the vdev disk code needs to be updated to take advantage of the new API.

For now, if the legacy BIO_RW_FAILFAST flag is detected at configure time, we use it. If it is missing, we are running against a kernel with the newer API. With the new API we should be able to ensure some fairly smart behavior in the face of IO errors, but until then we are going to see a crazy number of useless retries at the lower layers.
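The configure-time fallback described above might be sketched roughly as follows. This is a userspace illustration only, not the actual vdev disk code: the `SKETCH_*` bit values, the `SKETCH_HAVE_LEGACY_FAILFAST` configure macro, and both helper functions are hypothetical stand-ins. The real flag definitions live in the kernel headers and vary by kernel version.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel failfast bits. The values here
 * are arbitrary and chosen only so the sketch compiles in userspace.
 */
#define SKETCH_FAILFAST_DEV        (1u << 0)
#define SKETCH_FAILFAST_TRANSPORT  (1u << 1)
#define SKETCH_FAILFAST_DRIVER     (1u << 2)
#define SKETCH_FAILFAST_LEGACY     (1u << 3)

/*
 * SKETCH_HAVE_LEGACY_FAILFAST is an assumed configure-time result:
 * defined when the legacy single flag exists, undefined on kernels
 * that only provide the split DEV/TRANSPORT/DRIVER flags.
 */
static inline uint32_t
sketch_failfast_mask(void)
{
#ifdef SKETCH_HAVE_LEGACY_FAILFAST
	/* Older kernels: one flag covers every layer. */
	return (SKETCH_FAILFAST_LEGACY);
#else
	/* Newer kernels: request fast failure at each layer explicitly. */
	return (SKETCH_FAILFAST_DEV |
	    SKETCH_FAILFAST_TRANSPORT |
	    SKETCH_FAILFAST_DRIVER);
#endif
}

/* Combine the caller's read/write flags with the failfast mask. */
static inline uint32_t
sketch_bio_flags(uint32_t rw_flags)
{
	return (rw_flags | sketch_failfast_mask());
}
```

With the legacy macro undefined, `sketch_bio_flags()` sets all three split bits on every request. A real implementation could choose a subset per request, which is where the smarter error handling mentioned above would come in.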

Last week I observed a real disk failure while zfs-0.4.7 was under a kpios write load on RHEL5.4. While the software did handle the failure, the BIO_RW_FAILFAST support does not appear to have worked correctly. The single drive failure hung the committing transaction group for at least 360 seconds while the low-level mptsas driver retried the IO. Additionally, as far as I can tell, the IO failure was not properly reported back to ZFS either. What we need is for the IO to fail immediately, with minimal retries at the driver and SCSI layers.

Metadata

    Labels

    Type: Feature (feature request or new feature)
