Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows PartitionsWithContext is over cautious with error handling. #572

Open
scudette opened this issue Aug 22, 2018 · 5 comments
Open

Comments

@scudette
Copy link

Windows PartitionsWithContext exits the function in several places where the win32 API can return errors for certain operations. But exiting the entire function means that no partition information is returned at all. This is problematic because programs that rely on this function suddenly and unexpectedly break when inaccessible volumes are attached.

For example attaching a locked bitlocker volume to the system will cause any program to return 0 drives and an error. This happens because provGetVolumeInformation.Call() returns an error on this volume something like "This drive is locked by BitLocker".

This function should return all the information it has on all drives - even if it cant get some information due to the error. Let the caller deal with handling certain errors.

@Lomanic
Copy link
Collaborator

Lomanic commented Aug 23, 2018

So basically we should not return ASAP at

return ret, err
, problem is then which error should be returned if we have more than one error while we continue execution, and how to link numerous errors to their respective partition?

I see different options:

  • don't throw the error at line 116, append a PartitionStat with empty Fstype and Opts in ret instead. I can't say if the returned error should be nil or not.
  • same as above, but add and set with the error a new Err field in PartitonStat, not very idiomatic but might not break too much downstream code. Still can't say if the returned error should be nil or not.
  • return an array of errors instead (breaks back compatibility so for the future v3 Plan to v3 #362). One error (may be nil) for each PartitonStat returned?
  • instead of returning an array of errors, return a single error with a new type implementing the error type (see golang blog) with a field showing which PartitionStat failed with which error. The error message might tell the API consumer to check this field so they might take action on each PartitonStat affected. Looks the most idiomatic to me and it doesn't break back compatibility (for the majority of downstream codebases using ret, err := PartitionsWithContext(); err != nil at least).

To sum up:

  • should the error at line 116 be simply discarded?
  • should we implement our own error type with more details for each failed PartitionStat?
  • should we return an array of errors?

I'm leaning towards the custom error type.
Your advice @shirou?

@scudette
Copy link
Author

Option 3 would not break downstream code but it would also not fix them either. If callers do the naive if err != nil thing then they would still be surprised when an error is raised under some situations (as we were). Code would need to active be changed to check the type of the error to see if something could still be salvaged even if an error is returned.

My preference is that the PartitionStat struct be extended to contain the error for each partition if the callers cared to inspect it, otherwise just leave the fstype and options fields empty.

@shirou
Copy link
Owner

shirou commented Aug 25, 2018

Perhaps I don't get what the actual problem is, but from my understanding the issue is, only an error will be returned if invalid partition exists.

First of all, PartitionsWithContext may returns error as declared return value. Error handling is caller responsibility. So we don't care about if some programs break without certain error handling.

Second of all, if there are several partitions and only one partition has error, PartitionsWithContext returns an error strill there are some normal partition exists. But changing return value to []error breaks compatibility. So my prefer opinion is @Lomanic 's option 4, create an error which has []error inside. But it is a little bit hard to use from other programs, so I think just make a csv string, and return error with that string.

This means, like this.

	var errors []string
	for _, v := range lpBuffer {
		(snip)
				if driveret == 0 {
					if typeret == 5 || typeret == 2 {
						continue //device is not ready will happen if there is no disk in the drive
					}
					errors = append(errors, fmt.Sprintf("%s has error", typepath))
					continue
				}
				(snip)
	}
	if len(errors) != 0 {
		return ret, errors.New(strings.Join("," errors))
	}
	return ret, nil

What do you think?

@scudette
Copy link
Author

This works for me. The main thing is that it continues the loop and finishes enumerating all the drives. It perfectly fine to omit the drives which have errors but at least we can get the other drives.

The use pattern will be a little more complicated than just check for err != nil but that's ok. One will have to check for len of result > 0 means there are valid results even if err != nil, if the caller cares about other partitions then they would look in err more closely.

The more idiomatic approach is to make a new Error type with a list of errors in it, then users can check for it specifically. It will not break existing code since users will need to type assert it to access the new functionality (existing code will just see an error and do what it already does - but of course it will still be broken because it will probably raise the error up the stack.):

type PartitionErrors struct {
    PartitionErrors []string
}

func (e PartitionErrors) Error() string {
   ....
}

Then users can just check for it if they cared:

res, err := PartitionsWithContext(...)
if err != nil {
     switch t:= err.(type) {
     case PartitionErrors:     // Handle partial failure mode 
         ....
     default:     // Something else went wrong - bubble error up.
       return err   

Either of the two options are ok moving forward but they do not fix existing usage. As you say " Error handling is caller responsibility. So we don't care about if some programs break without certain error handling." but this is the whole point - existing callers are handling the errors and because of this handling they reject correct results and receive an unexpected bug (i.e. they get total failure and no results in cases where there is only a partial failure).

You are effectively changing the contract of the function moving forward.

@Lomanic
Copy link
Collaborator

Lomanic commented Aug 27, 2018

You are effectively changing the contract of the function moving forward.

Not that much I think. Previous code using err != nil for error checking still errors out with the same conditions (and this is intended), but with more PartitonStat returned (not stopping at first error) and the consumer can now type-assert this error to check if it only failed to fetch some fs properties and act accordingly.

Not returning the error encountered while retrieving fs properties is breaking back-compatibility in a harder way than this new custom error type. Between minor releases we have to keep the same behavior and API.

Adding the Err field in PartitionStat adds another way to check errors while we already have the error returned by the function. I can find some implementations using custom error types to wrap a list of errors (even in the stdlib), it looks like the consensus in this reddit thread and feels more idiomatic to me.

Of course the documentation will have to be updated to show how to properly check this error and how to know which PartitionStat failed.

One will have to check for len of result > 0 means there are valid results even if err != nil

This can be the case with the current code, if we fail at the mentioned line 116 we don't return empty results but what was found until this error occurred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants