-
Notifications
You must be signed in to change notification settings - Fork 3.9k
ARROW-765: [Python] Add more natural Exception type hierarchy for thirdparty users #489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Change-Id: Id73771a791180dd48a6855355d0e42b605fdbd47
…arty users Change-Id: Ib137246f3ee39bd622a27ff261f438366a6e257b
| pass | ||
|
|
||
|
|
||
| class ArrowInvalid(ValueError, ArrowException): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this an antipattern (deriving from a built-in exception or, worse, raising them directly)? If a user combines some Arrow code with their own in a try block, they may inadvertently catch, say, an ArrowInvalid Exception when they explicitly were trying to handle a ValueError for non-arrow code in the try block.
tl;dr if you structure your Exceptions like this, client code has no idea from where the issue originated and has no easy way of disambiguating in the except block (if they're even aware of the Arrow Exception class hierarchy to begin with).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an antipattern? I have never thought there was anything wrong either with
- deriving from built-in exceptions, like ValueError
- raising built-in exceptions directly (pretty much all Python data libraries -- pandas, NumPy, scikit-learn, etc. already do this)
So there's a couple ways to go:
- Don't define user-defined exceptions, raise built-in exceptions only
- Don't derive from a base ArrowException -- which is there to help indicate that the error arose in C++ code
- Derive from base Exception (worst option, IMHO)
adding @jreback for more opinions (I did this at his asking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my main concern was to NOT have the arrow exception hierarchy just be based on Exception. Being able to catch ValueError, TypeError, KeyError, IndexError (subclass) as errors is very pythonic.
As for the naming having ArrowTypeError(ArrowException, TypeError). I am not how useful that actually is in practice. I would say if you have a need to differentiate between two (or more TypeError s), then having a custom exception is useful (as opposed to having to check the actual error message).
So fine with this PR (though maybe raise the standard python exceptions, unless you have a real need for them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have a couple of these in pandas: https://github.com/pandas-dev/pandas/blob/master/pandas/errors/__init__.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference is to only raise these exceptions for errors that originate from the C++ libraries, otherwise use the Python built-in ones. Sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm.
I also took the liberty of changing a number of error types in libarrow_python