[FEATURE] fairness scores should work on non-binary indicators #264
I'm currently having a go at this. I have an implementation that passes the tests. Thoughts @MBrouns? Mainly the line with the pair-wise divisions:

```python
import itertools as it
import warnings

import numpy as np


def equal_opportunity_score(sensitive_column, positive_target=1):
    r"""
    The equal opportunity score calculates the minimum ratio between the probabilities of a
    **true positive** outcome given each pair of values of the sensitive attribute (column).

    .. math::
        \min_{z_i, z_j} \frac{P(\hat{y}=1 | z=z_i, y=1)}{P(\hat{y}=1 | z=z_j, y=1)}

    This is especially useful in situations where "fairness" is a theme.

    Usage:
    `equal_opportunity_score('gender')(clf, X, y)`

    Source:
    - M. Hardt, E. Price and N. Srebro (2016), Equality of Opportunity in Supervised Learning

    :param sensitive_column:
        Name of the column containing the sensitive attribute (when X is a dataframe)
        or the index of the column (when X is a numpy array).
    :param positive_target: The name of the class which is associated with a positive outcome
    :return: a function (clf, X, y_true) -> float that calculates the equal opportunity score
        for z = column
    """
    def impl(estimator, X, y_true):
        """Remember: X is the thing going *in* to your pipeline."""
        sensitive_col = (
            X[:, sensitive_column] if isinstance(X, np.ndarray) else X[sensitive_column]
        )
        y_hat = estimator.predict(X)
        p_ys_zs = []
        for subgroup in np.unique(sensitive_col):
            y_given_zi_yi = y_hat[(sensitive_col == subgroup) & (y_true == positive_target)]
            # If one of the subgroups has no samples with a positive true target,
            # the probability is undefined, so we warn and return 0
            if len(y_given_zi_yi) == 0:
                warnings.warn(
                    f"No samples with y_true == {positive_target} for "
                    f"{sensitive_column} == {subgroup}, returning 0",
                    RuntimeWarning,
                )
                return 0
            p_ys_zs.append(np.mean(y_given_zi_yi == positive_target))
        # Getting the min of all pair-wise divisions is the same as getting the min
        # of each mirror pair: for every pair {a, b}, both a/b and b/a occur.
        # Using min(...) over the generator also works for more than two subgroups,
        # where np.minimum(*...) would fail with too many arguments.
        score = min(pair[0] / pair[1] for pair in it.permutations(p_ys_zs, 2))
        return score if not np.isnan(score) else 1.
    return impl
```
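The question in the comment about pair-wise divisions can be sanity-checked numerically: the minimum over all ordered pairwise ratios equals the minimum over each mirror pair, since every unordered pair {a, b} contributes both a/b and b/a. A quick sketch with made-up per-group probabilities:

```python
import itertools as it

ps = [0.8, 0.4, 0.6]  # made-up per-group probabilities P(yhat=1 | z=g, y=1)

# min over all ordered pairwise ratios
all_pairs = min(a / b for a, b in it.permutations(ps, 2))

# min over each "mirror" pair: for {a, b}, the smaller of a/b and b/a
mirror_pairs = min(min(a / b, b / a) for a, b in it.combinations(ps, 2))

print(all_pairs, mirror_pairs)  # 0.5 0.5
```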
I think it makes sense in the way you've currently written it, although it would be nice to see a few test cases to show the impact and behaviour.
Thanks, good to know the logic works! I'll start working on some tests for it.
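A test case along the requested lines could look like the sketch below. It assumes the generalized implementation above (copied here in condensed form, without the empty-subgroup handling) and uses a hypothetical `FixedClassifier` stub with hard-coded predictions over a three-valued sensitive column:

```python
import itertools as it

import numpy as np
import pandas as pd


def equal_opportunity_score(sensitive_column, positive_target=1):
    # Condensed copy of the generalized implementation above
    def impl(estimator, X, y_true):
        sensitive_col = X[sensitive_column]
        y_hat = estimator.predict(X)
        p_ys_zs = [
            np.mean(
                y_hat[(sensitive_col == g) & (y_true == positive_target)]
                == positive_target
            )
            for g in np.unique(sensitive_col)
        ]
        return min(p[0] / p[1] for p in it.permutations(p_ys_zs, 2))
    return impl


class FixedClassifier:
    """Hypothetical stub that always returns the same predictions."""
    def __init__(self, preds):
        self.preds = np.asarray(preds)

    def predict(self, X):
        return self.preds


X = pd.DataFrame({"race": ["a", "a", "b", "b", "c", "c"]})
y_true = np.ones(6)
# P(yhat=1 | race=a, y=1) = 1.0, while for groups b and c it is 0.5,
# so the worst pairwise ratio is 0.5 / 1.0 = 0.5
clf = FixedClassifier([1, 1, 1, 0, 1, 0])

score = equal_opportunity_score("race")(clf, X, y_true)
print(score)  # 0.5
```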
```
ValueError: equal_opportunity_score only supports binary indicator columns for `column`. Found values ['Black' 'White']
```