source/getting_started/concepts.rst (1 addition & 1 deletion)
@@ -24,7 +24,7 @@ advances enable all of Engine's insight and analysis capabilities, including sta

 - Outperform other commonly used feature importance metrics, including SHAP.

-Howso quantifies individual feature contributions to a prediction, i.e., how much an individual feature impacts a prediction. The concept of feature contribution is similar to the data science concept of "feature importance". However,
+Howso quantifies individual prediction contributions to a prediction, i.e., how much an individual feature impacts a prediction. The concept of prediction contributions is similar to the data science concept of "feature importance". However,
 Howso is robust against several common challenges (correlated features, redundant features, difference in scale between features, and multiple distinguishing features) faced by other feature importance tools,
 including the SHAP metric, which often lead to misleading results.
source/getting_started/intro.rst (1 addition & 1 deletion)
@@ -69,7 +69,7 @@ Howso values gracious intellectual honesty. In that spirit, we're telling you up
 - Very large datasets

 Handling very large datasets with subtle signals (e.g., datasets requiring tens of millions of records and/or thousands of features to capture the complex relationships within the data)
-currently requires manual work from engineering, data science, and subject matter expert teams. However, currently available Howso tools, including ablation and non-robust feature contribution calculations,
+currently requires manual work from engineering, data science, and subject matter expert teams. However, currently available Howso tools, including ablation and non-robust prediction contribution calculations,
 can be used to help identify subsamples of large datasets that
 contain enough signal to be used for data science analysis.
source/getting_started/terminology.rst (15 additions & 16 deletions)
@@ -98,40 +98,39 @@ The mean absolute error between a predicted value and actual value for a predict
 uncertainty. Residuals may be for a given prediction, and expected Residuals may be for a given feature, either
 globally across the entire model or for a particular prediction.

-.. _contribution:
+.. _pc:

-Contribution
-------------
+Prediction Contributions (PC)
+-----------------------------

-Feature contribution is the difference between a prediction in an action feature when each feature or case is
-considered versus not considered. Case contribution is the same but for a case rather than a feature. When applied in
+Prediction contributions is the measured difference between a prediction in an action feature when each feature (Feature Prediction Contributions)
+or case (Case Prediction Contributions) is considered versus not considered. When Feature Prediction Contributions is applied in
 a robust fashion, this is an approximation of the commonly used SHAP feature importance measure. The difference being
 that SHAP is an exact value of a model (which itself is just an approximation of the data) whereas robust contribution is an
 approximation of the feature importance of the relationships expressed in the data.

-.. _mda:
+.. _ac:

-MDA
----
-
-The *Mean Decrease in Accuracy* (MDA) of an Action Feature is mean decrease in accuracy of removing a feature. MDA units are on the same scale as the Action feature(s), and will be probabilities for categorical features.
+Accuracy Contributions (AC)
+---------------------------
+Accuracy contributions is the accuracy difference in an action feature when each feature (Feature Accuracy Contributions)
+or case (Case Accuracy Contributions) is considered versus not considered.
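The two definitions above ("considered versus not considered") can be illustrated with a small, self-contained sketch. This is not Howso's algorithm: a toy 1-nearest-neighbor predictor stands in for the model, all helper names and data are hypothetical, and the sign convention for the accuracy contribution (error without the feature minus error with it) is an assumption.

```python
# Toy illustration only; none of these names come from the Howso API.

def predict(train, query, features, action):
    """Return the action value of the training case nearest to `query` on `features`."""
    def distance(case):
        return sum((case[f] - query[f]) ** 2 for f in features)
    return min(train, key=distance)[action]


def prediction_contribution(train, query, features, action, feature):
    """Absolute change in one prediction when `feature` is dropped from the context."""
    reduced = [f for f in features if f != feature]
    with_feature = predict(train, query, features, action)
    without_feature = predict(train, query, reduced, action)
    return abs(with_feature - without_feature)


def accuracy_contribution(train, test, features, action, feature):
    """Increase in mean absolute error over `test` when `feature` is dropped (assumed sign)."""
    def mae(feats):
        errors = [abs(predict(train, case, feats, action) - case[action]) for case in test]
        return sum(errors) / len(errors)
    reduced = [f for f in features if f != feature]
    return mae(reduced) - mae(features)


# Hypothetical data: "x" drives "y", "noise" does not.
train = [{"x": 0, "noise": 3, "y": 0}, {"x": 10, "noise": 1, "y": 10}]
test = [{"x": 9, "noise": 3, "y": 9}, {"x": 1, "noise": 1, "y": 1}]
features = ["x", "noise"]

print(prediction_contribution(train, test[0], features, "y", "x"))      # 10
print(prediction_contribution(train, test[0], features, "y", "noise"))  # 0
print(accuracy_contribution(train, test, features, "y", "x"))           # 8.0
print(accuracy_contribution(train, test, features, "y", "noise"))       # 0.0
```

In this toy data, dropping ``x`` changes both the prediction and the error, while dropping ``noise`` changes neither, which is the distinction the two metrics are meant to surface.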

 .. _robust:

 Robust
 ------

-A feature or case contribution or MDA that is robust means that it is computed over the power set of possible
-combinations of features or cases, as approximated by a uniform distribution. For feature contributions, robust means
+A feature or case contribution that is robust means that it is computed over the power set of possible
+combinations of features or cases, as approximated by a uniform distribution. For prediction contributions, robust means
 it is an approximation to the well-known SHAP values.
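As a rough illustration of "computed over the power set", the sketch below averages a feature's marginal effect over every subset of the remaining features. It is a simplification under stated assumptions: the value function and weights are hypothetical, the enumeration is exhaustive (feasible only for a handful of features, whereas the text above describes an approximation), and every subset is weighted equally, whereas exact Shapley values weight subsets by their size.

```python
from itertools import combinations


def power_set(items):
    """Yield every subset of `items`, from the empty tuple up to the full tuple."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def robust_contribution(value, features, feature):
    """Mean |value(S + feature) - value(S)| over all subsets S of the other features."""
    others = [f for f in features if f != feature]
    subsets = list(power_set(others))
    total = sum(abs(value(set(s) | {feature}) - value(set(s))) for s in subsets)
    return total / len(subsets)


# Hypothetical additive "model": the value of a feature subset is the sum of its weights.
WEIGHTS = {"a": 2.0, "b": 0.5, "c": 0.0}


def toy_value(subset):
    return sum(WEIGHTS[f] for f in subset)


for name in WEIGHTS:
    print(name, robust_contribution(toy_value, list(WEIGHTS), name))
```

For an additive value function like this one, each feature's averaged marginal effect equals its own weight, which makes the sketch easy to check by hand.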

 .. _relavant_features:

 Relevant Features
 -----------------

-Features whose values were important in determining prediction value(s). Generally, this refers to feature MDA or
-contribution, which yield similar but complementary insights.
+Features whose values were important in determining prediction value(s). Generally, this refers to prediction or accuracy contributions, which yield similar but complementary insights.

 .. _contexts:

@@ -313,8 +312,8 @@ Influential Cases

 The cases which were identified as most influential during a prediction, along with their weights when predicting the
 expected value or drawing a value from the distribution of expected values for generative outputs. The influential
-cases are a subset of the :ref:`most_similar_cases`, returning only those cases whose cumulative influence weights added in
-descending order is below the influential weight threshold.
+cases are a subset of the :ref:`most_similar_cases`, returning only those cases whose cumulative influence weights added in
+descending order is below the influential weight threshold.
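Read literally, the selection rule in this definition can be sketched as follows. The function name, the case identifiers, and the weights are hypothetical, and the boundary handling (stopping before the case that would reach the threshold) is an assumption; the engine's actual rule may differ.

```python
def select_influential(weights, threshold):
    """Keep cases, in descending weight order, while the running total stays below `threshold`.

    `weights` maps a case identifier to its influence weight.
    """
    selected = []
    cumulative = 0.0
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    for case_id, weight in ranked:
        if cumulative + weight >= threshold:
            break  # adding this case would reach or exceed the threshold
        cumulative += weight
        selected.append(case_id)
    return selected


# Hypothetical influence weights for four similar cases.
weights = {"c1": 0.5, "c2": 0.25, "c3": 0.125, "c4": 0.125}
print(select_influential(weights, threshold=0.8))  # ['c1', 'c2']
```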
source/user_guide/advanced_capabilities/case_importance.rst (12 additions & 13 deletions)
@@ -29,9 +29,8 @@ Concepts & Terminology

 How-To Guide
 ------------
-Case importance is similar to feature importance in that it comprises of two metrics, case mean decrease in accuracy (MDA) and case contribution.
-As opposed to influential and similar cases which examines the influence of cases on a single case or prediction, case importance examines how important a case is in regards to the overall predictions on a group of cases. Case importance share the same underlying methodology with :doc:`Feature Importance <feature_importance>`.
-Unlike feature contributions, case contributions are calculated just locally. Conceptually, local metrics use either a specific subset of the cases that are trained into the Trainee or a set of new cases.
+Case importance is similar to feature importance in that it comprises of two metrics, Accuracy Contributions for Case and Prediction Contributions for Case.
+Unlike global feature importance metrics, case contributions are calculated just locally. Conceptually, local metrics use either a specific subset of the cases that are trained into the Trainee or a set of new cases.

 Setup
 ^^^^^
@@ -41,19 +40,19 @@ The :class:`~Trainee` will be referenced as ``trainee`` in the sections below.
 Case Contributions
 ^^^^^^^^^^^^^^^^^^

-Case contributions can be retrieved by setting ``case_contributions_robust`` or ``case_contributions_full`` to ``True``.
+Case contributions can be retrieved by setting ``case_robust_prediction_contributions`` or ``case_full_prediction_contributions`` to ``True``.
source/user_guide/advanced_capabilities/feature_importance.rst (24 additions & 47 deletions)
@@ -5,13 +5,13 @@ Feature Importance
 ==================
 .. topic:: What is covered in this user guide

-In this guide, you will learn how to compute the feature importance metrics, :ref:`Feature Contributions <contribution>` and :ref:`Feature Mean Decrease in Accuracy (MDA) <mda>` from a Trainee. Feature importance metrics
+In this guide, you will learn how to compute the feature importance metrics, :ref:`Prediction Contributions (PC) <pc>` and :ref:`Accuracy Contributions (AC) <ac>` from a Trainee. Feature importance metrics
 provides information about which features are useful for predicting a target or :ref:`action <action_features>` feature. In addition to learning informative metrics about the data and the model, these insights can be used as guidance for further action such as feature selection or feature engineering.


 Objectives: what you will take away
 -----------------------------------
-- **How-To** Retrieve the different types of feature importance metrics across several different categories: :doc:`global vs local <../concepts/global_vs_local>`, and :ref:`robust` vs non-robust (full) :ref:`Feature Contributions <contribution>` and :ref:`Feature MDA <mda>`.
+- **How-To** Retrieve the different types of feature importance metrics across several different categories: :doc:`global vs local <../concepts/global_vs_local>`, and :ref:`robust` vs non-robust (full) :ref:`Prediction Contributions <pc>` and :ref:`Accuracy Contributions <ac>`.


 Prerequisites: before you begin
@@ -33,9 +33,9 @@ recommend being familiar with the following concepts:
 - :ref:`residual`
 - :ref:`robust`
 - :ref:`contribution`
-- :ref:`mda`
+- :ref:`ac`

-The two metrics available for feature importance is feature :ref:`contribution` and feature :ref:`mda`.
+The two metrics available for feature importance is feature :ref:`contribution` and feature :ref:`ac`.

 Robust vs Non-Robust (Full)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -52,34 +52,34 @@ The created :class:`~Trainee` will be referenced as ``trainee`` in the sections
 Global Feature Importance
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 To get global feature importance metrics, :py:meth:`Trainee.react_aggregate`, is called on a trained and analyzed Trainee. :py:meth:`Trainee.react_aggregate` calls react internally on the cases already trained into the Trainee and calculates the metrics. In this method, the desired metrics can be selected as parameters. These parameters are named individually
-in the ``details`` parameter and setting them to ``True`` will calculate and return the desired metrics. For example, ``feature_mda_robust`` and ``feature_contributions_robust`` will calculate the robust versions of MDA and Feature Contributions, while ``feature_mda_full`` and ``feature_contributions_full`` will calculate the non-robust (full) versions.
-An action feature must be specified. ``feature_influences_action_feature`` is recommended for feature influence metrics such as feature contributions and mda, especially when used in conjunction with retrieving prediction stats, however, ``action_feature`` can be also used as well. ``action_feature`` sets the action feature for both influence metrics and prediction stats. Since often
+in the ``details`` parameter and setting them to ``True`` will calculate and return the desired metrics. For example, ``feature_robust_accuracy_contributions`` and ``feature_robust_prediction_contributions`` will calculate the robust versions of Accuracy Contributions and Prediction Contributions, while ``feature_full_accuracy_contributions`` and ``feature_full_prediction_contributions`` will calculate the non-robust (full) versions.
+An action feature must be specified. ``feature_influences_action_feature`` is recommended for feature influence metrics such as prediction contributions and accuracy contributions, especially when used in conjunction with retrieving prediction stats, however, ``action_feature`` can be also used as well. ``action_feature`` sets the action feature for both influence metrics and prediction stats. Since often
 only the influence metrics's action feature is intended to be set, ``feature_influences_action_feature`` provides a more precise parameter.
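A minimal sketch of the global call described above, using only the names that appear in this diff (``react_aggregate``, ``details``, ``feature_influences_action_feature``, and the four renamed detail keys). The helper functions, the ``"target"`` feature name, and the recording stand-in class are hypothetical; the stand-in only echoes its arguments so the sketch runs without Howso installed, and the shape of the real return value is not shown here.

```python
def global_importance_details(robust=True):
    """Build the ``details`` dict for the robust or non-robust (full) metrics."""
    kind = "robust" if robust else "full"
    return {
        f"feature_{kind}_accuracy_contributions": True,
        f"feature_{kind}_prediction_contributions": True,
    }


def global_feature_importance(trainee, action_feature, robust=True):
    """Request global Accuracy and Prediction Contributions for one action feature."""
    return trainee.react_aggregate(
        details=global_importance_details(robust),
        feature_influences_action_feature=action_feature,
    )


class RecordingTrainee:
    """Hypothetical stand-in for a trained and analyzed Trainee; it echoes the arguments."""

    def react_aggregate(self, **kwargs):
        return kwargs


# "target" is a placeholder action feature name.
request = global_feature_importance(RecordingTrainee(), "target", robust=False)
print(request["details"])
print(request["feature_influences_action_feature"])
```

With a real Trainee, the same helper would be called as ``global_feature_importance(trainee, "target")``, assuming ``trainee`` has already been trained and analyzed.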
-To get local feature importance metrics, :py:meth:`Trainee.react`, is first called on a trained and analyzed Trainee. In this method, the desired metrics, ``feature_contributions_robust`` and ``feature_mda_robust``, can be selected as inputs to the ``details`` parameters as key value pairs from a dictionary. These parameters are named individually
+To get local feature importance metrics, :py:meth:`Trainee.react`, is first called on a trained and analyzed Trainee. In this method, the desired metrics, ``feature_robust_prediction_contributions`` and ``feature_robust_accuracy_contributions``, can be selected as inputs to the ``details`` parameters as key value pairs from a dictionary. These parameters are named individually
 and setting them to ``True`` will calculate the desired metrics. Robust calculations are performed by default.

 .. code-block:: python

     details = {
-        'feature_contributions_robust':True,
-        'feature_mda_robust':True,
+        'feature_robust_prediction_contributions':True,
+        'feature_robust_accuracy_contributions':True,
     }

     results = trainee.react(
@@ -94,31 +94,14 @@ are calculated in :py:meth:`Trainee.react` from the previous step.
-Contributions and MDA are also metrics for cases and not just features, so please be aware when reading other guides that may use those terms.
+Accuracy and Prediction Contributions are also metrics for cases and not just features, so please be aware when reading other guides that may use those terms.

-Contribution and MDA matrices
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Howso also provides the two metrics in a matrix view, where for each row which represent the action feature, you can identify the contributions of all
-the other context features to that prediction. Since these matrices may not be symmetrical, examining the differences between the upper and lower triangular matrices
-may reveal additional insights. Please see the linked recipe for more information.
-
-:meth:`Trainee.get_contribution_matrix` and :meth:`Trainee.get_mda_matrix` gets these matrices respectively.