[ML] Return total SHAP per feature as a new result type #1387
Merged
19 commits
10e5a6c code commit (valeriy42)
df13ed0 Unit test added (valeriy42)
874140c changelog updated (valeriy42)
6f40db9 unit test updated (valeriy42)
38a180a use accumulate (valeriy42)
08bd4ec solution with unique ptr compiles (valeriy42)
13357af cleaning up (valeriy42)
3f7ec0a total importance mean variance (valeriy42)
0ba97a0 total importance mean variance min max (valeriy42)
30c109b remove variance (valeriy42)
f7689c3 Merge branch 'total-shap' of https://github.com/valeriy42/ml-cpp into… (valeriy42)
d3758d4 Merge branch 'master' of https://github.com/elastic/ml-cpp into total… (valeriy42)
8700f5f Fixing unit tests (valeriy42)
6fe6399 cleaning up (valeriy42)
f8126c8 multiclass format change (valeriy42)
1672019 change result format for binary classification (valeriy42)
3f8f6c2 Unit tests extended (valeriy42)
1c3cfaf remove const_cast (valeriy42)
1452f28 fix test failure (valeriy42)
@@ -0,0 +1,67 @@
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License;
 * you may not use this file except in compliance with the Elastic License.
 */
#ifndef INCLUDED_ml_api_CInferenceModelMetadata_h
#define INCLUDED_ml_api_CInferenceModelMetadata_h

#include <maths/CBasicStatistics.h>
#include <maths/CLinearAlgebraEigen.h>

#include <api/CInferenceModelDefinition.h>
#include <api/ImportExport.h>

#include <string>

namespace ml {
namespace api {

//! \brief Class controls the serialization of the model meta information
//! (such as total feature importance) into JSON format.
class API_EXPORT CInferenceModelMetadata {
public:
    static const std::string JSON_CLASS_NAME_TAG;
    static const std::string JSON_CLASSES_TAG;
    static const std::string JSON_FEATURE_NAME_TAG;
    static const std::string JSON_IMPORTANCE_TAG;
    static const std::string JSON_MAX_TAG;
    static const std::string JSON_MEAN_MAGNITUDE_TAG;
    static const std::string JSON_MIN_TAG;
    static const std::string JSON_MODEL_METADATA_TAG;
    static const std::string JSON_TOTAL_FEATURE_IMPORTANCE_TAG;

public:
    using TVector = maths::CDenseVector<double>;
    using TStrVec = std::vector<std::string>;
    using TRapidJsonWriter = core::CRapidJsonConcurrentLineWriter;

public:
    //! Writes metadata using \p writer.
    void write(TRapidJsonWriter& writer) const;
    void columnNames(const TStrVec& columnNames);
    void classValues(const TStrVec& classValues);
    const std::string& typeString() const;
    //! Add importances \p values to the feature with index \p i to calculate total feature importance.
    //! Total feature importance is the mean of the magnitudes of importances for individual data points.
    void addToFeatureImportance(std::size_t i, const TVector& values);

private:
    using TMeanVarAccumulator = maths::CBasicStatistics::SSampleMeanVar<TVector>::TAccumulator;
    using TMinMaxAccumulator = std::vector<maths::CBasicStatistics::CMinMax<double>>;
    using TSizeMeanVarAccumulatorUMap = std::unordered_map<std::size_t, TMeanVarAccumulator>;
    using TSizeMinMaxAccumulatorUMap = std::unordered_map<std::size_t, TMinMaxAccumulator>;

private:
    void writeTotalFeatureImportance(TRapidJsonWriter& writer) const;

private:
    TSizeMeanVarAccumulatorUMap m_TotalShapValuesMeanVar;
    TSizeMinMaxAccumulatorUMap m_TotalShapValuesMinMax;
    TStrVec m_ColumnNames;
    TStrVec m_ClassValues;
};
}
}

#endif //INCLUDED_ml_api_CInferenceModelMetadata_h
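
The header comment on `addToFeatureImportance` defines total feature importance as the mean of the magnitudes of the per-data-point importances, and the private members keep a mean accumulator and a min/max accumulator per feature index. The following is a minimal standalone sketch of that bookkeeping, not the code from this pull request: `SClassStats` and `CTotalImportanceSketch` are hypothetical names, plain `std::vector<double>` stands in for `maths::CDenseVector<double>`, and the sketch assumes min and max are tracked on the signed values, which the header does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the accumulators declared in the header;
// it does not use maths::CBasicStatistics.
struct SClassStats {
    double s_SumMagnitude{0.0};
    double s_Min{std::numeric_limits<double>::max()};
    double s_Max{std::numeric_limits<double>::lowest()};
    std::size_t s_Count{0};

    void add(double value) {
        s_SumMagnitude += std::fabs(value); // magnitude feeds the mean
        s_Min = std::min(s_Min, value);     // assumption: signed minimum
        s_Max = std::max(s_Max, value);     // assumption: signed maximum
        ++s_Count;
    }

    double meanMagnitude() const {
        return s_Count == 0 ? 0.0 : s_SumMagnitude / static_cast<double>(s_Count);
    }
};

// Hypothetical sketch class: one vector of per-class statistics per feature index.
class CTotalImportanceSketch {
public:
    using TDoubleVec = std::vector<double>;
    using TClassStatsVec = std::vector<SClassStats>;

    // values holds one importance per class (a single entry for regression).
    void addToFeatureImportance(std::size_t i, const TDoubleVec& values) {
        assert(!values.empty());
        TClassStatsVec& stats = m_Stats[i]; // created on first use
        if (stats.size() < values.size()) {
            stats.resize(values.size());
        }
        for (std::size_t j = 0; j < values.size(); ++j) {
            stats[j].add(values[j]);
        }
    }

    // Throws std::out_of_range if feature i was never updated.
    const TClassStatsVec& stats(std::size_t i) const { return m_Stats.at(i); }

private:
    std::unordered_map<std::size_t, TClassStatsVec> m_Stats;
};
```

With this sketch, adding the importances 0.5, -1.5 and 1.0 for feature 0 gives a mean magnitude of 1.0, a minimum of -1.5 and a maximum of 1.0. Keying the map by feature index mirrors the `std::unordered_map<std::size_t, ...>` aliases in the header, so features can be updated in any order.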