Added in new MissingValueReplacing method. #5205

Merged
merged 8 commits into from
Jun 10, 2020

michaelgsharp
Member

Adds Mode as a new missing-value replacement method. It replaces missing values with the most frequent value in a column. If multiple values share the highest count, the first one encountered is returned.
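
To illustrate the tie-breaking rule described above, here is a minimal standalone sketch in plain C#. It is not the code from this PR and does not use the ML.NET API; the class and method names (`ModeReplacementSketch`, `FindMode`, `ReplaceMissingWithMode`) are hypothetical, it assumes missing values are represented as `float.NaN`, and it interprets "first one encountered" as the tied value that appears earliest in the column.

```csharp
using System;
using System.Collections.Generic;

public static class ModeReplacementSketch
{
    // Returns the most frequent non-missing value in the column.
    // Ties go to the value that appears earliest (assumed interpretation).
    // Returns NaN when every entry is missing.
    public static float FindMode(IReadOnlyList<float> column)
    {
        var counts = new Dictionary<float, int>();
        foreach (float v in column)
        {
            if (float.IsNaN(v))
                continue; // missing values are not counted
            counts.TryGetValue(v, out int c);
            counts[v] = c + 1;
        }

        float mode = float.NaN;
        int best = 0;
        // Second pass in original order: the strict '>' keeps the
        // earliest value among those sharing the highest count.
        foreach (float v in column)
        {
            if (float.IsNaN(v))
                continue;
            if (counts[v] > best)
            {
                best = counts[v];
                mode = v;
            }
        }
        return mode;
    }

    // Returns a copy of the column with every NaN replaced by the mode.
    public static float[] ReplaceMissingWithMode(IReadOnlyList<float> column)
    {
        float mode = FindMode(column);
        var result = new float[column.Count];
        for (int i = 0; i < column.Count; i++)
            result[i] = float.IsNaN(column[i]) ? mode : column[i];
        return result;
    }

    public static void Main()
    {
        // 3 and 7 both occur twice; 3 is encountered first, so it wins.
        float[] column = { 3f, float.NaN, 7f, 7f, 3f, float.NaN };
        float[] filled = ReplaceMissingWithMode(column);
        Console.WriteLine(string.Join(", ", filled)); // prints: 3, 3, 7, 7, 3, 3
    }
}
```

The two-pass approach keeps the tie-break explicit: counting happens first, and the ordered second pass decides which of the tied values is returned.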

This also moves a test helper method from OnnxConversionTest.cs into the BaseTestBaseline class so that every test class can use it.

@michaelgsharp michaelgsharp requested review from harishsk and a team June 3, 2020 22:27
@michaelgsharp michaelgsharp self-assigned this Jun 3, 2020
codecov bot commented Jun 4, 2020

Codecov Report

Merging #5205 into master will increase coverage by 0.49%.
The diff coverage is 96.12%.

@@            Coverage Diff             @@
##           master    #5205      +/-   ##
==========================================
+ Coverage   73.08%   73.57%   +0.49%     
==========================================
  Files        1004     1016      +12     
  Lines      187398   190214    +2816     
  Branches    20212    20456     +244     
==========================================
+ Hits       136952   139952    +3000     
+ Misses      44929    44687     -242     
- Partials     5517     5575      +58     
| Flag | Coverage Δ | |
| --- | --- | --- |
| #Debug | 73.57% <96.12%> (+0.49%) | ⬆️ |
| #production | 69.37% <91.73%> (+0.49%) | ⬆️ |
| #test | 87.53% <100.00%> (+0.30%) | ⬆️ |

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| ...c/Microsoft.ML.Transforms/MissingValueReplacing.cs | 77.53% <ø> (+0.17%) | ⬆️ |
| ...rosoft.ML.Transforms/MissingValueReplacingUtils.cs | 54.15% <91.73%> (+15.20%) | ⬆️ |
| ...est/Microsoft.ML.TestFramework/BaseTestBaseline.cs | 77.23% <100.00%> (+4.53%) | ⬆️ |
| test/Microsoft.ML.Tests/OnnxConversionTest.cs | 96.62% <100.00%> (-0.19%) | ⬇️ |
| .../Microsoft.ML.Tests/Transformers/NAReplaceTests.cs | 100.00% <100.00%> (ø) | |
| ....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs | 83.19% <0.00%> (-3.37%) | ⬇️ |
| src/Microsoft.ML.AutoML/Sweepers/Parameters.cs | 84.32% <0.00%> (-0.85%) | ⬇️ |
| ...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs | 40.00% <0.00%> (-0.68%) | ⬇️ |
| ...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs | 89.29% <0.00%> (-0.16%) | ⬇️ |
| ....ML.Tests/Transformers/CountTargetEncodingTests.cs | 100.00% <0.00%> (ø) | |

... and 39 more

Contributor

@harishsk harishsk left a comment

:shipit:

Contributor

@harishsk harishsk left a comment

🕐

@harishsk
Contributor

harishsk commented Jun 5, 2020

            Append(mlContext.Transforms.NormalizeMinMax("Features")).

Can you please add a separate ONNX test for ReplaceMissingValues that covers all the supported types of replacements?


Refers to: test/Microsoft.ML.Tests/OnnxConversionTest.cs:581 in 701d9d8.

Contributor

@harishsk harishsk left a comment

:shipit:

/// <summary>
/// Replace with the most frequent value of the column.
/// </summary>
Mode = 5
Member

Why did we skip 4 here? It went from 0, 1, 2, 3 and then jumped to 5.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
3 participants