Skip to content

Re-enabling MatrixFactorizationSimpleTrainAndPredict() #4846

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mstfbl
Copy link
Contributor

@mstfbl mstfbl commented Feb 14, 2020

Re-enabling disabled test MatrixFactorizationSimpleTrainAndPredict(). The test was failing due to tolerance value being too small for Linux builds.

@mstfbl mstfbl requested a review from a team as a code owner February 14, 2020 23:31
@mstfbl
Copy link
Contributor Author

mstfbl commented Feb 14, 2020

Currently running this test 1000 times to confirm the new tolerance values of 1e-4 for Linux builds.

@mstfbl
Copy link
Contributor Author

mstfbl commented Feb 15, 2020

Linux tests are passing with 1e-4. One build (Windows_x64_NetCoreApp30Release_build) is failing during clean up stage, which is not related to MatrixFactorizationSimpleTrainAndPredict().

double tolerance = Math.Pow(10, -7);
double windowsTolerance = Math.Pow(10, -7);
// 1e-7 is too small for Linux, so we try 1e-4
double linuxTolerance = Math.Pow(10, -4);
if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
{
// Linux case
var expectedUnixL2Error = 0.614457914950479; // Linux baseline
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

614457914950479 [](start = 44, length = 15)

I tried several run, below is my test result:

CenterOS:
Expected: 0.614457914950479 +- 1e-7
Passed: 0.6144579149504698
Failed: 0.6144409592507726

Ubuntu:
Expected: 0.614457914950479 +- 1e-7
Passed: 0.61445791495047
Failed: 0.614440959250773

builds:
https://dev.azure.com/dnceng/public/_build/results?buildId=521052&view=logs&j=887d69fe-3a72-535d-d1ef-c8cd8825499f&t=22ac88eb-9536-588d-9093-3c6614becdd0
https://dev.azure.com/dnceng/public/_build/results?buildId=521051&view=logs&j=5aa5c7df-492a-5eaf-973a-62b7b0f0ee3b&t=ffdbd604-f3e2-5332-cf61-c8dd00799b47
https://dev.azure.com/dnceng/public/_build/results?buildId=521050&view=logs&j=28859320-f5de-51e0-1fd2-7bea8c11cf7a
https://dev.azure.com/dnceng/public/_build/results?buildId=521049&view=logs&j=ac97a0ee-ee90-5a55-af17-a5a589fa545f

I'm not sure how the tolerance was set as before but seems we do have a certain failed value that like not caused by numbrical randomness, can you investigate further the root cause?

Copy link
Contributor

@frank-dong-ms-zz frank-dong-ms-zz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@mstfbl
Copy link
Contributor Author

mstfbl commented Feb 21, 2020

Closing this PR as I am activating this and three other tests with a fix in another PR.

@mstfbl mstfbl closed this Feb 21, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants