Use GetRandomFileName when creating random temp folder to avoid conflict #5229

Merged (5 commits) on Jun 11, 2020


frank-dong-ms-zz
Contributor

Addresses issue #5210.

We create a random temp folder during model load. When models are loaded from multiple threads, folder names sometimes conflict because the random number generator that produces the name is not explicitly seeded.

@frank-dong-ms-zz frank-dong-ms-zz requested a review from a team as a code owner June 11, 2020 04:14
Contributor

@mstfbl mstfbl left a comment

What is the difference between the default seed of Random() and Environment.TickCount?
I see that RandomUtils.Create() seeds itself from a new Random(), and the default seed of Random() "... is derived from the system clock, which has finite resolution" (source). Could Random() (and thereby RandomUtils.Create()) already be using Environment.TickCount as its default seed?

public static TauswortheHybrid Create()
{
    // Seed from a system random.
    return new TauswortheHybrid(new Random());
}

@frank-dong-ms-zz
Contributor Author

frank-dong-ms-zz commented Jun 11, 2020

I think the behavior of the default seed of Random is not guaranteed: it depends on .NET Framework versus .NET Core, and different versions may behave differently.
Then I realized that in most cases the default seed of the Random class is indeed the system time, so if a context switch is fast enough, two threads might get the same seed and therefore generate the same temp path. So I changed the seed to the hash of a new Guid, which should avoid this kind of conflict.
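
A minimal sketch of the collision described above, assuming a hypothetical helper `NameFromSeed` and a made-up `TLC_` prefix (neither is taken from the ML.NET source). It only illustrates that two `System.Random` instances created with the same seed produce the same first value, and therefore the same folder name:

```csharp
using System;

public static class SeedCollisionSketch
{
    // Hypothetical helper: derives a short folder name from a seeded Random.
    // The "TLC_" prefix is an assumption made for illustration only.
    public static string NameFromSeed(int seed)
    {
        var rnd = new Random(seed);
        // Next() is non-negative, so "X8" yields exactly 8 hex digits.
        return "TLC_" + rnd.Next().ToString("X8");
    }
}
```

Calling `SeedCollisionSketch.NameFromSeed(s)` twice with the same `s` returns the same string, which is the situation two threads end up in when they read the same clock-based seed. Passing `Guid.NewGuid().GetHashCode()` as the seed makes equal seeds much less likely, although a 32-bit hash can still repeat.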


Contributor

@mstfbl mstfbl left a comment

Sounds good, thanks Frank! :shipit:

@codecov

codecov bot commented Jun 11, 2020

Codecov Report

Merging #5229 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #5229   +/-   ##
=======================================
  Coverage   73.47%   73.47%           
=======================================
  Files        1010     1010           
  Lines      187988   187974   -14     
  Branches    20262    20261    -1     
=======================================
+ Hits       138118   138120    +2     
+ Misses      44385    44373   -12     
+ Partials     5485     5481    -4     
Flag          Coverage Δ
#Debug        73.47% <100.00%> (+<0.01%) ⬆️
#production   69.29% <100.00%> (+<0.01%) ⬆️
#test         87.38% <ø> (ø)

Impacted Files                                          Coverage Δ
src/Microsoft.ML.Core/Data/Repository.cs                79.77% <100.00%> (+0.05%) ⬆️
src/Microsoft.ML.Maml/MAML.cs                           24.75% <0.00%> (-1.46%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs   89.45% <0.00%> (+0.15%) ⬆️
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs          85.16% <0.00%> (+0.84%) ⬆️
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs   86.55% <0.00%> (+3.36%) ⬆️
...rosoft.ML.AutoML/ColumnInference/TextFileSample.cs   65.56% <0.00%> (+5.96%) ⬆️

@@ -118,7 +118,7 @@ internal Repository(bool needDir, IExceptionContext ectx)

         private static string GetShortTempDir()
         {
-            var rnd = RandomUtils.Create();
+            var rnd = RandomUtils.Create(Guid.NewGuid().GetHashCode());
             string path;
Contributor

Is there an associated bug? Are we seeing an issue here?

Contributor Author

Yes, this PR fixes the issue below:
#5210


Member

❔ Why not just use Path.GetRandomFileName()?

Member

@sharwell sharwell Jun 11, 2020

The entire implementation of GetShortTempDir() can be reduced to this:

var path = Path.Combine(Path.GetFullPath(Path.GetTempPath()), "ml_tests", Path.GetRandomFileName());
Directory.CreateDirectory(path);
return path;

The ml_tests subdirectory allows a user to easily identify and delete stray folders created by tests. You definitely don't want to place the folders directly in %TEMP% which is what happens today.

Contributor

Yes, please use Path.GetRandomFileName(). The original code used RandomUtils.Create, but we have seen it cause file name conflicts. We have started replacing those uses with Path.GetRandomFileName().


Contributor Author

No, I don't have a repro. The user who reported this issue doesn't have a repro in his local environment either; he gets the error in their production environment but is not able to provide a repro.


Contributor

I see that Path.GetRandomFileName() is indeed better suited to generating file and directory names, so I'm in favor of using it too. It does seem to be limited to generating just 11 characters, though this will probably be more than enough.

It also seems interesting that Path.GetTempPath() was added here, but Path.GetRandomFileName() wasn't. I wonder if there's a reason why Path.GetRandomFileName() wasn't used in the first place.
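
For reference, a small sketch to check the "11 characters" observation, assuming a hypothetical helper `CountNonDotChars` (not part of any library). On the .NET implementations I am aware of, `Path.GetRandomFileName()` returns an 8.3-style name, so the string is 12 characters long including the dot:

```csharp
using System;
using System.IO;

public static class RandomNameSketch
{
    // Hypothetical helper: counts the characters in a name, ignoring '.'.
    public static int CountNonDotChars(string name)
    {
        int count = 0;
        foreach (char c in name)
        {
            if (c != '.')
            {
                count++;
            }
        }
        return count;
    }
}
```

Passing the result of `Path.GetRandomFileName()` to `RandomNameSketch.CountNonDotChars` should give 11 under that assumption.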

Thanks @sharwell and @harishsk for the heads up.

Contributor Author

Thanks, that makes sense.


Contributor Author

Yeah, that is a great idea. One small modification: I want to change "ml_tests" to "ml_dotnet", since this is not for test usage.
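
A sketch of what the suggestion might look like with the renamed subdirectory. The wrapper class `TempDirSketch` is hypothetical, and this is not necessarily the code that was merged:

```csharp
using System.IO;

public static class TempDirSketch
{
    // Creates a fresh directory under <temp>/ml_dotnet and returns its path.
    public static string GetShortTempDir()
    {
        var path = Path.Combine(
            Path.GetFullPath(Path.GetTempPath()),
            "ml_dotnet",
            Path.GetRandomFileName());
        // CreateDirectory also creates the ml_dotnet parent if it is missing.
        Directory.CreateDirectory(path);
        return path;
    }
}
```

Note that `Directory.CreateDirectory` does not throw when the directory already exists, so this sketch relies on `Path.GetRandomFileName()` producing distinct names rather than detecting a collision.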


Contributor Author

@frank-dong-ms-zz frank-dong-ms-zz Jun 11, 2020

I checked the implementation again: Path.GetRandomFileName() was already added at line 239 by Jon Wood on 1/31/2020 in the PR below:
#4645

So consider this PR code cleanup.


Contributor

@harishsk harishsk left a comment

:shipit:

@frank-dong-ms-zz frank-dong-ms-zz merged commit 95b58cc into dotnet:master Jun 11, 2020
@harishsk harishsk changed the title add random seed when create random temp folder to avoid conflict Use GetRandomFileName when creating random temp folder to avoid conflict Jul 10, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
5 participants