-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Channels await fix #5313
Channels await fix #5313
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5313 +/- ##
===========================================
+ Coverage 73.80% 87.68% +13.87%
===========================================
Files 1022 244 -778
Lines 190617 44759 -145858
Branches 20488 1900 -18588
===========================================
- Hits 140686 39247 -101439
+ Misses 44411 5167 -39244
+ Partials 5520 345 -5175
Flags with carried forward coverage won't be shown. Click here to find out more. |
Marked as ready for review, though still looking at how to add a test for this. |
while (_toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter().GetResult()) | ||
var waitToReadAwaiter = _toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter(); | ||
|
||
while (waitToReadAwaiter.IsCompleted && waitToReadAwaiter.GetResult()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here stating that the check to verify the operation is complete is necessary before calling GetResult()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dear sir, I have same error with :
var option = new SdcaMaximumEntropyMulticlassTrainer.Options()
{
FeatureColumnName = "SearchModel",
LabelColumnName = "Label",
NumberOfThreads = 1
};
var maximumEntropy = ctx.MulticlassClassification.Trainers.SdcaMaximumEntropy(option);
var pipeline =ctx.Transforms.Conversion.MapValueToKey("Label", "BestDictMatchModelText")
.Append(ctx.Transforms.Text.FeaturizeText(inputColumnName: "VEHICLE_MODEL",
outputColumnName: "SearchModel"))
.Append(maximumEntropy)
.Append(ctx.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
var part1 = decoratedWithFuzzyList;//.Take(1000);
var dataReader = ctx.Data.LoadFromEnumerable(part1);
ThisModel = pipeline.Fit(dataReader);
So, when part1 contains more than a 1000 items, Fit throws:
at System.Threading.Channels.AsyncOperation.ThrowIncompleteOperationException()
at System.Threading.Channels.AsyncOperation1.GetResult(Int16 token) at Microsoft.ML.Transforms.RowShufflingTransformer.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.SdcaTrainerBase
3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount)
at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase
2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.TrainerEstimatorBase2.Fit(IDataView input) at Microsoft.ML.Data.EstimatorChain
1.Fit(IDataView input)
Can you please consider to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have this same issue with version 1.5.0. I can fit 1046 rows in the shuffled set. 1047 rows will give the The asynchronous operation has not completed
exception.
Tested with the daily 1.5.2 package, and now the error is gone.
It sounds like the samples was able to repro the issue: dotnet/machinelearning-samples#833 |
while (_toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter().GetResult()) | ||
var waitToReadAwaiter = _toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter(); | ||
|
||
while (waitToReadAwaiter.IsCompleted && waitToReadAwaiter.GetResult()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is undo-ing what we were trying to avoid with #5123 (comment), which is just spinning a thread seeing if there is something to read.
I was mistaken that you could block synchronously on a Channel. Sorry about that.
What do you think about going back to what you had previously with just calling TryRead
, and if that returns false
, the only thing I can think of is to Thread.Sleep
for some amount of time, even 1 ms
, just to stop the thread from consuming a whole core spinning waiting for the producer. It would probably be a good idea to try it multiple ways and see what perf results you get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eerhardt! Made an update for this. Just let me know if it's still not correct.
Since this has broken projects in the past, it would be really good to get a test for this to make sure it doesn't break in the future. |
@eerhardt Added a small scenario based on the taxi fare sample. Verified in a separate solution that it throws the error and running the test in this branch runs successfully. Definitely let me know if any changes are needed. Thanks for the help with this! |
@@ -185,6 +185,49 @@ public static string DownloadTensorFlowSentimentModel() | |||
return path; | |||
} | |||
|
|||
public static string DownloadTaxiFareData() | |||
{ | |||
string githubPath = "https://raw.githubusercontent.com/dotnet/machinelearning-samples/master/samples/csharp/getting-started/Regression_TaxiFarePrediction/TaxiFarePrediction/Data/taxi-fare-train.csv"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have this file in the repo. See
var trainDataPath = GetDataPath("taxi-fare-train.csv"); |
machinelearning/test/Microsoft.ML.Tests/FeatureContributionTests.cs
Lines 388 to 417 in 37edde9
public class TaxiTrip | |
{ | |
[LoadColumn(0)] | |
public string VendorId; | |
[LoadColumn(1)] | |
public float RateCode; | |
[LoadColumn(2)] | |
public float PassengerCount; | |
[LoadColumn(3)] | |
public float TripTime; | |
[LoadColumn(4)] | |
public float TripDistance; | |
[LoadColumn(5)] | |
public string PaymentType; | |
[LoadColumn(6)] | |
public float FareAmount; | |
} | |
/// <summary> | |
/// Returns a DataView with a Features column which include HotEncodedData | |
/// </summary> | |
private IDataView GetOneHotEncodedData(int numberOfInstances = 100) | |
{ | |
var trainDataPath = GetDataPath("taxi-fare-train.csv"); |
We shouldn't need to download the file again.
@@ -0,0 +1,40 @@ | |||
using Xunit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All code files should have a copyright at the top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for fixing this @jwood803.
Thanks again for the help @eerhardt! Sorry for breaking it, but glad I can help with the fix. |
Fix for #5312
Setting as a draft to add a test.