AutoML Add Recommendation Task #4246
Merged: maryamariyan merged 33 commits into dotnet:master from LittleLittleCloud:u/xiaoyun/recommendation on Oct 17, 2019
Commits:
6f3d26c [AutoML] Pull out Code Gen as separate library plus some changes in C… (LittleLittleCloud)
7e0f6d0 pack codegen into mlnet (LittleLittleCloud)
22edabb pack codegen into mlnet (#4179) (LittleLittleCloud)
50e0dcd Merge branch 'features/automl' of https://github.com/dotnet/machinele… (LittleLittleCloud)
09c56f7 add MatrixFactorization Trainer (LittleLittleCloud)
15c58f1 add RecommendationExperiment and other functions (LittleLittleCloud)
ac57d9a some refactor in MatrixFactorization, plus fix small bugs (LittleLittleCloud)
c07948f add LabelFeautre ColumnPurpose and some update (LittleLittleCloud)
f182a20 Merge branch 'u/xiaoyun/recommendation' (LittleLittleCloud)
9695ffe add missing Native dll (LittleLittleCloud)
b54de14 remove mlnet project (LittleLittleCloud)
913b4af update based on comment (LittleLittleCloud)
3fc520c update example (LittleLittleCloud)
2f47c02 Merge branch 'master' into u/xiaoyun/recommendation (maryamariyan)
c78efbf nit: code style (maryamariyan)
5864b78 - Rename RecommendationExperimentScenario.MF to RecommendationExperim… (maryamariyan)
4010d90 nit: code style/ add space between if and ( (maryamariyan)
fef926e Fix compile error (maryamariyan)
9c4852c minor fixes (maryamariyan)
74cbc5c First stage changes (maryamariyan)
7e7c272 change signature for ITrainerEstimator (maryamariyan)
17500cf Adding tests, checking code coverage (maryamariyan)
b882ee1 cleanup + improve SweepParams, taken from MatrixFactorizationTrainer (maryamariyan)
d7a272d Address PR feedback - part1 (maryamariyan)
b69d9c3 Apply PR feedbacks - Part 2 (maryamariyan)
f9c6abb Update test to reflect change made to sweep params (maryamariyan)
7d856c8 Apply PR feedbacks: Part 3 (maryamariyan)
7852c5e Adds more sweepable params and test (maryamariyan)
f889fa5 Rename to UserId/ItemId (maryamariyan)
2ec0649 Rename User/Item ID: part 2 (maryamariyan)
c39ae94 - Removing SamplingKey for first iteration (maryamariyan)
7186280 Apply review comments (maryamariyan)
d3d6b4a Minor rename (maryamariyan)
20 changes: 20 additions & 0 deletions
docs/samples/Microsoft.ML.AutoML.Samples/DataStructures/Movie.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
using Microsoft.ML.Data; | ||
|
||
namespace Microsoft.ML.AutoML.Samples.DataStructures | ||
{ | ||
public class Movie | ||
{ | ||
[LoadColumn(0)] | ||
public string UserId; | ||
|
||
[LoadColumn(1)] | ||
public string MovieId; | ||
|
||
[LoadColumn(2)] | ||
public float Rating; | ||
} | ||
} |
14 changes: 14 additions & 0 deletions
docs/samples/Microsoft.ML.AutoML.Samples/DataStructures/MovieRatingPrediction.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
using Microsoft.ML.Data; | ||
|
||
namespace Microsoft.ML.AutoML.Samples | ||
{ | ||
public class MovieRatingPrediction | ||
{ | ||
[ColumnName("Score")] | ||
public float Rating; | ||
} | ||
} |
92 changes: 92 additions & 0 deletions
docs/samples/Microsoft.ML.AutoML.Samples/RecommendationExperiment.cs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
using System; | ||
using System.IO; | ||
using System.Linq; | ||
using Microsoft.ML.AutoML.Samples.DataStructures; | ||
using Microsoft.ML.Data; | ||
|
||
namespace Microsoft.ML.AutoML.Samples | ||
{ | ||
public static class RecommendationExperiment | ||
{ | ||
private static string TrainDataPath = "<Path to your train dataset goes here>"; | ||
private static string TestDataPath = "<Path to your test dataset goes here>"; | ||
private static string ModelPath = @"<Desired model output directory goes here>\Model.zip"; | ||
private static string LabelColumnName = "Rating"; | ||
private static string UserColumnName = "UserId"; | ||
private static string ItemColumnName = "MovieId"; | ||
private static uint ExperimentTime = 60; | ||
|
||
public static void Run() | ||
{ | ||
MLContext mlContext = new MLContext(); | ||
|
||
// STEP 1: Load data | ||
IDataView trainDataView = mlContext.Data.LoadFromTextFile<Movie>(TrainDataPath, hasHeader: true, separatorChar: ','); | ||
IDataView testDataView = mlContext.Data.LoadFromTextFile<Movie>(TestDataPath, hasHeader: true, separatorChar: ','); | ||
|
||
// STEP 2: Run AutoML experiment | ||
Console.WriteLine($"Running AutoML recommendation experiment for {ExperimentTime} seconds..."); | ||
ExperimentResult<RegressionMetrics> experimentResult = mlContext.Auto() | ||
.CreateRecommendationExperiment(new RecommendationExperimentSettings() { MaxExperimentTimeInSeconds = ExperimentTime }) | ||
.Execute(trainDataView, testDataView, | ||
new ColumnInformation() | ||
{ | ||
LabelColumnName = LabelColumnName, | ||
UserIdColumnName = UserColumnName, | ||
ItemIdColumnName = ItemColumnName | ||
}); | ||
|
||
// STEP 3: Print metric from best model | ||
RunDetail<RegressionMetrics> bestRun = experimentResult.BestRun; | ||
Console.WriteLine($"Total models produced: {experimentResult.RunDetails.Count()}"); | ||
Console.WriteLine($"Best model's trainer: {bestRun.TrainerName}"); | ||
Console.WriteLine("Metrics of best model from validation data --"); | ||
PrintMetrics(bestRun.ValidationMetrics); | ||
|
||
// STEP 4: Evaluate test data | ||
IDataView testDataViewWithBestScore = bestRun.Model.Transform(testDataView); | ||
RegressionMetrics testMetrics = mlContext.Recommendation().Evaluate(testDataViewWithBestScore, labelColumnName: LabelColumnName); | ||
Console.WriteLine("Metrics of best model on test data --"); | ||
PrintMetrics(testMetrics); | ||
|
||
// STEP 5: Save the best model for later deployment and inferencing | ||
mlContext.Model.Save(bestRun.Model, trainDataView.Schema, ModelPath); | ||
|
||
// STEP 6: Create prediction engine from the best trained model | ||
var predictionEngine = mlContext.Model.CreatePredictionEngine<Movie, MovieRatingPrediction>(bestRun.Model); | ||
|
||
// STEP 7: Initialize a new test, and get the prediction | ||
var testMovie = new Movie | ||
{ | ||
UserId = "1", | ||
MovieId = "1097", | ||
}; | ||
var prediction = predictionEngine.Predict(testMovie); | ||
Console.WriteLine($"Predicted rating for user {testMovie.UserId}, movie {testMovie.MovieId}: {prediction.Rating}"); | ||
|
||
// The model can only score users seen during training; an unknown user is expected to yield NaN | ||
testMovie = new Movie | ||
{ | ||
UserId = "612", // new user | ||
MovieId = "2940" | ||
}; | ||
prediction = predictionEngine.Predict(testMovie); | ||
Console.WriteLine($"Expected Rating NaN for unknown user, Predicted: {prediction.Rating}"); | ||
|
||
Console.WriteLine("Press any key to continue..."); | ||
Console.ReadKey(); | ||
} | ||
|
||
private static void PrintMetrics(RegressionMetrics metrics) | ||
{ | ||
Console.WriteLine($"MeanAbsoluteError: {metrics.MeanAbsoluteError}"); | ||
Console.WriteLine($"MeanSquaredError: {metrics.MeanSquaredError}"); | ||
Console.WriteLine($"RootMeanSquaredError: {metrics.RootMeanSquaredError}"); | ||
Console.WriteLine($"RSquared: {metrics.RSquared}"); | ||
} | ||
} | ||
} |
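The sample above expects a NaN rating when the user was not part of the training data. A minimal sketch of how calling code might guard against that case follows. It has no ML.NET dependency; `RatingGuard`, `ResolveRating`, and the fallback value are hypothetical names introduced here for illustration and are not part of this PR.

```csharp
using System;

public static class RatingGuard
{
    // Returns the predicted rating, or fallbackRating when the score is NaN
    // (the sample expects NaN for a user the model has never seen).
    public static float ResolveRating(float predictedRating, float fallbackRating)
    {
        return float.IsNaN(predictedRating) ? fallbackRating : predictedRating;
    }

    public static void Main()
    {
        // Known user: the prediction is passed through unchanged.
        Console.WriteLine(ResolveRating(4.2f, 3.0f));

        // Unknown user: the NaN score is replaced by the assumed fallback.
        Console.WriteLine(ResolveRating(float.NaN, 3.0f));
    }
}
```

Whether a fallback (for example a global mean rating) or an explicit "no prediction" result is appropriate depends on the application; the sketch only shows where the check would go.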
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
using System; | ||
using System.Collections.Generic; | ||
using System.Linq; | ||
using Microsoft.ML.Data; | ||
|
||
namespace Microsoft.ML.AutoML | ||
{ | ||
/// <summary> | ||
/// Settings for AutoML experiments on recommendation datasets. | ||
/// </summary> | ||
public sealed class RecommendationExperimentSettings : ExperimentSettings | ||
{ | ||
/// <summary> | ||
/// Metric that AutoML will try to optimize over the course of the experiment. | ||
/// </summary> | ||
/// <value>The default value is <see cref="RegressionMetric.RSquared"/>.</value> | ||
public RegressionMetric OptimizingMetric { get; set; } | ||
|
||
/// <summary> | ||
/// Collection of trainers the AutoML experiment can leverage. | ||
/// </summary> | ||
/// <value>The default value is a collection auto-populated with all possible trainers (all values of <see cref="RecommendationTrainer" />).</value> | ||
public ICollection<RecommendationTrainer> Trainers { get; } | ||
|
||
/// <summary> | ||
/// Initializes a new instance of <see cref="RecommendationExperimentSettings"/>. | ||
/// </summary> | ||
public RecommendationExperimentSettings() | ||
{ | ||
OptimizingMetric = RegressionMetric.RSquared; | ||
Trainers = Enum.GetValues(typeof(RecommendationTrainer)).OfType<RecommendationTrainer>().ToList(); | ||
} | ||
} | ||
|
||
/// <summary> | ||
/// Enumeration of ML.NET recommendation trainers used by AutoML. | ||
/// </summary> | ||
public enum RecommendationTrainer | ||
{ | ||
MatrixFactorization | ||
} | ||
|
||
/// <summary> | ||
/// AutoML experiment on recommendation datasets. | ||
/// </summary> | ||
/// <example> | ||
/// <format type="text/markdown"> | ||
/// <![CDATA[ | ||
/// ]]></format> | ||
/// </example> | ||
public sealed class RecommendationExperiment : ExperimentBase<RegressionMetrics, RecommendationExperimentSettings> | ||
{ | ||
internal RecommendationExperiment(MLContext context, RecommendationExperimentSettings settings) | ||
: base(context, | ||
new RegressionMetricsAgent(context, settings.OptimizingMetric), | ||
new OptimizingMetricInfo(settings.OptimizingMetric), | ||
settings, | ||
TaskKind.Recommendation, | ||
TrainerExtensionUtil.GetTrainerNames(settings.Trainers)) | ||
{ | ||
} | ||
|
||
private protected override CrossValidationRunDetail<RegressionMetrics> GetBestCrossValRun(IEnumerable<CrossValidationRunDetail<RegressionMetrics>> results) | ||
{ | ||
return BestResultUtil.GetBestRun(results, MetricsAgent, OptimizingMetricInfo.IsMaximizing); | ||
} | ||
|
||
private protected override RunDetail<RegressionMetrics> GetBestRun(IEnumerable<RunDetail<RegressionMetrics>> results) | ||
{ | ||
return BestResultUtil.GetBestRun(results, MetricsAgent, OptimizingMetricInfo.IsMaximizing); | ||
} | ||
} | ||
} |
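Both `GetBestRun` overrides above pass `OptimizingMetricInfo.IsMaximizing` to `BestResultUtil.GetBestRun`, so the direction of the comparison depends on the chosen metric. The following is a rough, self-contained sketch of that idea only, not the library's implementation: `ScoredRun` and `BestRunSketch.PickBest` are hypothetical, and skipping NaN scores is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a run and its score on the optimizing metric.
public sealed class ScoredRun
{
    public string TrainerName { get; }
    public double Score { get; }

    public ScoredRun(string trainerName, double score)
    {
        TrainerName = trainerName;
        Score = score;
    }
}

public static class BestRunSketch
{
    // Picks the highest score when isMaximizing is true (e.g. RSquared)
    // and the lowest otherwise (e.g. RootMeanSquaredError).
    // Runs with a NaN score are ignored; returns null if none remain.
    public static ScoredRun PickBest(IEnumerable<ScoredRun> runs, bool isMaximizing)
    {
        var valid = runs.Where(r => !double.IsNaN(r.Score)).ToList();
        if (valid.Count == 0)
        {
            return null;
        }

        return isMaximizing
            ? valid.OrderByDescending(r => r.Score).First()
            : valid.OrderBy(r => r.Score).First();
    }

    public static void Main()
    {
        var runs = new List<ScoredRun>
        {
            new ScoredRun("run-a", 0.61),
            new ScoredRun("run-b", 0.74),
            new ScoredRun("run-c", double.NaN),
        };

        // Treat the scores as RSquared values (higher is better).
        Console.WriteLine(PickBest(runs, isMaximizing: true).TrainerName);

        // Treat the same scores as error values (lower is better).
        Console.WriteLine(PickBest(runs, isMaximizing: false).TrainerName);
    }
}
```

With the default `OptimizingMetric` of `RegressionMetric.RSquared`, the maximizing branch is the one that applies; an error metric such as `RootMeanSquaredError` would take the minimizing branch.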
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,5 +9,6 @@ internal enum TaskKind | |
BinaryClassification, | ||
MulticlassClassification, | ||
Regression, | ||
Recommendation | ||
} | ||
} |