AutoML Add Recommendation Task #4246

Merged
merged 33 commits into from
Oct 17, 2019

Contributor

@LittleLittleCloud LittleLittleCloud commented Sep 25, 2019

What's already done in this PR

  • Added a Recommendation task and experiment to AutoML.
  • Added MatrixFactorization as MatrixFactorizationExtension.
  • Added a new column purpose (LabelFeature) and its corresponding TransformerExtension (LabelCategorical) so that AutoML can construct the preprocessing pipeline for MatrixFactorizationExtension correctly.
  • Added a new recommendation example (ratings only) to AutoML.Example that you can play with.

What still needs to be done (feel free to CRUD)

  • Figure out how to speed up the training process and present its progress properly. MatrixFactorization seems to need more time per training round, and the parameter-sweeping algorithm has to train many rounds to find the best parameters. This is time-consuming, and customers might not like that.
  • The corresponding CodeGen part
  • Test cases!
  • Better naming and code style
  • Enable support for multiple feature trainers in AutoML (this requires some refactoring and shouldn't be done in this PR, but it's important)

LittleLittleCloud and others added 10 commits August 25, 2019 13:07
…odeGen (dotnet#4043)

* add CodeGen Library

* rename namespace to ML.CodeGen.*

* cancel delay sign in CodeGen

* update based on comment

* remove useless nuget package

* add ConsumeModel class

* use consumeModel in CodeGen

* use different annotation for different target

* target to netstandard2.0

* remove console output

* adjust output result

* remove useless variable

* update CodeGen name to CodeGenerator

* rebase to latest branch

* fix bug in Normalize function

* fix test

* rename features to featuresList

* move enum out of class

* remove useless items

* remove NLog.config

* update generated CSharp file

* change wording and delete useless file

* using Uppercase in comment
@eerhardt
Member

using System;

All product code needs a copyright header.


Refers to: src/Microsoft.ML.AutoML/API/RecommendationExperiment.cs:1 in 9695ffe.
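
For reference, a sketch of what the requested header usually looks like at the top of a source file in .NET Foundation repositories. The exact wording should be copied from an existing file in the repository rather than from this sketch; the `using` line is the one quoted in the comment above.

```csharp
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using System;
```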

@eerhardt
Member

eerhardt commented Sep 25, 2019

mlnet.csproj cannot go into master yet.


Refers to: src/mlnet/mlnet.csproj:1 in 9695ffe.

@@ -13,6 +13,7 @@ internal enum ColumnPurpose
 TextFeature = 4,
 Weight = 5,
 ImagePath = 6,
-SamplingKey = 7
+SamplingKey = 7,
+LabelFeature, // CategoricalFeature that requires ValueToKey converter, better naming?
Contributor

I think we'd want to use HashToKey (the name may be off) instead of the mentioned ValueToKey, because ValueToKey will map unseen values in your test dataset to NA. A lesser issue is that it is slow, since it takes a full pass over the dataset.
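
The trade-off described in this comment can be illustrated with a small, library-free sketch. It is not ML.NET code: `BuildDictionary`, `LookupKey`, and `HashKey` are hypothetical helpers, and the FNV-1a-style hash is a stand-in chosen for brevity, not the hash ML.NET uses. The sketch assumes key 0 means "missing" and valid keys start at 1.

```csharp
using System;
using System.Collections.Generic;

public static class KeyMappingSketch
{
    // Dictionary-style mapping (the ValueToKey idea): needs a full pass over
    // the training values to build the table before anything can be mapped.
    public static Dictionary<string, uint> BuildDictionary(IEnumerable<string> trainingValues)
    {
        var map = new Dictionary<string, uint>();
        foreach (var value in trainingValues)
        {
            if (!map.ContainsKey(value))
                map[value] = (uint)map.Count + 1; // keys start at 1; 0 is reserved for "missing"
        }
        return map;
    }

    // Values never seen during training fall back to 0 ("missing").
    public static uint LookupKey(Dictionary<string, uint> map, string value)
        => map.TryGetValue(value, out var key) ? key : 0u;

    // Hash-style mapping: no training pass, and every value (seen or unseen)
    // lands in one of 2^numberOfBits buckets. Collisions are possible.
    public static uint HashKey(string value, int numberOfBits)
    {
        if (numberOfBits < 1 || numberOfBits > 31)
            throw new ArgumentOutOfRangeException(nameof(numberOfBits));

        uint hash = 2166136261; // FNV-1a offset basis (illustrative hash only)
        unchecked
        {
            foreach (char c in value)
            {
                hash ^= c;
                hash *= 16777619; // FNV prime; wraps around on overflow
            }
        }
        uint mask = (1u << numberOfBits) - 1;
        return (hash & mask) + 1; // shift into the 1-based key range
    }

    public static void Main()
    {
        var map = BuildDictionary(new[] { "alice", "bob", "alice" });

        Console.WriteLine(LookupKey(map, "bob"));   // seen in training: key 2
        Console.WriteLine(LookupKey(map, "carol")); // unseen: 0 ("missing")

        // The hashed key is always a valid (non-zero) key, even for unseen values.
        Console.WriteLine(HashKey("carol", 10) > 0);
    }
}
```

The dictionary approach gives collision-free keys but loses unseen users or items at prediction time; the hashing approach keeps them at the cost of possible collisions between distinct values.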

Member

@eerhardt eerhardt left a comment

This is looking really good, @maryamariyan. Thanks for the great work!

LGTM

@maryamariyan maryamariyan merged commit ee8418a into dotnet:master Oct 17, 2019
maryamariyan pushed a commit to maryamariyan/machinelearning that referenced this pull request Oct 21, 2019
Trains Recommendation models able to predict rating for existing users
 Conflicts:
	pkg/Microsoft.ML.AutoML/Microsoft.ML.AutoML.nupkgproj
	src/Microsoft.ML.AutoML/Microsoft.ML.AutoML.csproj
	test/Microsoft.ML.AutoML.Tests/AutoFitTests.cs
	test/Microsoft.ML.AutoML.Tests/ColumnInferenceTests.cs
	test/Microsoft.ML.AutoML.Tests/ColumnInformationUtilTests.cs
	test/Microsoft.ML.AutoML.Tests/Microsoft.ML.AutoML.Tests.csproj
	test/Microsoft.ML.AutoML.Tests/TrainerExtensionsTests.cs
	test/Microsoft.ML.AutoML.Tests/TransformInferenceTests.cs
	test/Microsoft.ML.AutoML.Tests/UserInputValidationTests.cs
frank-dong-ms-zz pushed a commit to frank-dong-ms-zz/machinelearning that referenced this pull request Nov 4, 2019
Trains Recommendation models able to predict rating for existing users
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
7 participants