Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding the initial prototype of a DatabaseLoader #4035

Merged
merged 12 commits into from
Jul 31, 2019
Merged

Adding the initial prototype of a DatabaseLoader #4035

merged 12 commits into from
Jul 31, 2019

Conversation

tannergooding
Copy link
Member

This adds the initial barebones prototype for DatabaseLoader to the Microsot.ML.Experimental project.

@tannergooding
Copy link
Member Author

CC. @eerhardt

using System.Runtime.CompilerServices;
using Microsoft.ML;

[assembly: InternalsVisibleTo(assemblyName: "Microsoft.ML.Tests" + PublicKey.TestValue)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this, we should change the test to use the public APIs. Basically new DatabaseLoader => mlContext.Data.CreateDatabaseLoader

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this?


In reply to: 308308500 [](ancestors = 308308500)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the only comment I have left on this PR. If this is resolved, and all the builds pass, I think we will be ready to merge this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. (sorry forgot to update the comment)

This is still needed for converting from the DbType to the InternalKind to the public DataKind.

{
var mlContext = new MLContext(seed: 1);

var connectionString = @"Server=(localdb)\mssqllocaldb;Database=EFGetStarted.AspNetCore.NewDb;Trusted_Connection=True;ConnectRetryCount=0";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will either need to disable this test, or change it so it doesn't rely on machine state/configuration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a MockDbConnection, MockDbCommand, and MockDbDataReader class which wrap a TextLoader. I would guess this needs some cleanup to support more testing scenarios and to match w/e conventions ML.NET is currently using.

"SELECT SepalLength, SepalWidth, PetalLength, PetalWidth, Label FROM IrisData",
connection))
{
DatabaseLoader loader = new DatabaseLoader(mlContext, new DatabaseLoader.Options()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to that way using a Columns array, when possible, can we have another test (and implementation if not ready yet) that is using a convenient data-model class?

}
});

IDataView trainingData = loader.Load(() => command.ExecuteReader());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have this foundational API we also need to add it to the MLContext catalog, so having convenient methods such as:

mlContext.Data.LoadFromDbDataReader<ModelInputData>(() => command.ExecuteReader());

or

IDataView trainingDataView = mlContext.Data.LoadFromDbTable<ModelInputData, SqlConnection>(connString: myConnString, 
                                                                                                 tableName: "TrainingDataTable"); 

@CESARDELATORRE
Copy link
Contributor

I put a couple of comments but those asks don't need to be implemented before merging this first version. It could come later. :)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's get this in and iterate on it.

@eerhardt eerhardt merged commit 9c2de1e into dotnet:master Jul 31, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants