Adding the initial prototype of a DatabaseLoader#4035
Adding the initial prototype of a DatabaseLoader#4035eerhardt merged 12 commits intodotnet:masterfrom tannergooding:DatabaseLoader
Conversation
Infer source index if not specified. Add initial public API. Fix bug in GetIdGetter
|
CC. @eerhardt |
| using System.Runtime.CompilerServices; | ||
| using Microsoft.ML; | ||
|
|
||
| [assembly: InternalsVisibleTo(assemblyName: "Microsoft.ML.Tests" + PublicKey.TestValue)] |
There was a problem hiding this comment.
Instead of doing this, we should change the test to use the public APIs. Basically new DatabaseLoader => mlContext.Data.CreateDatabaseLoader
There was a problem hiding this comment.
There was a problem hiding this comment.
I think this is the only comment I have left on this PR. If this is resolved, and all the builds pass, I think we will be ready to merge this.
There was a problem hiding this comment.
Yes. (sorry forgot to update the comment)
This is still needed for converting from the DbType to the InternalKind to the public DataKind.
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoaderCursor.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
We will either need to disable this test, or change it so it doesn't rely on machine state/configuration.
There was a problem hiding this comment.
I created a MockDbConnection, MockDbCommand, and MockDbDataReader class which wrap a TextLoader. I would guess this needs some cleanup to support more testing scenarios and to match w/e conventions ML.NET is currently using.
There was a problem hiding this comment.
In addition to that way using a Columns array, when possible, can we have another test (and implementation if not ready yet) that is using a convenient data-model class?
There was a problem hiding this comment.
Once we have this foundational API we also need to add it to the MLContext catalog, so having convenient methods such as:
mlContext.Data.LoadFromDbDataReader<ModelInputData>(() => command.ExecuteReader());
or
IDataView trainingDataView = mlContext.Data.LoadFromDbTable<ModelInputData, SqlConnection>(connString: myConnString,
tableName: "TrainingDataTable");
|
I put a couple of comments but those asks don't need to be implemented before merging this first version. It could come later. :) |
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoaderCursor.cs
Show resolved
Hide resolved
eerhardt
left a comment
There was a problem hiding this comment.
LGTM. Let's get this in and iterate on it.
This adds the initial barebones prototype for DatabaseLoader to the Microsot.ML.Experimental project.