Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding basic support for handling vectors in the database loader. #4138

Merged
merged 15 commits into from
Aug 30, 2019
Merged

Adding basic support for handling vectors in the database loader. #4138

merged 15 commits into from
Aug 30, 2019

Conversation

tannergooding
Copy link
Member

No description provided.

@tannergooding tannergooding marked this pull request as ready for review August 29, 2019 16:28
@tannergooding tannergooding requested a review from a team as a code owner August 29, 2019 16:28
@tannergooding
Copy link
Member Author

This should be ready for review now.


var mockProviderFactory = new MockProviderFactory(mlContext, loader);
var databaseSource = new DatabaseSource(mockProviderFactory, connectionString, commandText);
var providerFactory = DbProviderFactories.GetFactory("System.Data.SqlClient");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use SqlClientFactory.Instance instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just using DbProviderFactories as that seemed to be the normal way to get registered factories and more closely matched what users wanted.

get => throw new NotImplementedException();
set => throw new NotImplementedException();
var databaseFile = GetTestDatabasePath(databaseName);
return $@"Data Source=(LocalDB)\MSSQLLocalDB;AttachDbFilename={databaseFile};Database={databaseName};Integrated Security=True";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work on non-Windows machines?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't test, but I believe so.

Linux supports SQL as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your CI is failing with

System.PlatformNotSupportedException : LocalDB is not supported on this platform.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to disable for Linux for the time being and can log an issue for adding a mysql database as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also failing for MacOS. It would be unfortunate to not have any unix tests for the DatabaseLoader....


In reply to: 319267186 [](ancestors = 319267186)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I'm not sure I'll get a MySQL database together before this needs to be in for the next preview.

When is that cutoff?

ctx.Writer.WriteBoolByte(info.SourceIndex.HasValue);
if (info.SourceIndex.HasValue)
ctx.Writer.Write(info.SourceIndex.GetValueOrDefault());
ctx.Writer.Write(info.Segments.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

info.Segments can be null, right? (According to ColInfo's constructor, it can be. It has an AssertValueOrNull line in it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as TextLoader, so if there is a bug here, there is a bug there as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a bug there as well.

That is not true.

private ColInfo(string name, DataViewType colType, Segment[] segs, int isegVar, int sizeBase)
{
Contracts.AssertNonEmpty(name);
Contracts.AssertNonEmpty(segs);

TextLoader guarantees that Segment[] segs is a non empty array. DatabaseLoader allows it to be null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to be ctx.Writer.Write((info.Segments?.Length).GetValueOrDefault());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the very next line going to NullRef?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have a test that saves and loads a DatabaseLoader into a model.

@@ -67,6 +67,12 @@ public static Type ToType(this DbType dbType)
}
}

/// <summary>Maps a <see cref="DataKind"/> to the associated <see cref="DbType"/>.</summary>
public static DbType ToDbType(this DataKind dataKind)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't used anymore, is it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used in CreateDatabaseLoader<TInput> when processing the memberInfo kind.

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks for the nice work. It will make customers lives easier to consume many columns from a Database.

@eerhardt
Copy link
Member

@tannergooding - half of the Windows legs are failing on your new tests with an exception of the form:

System.Data.SqlClient.SqlException : Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=33909; handshake=28; 
---- System.ComponentModel.Win32Exception : The wait operation timed out

@tannergooding
Copy link
Member Author

Added a connection timeout to try and give localdb a chance to initialize if it hadn't been previously

@eerhardt
Copy link
Member

Merging. The failing leg is failing due to an AutoML test, which is unrelated to this work.

@eerhardt eerhardt merged commit de36f6d into dotnet:master Aug 30, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants