Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Execution Timeout Expired" on Large SQL Dataset #2759

Open
shainey0816 opened this issue Aug 16, 2023 · 5 comments
Open

"Execution Timeout Expired" on Large SQL Dataset #2759

shainey0816 opened this issue Aug 16, 2023 · 5 comments
Labels
GS Tutorial Bug from customer using Getting Started Tutorial Stale

Comments

@shainey0816
Copy link

  • Model Builder or CLI Version:
    17.17.0.2332602

  • Visual Studio Version (if applicable):
    Microsoft Visual Studio Community 2022 (64-bit) - Current
    Version 17.5.4

  • On which page of the tutorial did you run into an issue (URL): N/A this is a bug in the Model Builder itself.

  • Clear description of the problem:

Screen capture 1 shows a generic timeout error when attempting to connect to SQL data source. This only occurs on very large datasets. For example, the timeout occurring on the desired data training set has 3936051 records in the table. I can switch to a table with a smaller number of records and everything loads just fine. I don't see a way to change the timeout time other than in the Advanced Settings when connecting to the database, but I changed the Initialization >> Command Timeout value to 3600 and it still errors within the same amount of time so that doesn't appear to affect this timeout.

Screen capture 2 shows that I can go into the "Advanced data options" and manually set my label in the UI, but then throws an error when saving. Screen capture 3 shows the selected label was saved regardless.

Screen capture 4 shows that I can select a different table from the database with smaller number of records and it loads just fine to proceed to the next steps.

To summarize the problem:

  1. Using SQL database as source with very large dataset fails to load
  2. "Data Preview, Processing data...." runs indefinitely
  3. "Next step" button never enables due to these issues, so I can never finish the model unless I use a much smaller dataset which is not desirable for my application

Steps to reproduce the behavior:

  1. Select value prediction scenario
  2. Select local training environment
  3. Input >> SQL Server
  4. Data source >> Microsoft SQL Server
  5. Authenticate server and select database
  6. Select table with very large dataset (almost 4 million records in my case) and wait for the following error typically after ~30 seconds: "Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding."
  7. Select a different table with smaller dataset to verify the connection is still active and working pointing to the large dataset as the issue

Expected behavior
The expected behavior is that "Data Preview" doesn't timeout that soon, that the preview of the large dataset is loaded, that I can then select from the "Column to predict (Label)" dropdown list which column should be the label, and that the "Next step" button enables after selecting the label.

Screen capture 1

Capture1

Full stack trace:

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.SqlCommand.InternalEndExecuteReader(IAsyncResult asyncResult, String endMethod, Boolean isInternal) at System.Data.SqlClient.SqlCommand.EndExecuteReaderInternal(IAsyncResult asyncResult) at System.Data.SqlClient.SqlCommand.EndExecuteReaderAsync(IAsyncResult asyncResult) at System.Threading.Tasks.TaskFactory1.FromAsyncCoreLogic(IAsyncResult iar, Func2 endFunction, Action1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.DataProviders.SQLDataProvider.d__15.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ToolWindows.DataTextViewModel.d__146.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ToolWindows.TextDataControl.<>c__DisplayClass4_0.<b__0>d.MoveNext()

Screen capture 2

Capture2

Full stack trace:
at System.ThrowHelper.ThrowArgumentNullException(ExceptionArgument argument)
at System.Collections.Generic.Dictionary2.FindEntry(TKey key) at System.Collections.Generic.Dictionary2.TryGetValue(TKey key, TValue& value)
at Microsoft.ML.ModelBuilder.Utils.Utilities.GetTrainValidateDataSplitRatio(String ratio)
at Microsoft.ML.ModelBuilder.ViewModels.AdvancedDataOptionsViewModel.d__64.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.ToolWindows.AdvancedDataOptions.AdvancedDataCleaningDialog.<>c__DisplayClass3_0.<<SaveButton_Click>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.VisualStudio.Threading.JoinableTask.CompleteOnCurrentThread()
at Microsoft.ML.ModelBuilder.ToolWindows.AdvancedDataOptions.AdvancedDataCleaningDialog.SaveButton_Click(Object sender, RoutedEventArgs e)

Screen capture 3

Capture3

Screen capture 4

Capture4

Additional context
Consistently reproduceable following the steps above. It only occurs trying to use the tables with many 100k's of records.

@shainey0816 shainey0816 added the GS Tutorial Bug from customer using Getting Started Tutorial label Aug 16, 2023
@LittleLittleCloud
Copy link
Contributor

@shainey0816 I wonder if command timeout is correctly set in model builder, would you mind sharing your connection string(withouth sensitive data) in modelbuilder with us? You can find it in .mbconfig

also @zewditu

@shainey0816
Copy link
Author

There doesn't appear to be too much in the connection string from model builder. See below.

Data Source=mydatabase.database.windows.net;Initial Catalog=sffc-sns;Persist Security Info=True;User ID=myUser

I've tried changing the Command Timeout property to a value greater than the default 30 doesn't produce different results. Still times out around ~30 seconds regardless of what value I use.

@zewditu
Copy link
Contributor

zewditu commented Sep 5, 2023

@shainey0816 after you change your command timeout property to some other value greater than 30 seconds and save your Model builder file could you share your connection string from model builder?
It should appear in your connection string.

@zewditu
Copy link
Contributor

zewditu commented Sep 5, 2023

From Model Builder Side we might need to improve the performance of some of our methods like GetSqlTableRowCountAsync

@github-actions github-actions bot added the Stale label Oct 6, 2023
@nikhilguda
Copy link

I'm facing the same issue. I checked the database query logs and I can see that the query was going on when Git Actions said it timed out. A new session is created to execute the error block. Once the new session is completed, the old session is killed too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GS Tutorial Bug from customer using Getting Started Tutorial Stale
Projects
None yet
Development

No branches or pull requests

4 participants