Import-DbaCsv, map correct types for BulkCopy #9479

niphlod · 2024-09-26T22:46:18Z

Type of Change

Purpose

Broaden the use cases of Import-DbaCsv.
Right now an import either never fails (because it creates a table whose columns are all nvarchar(max)) or fails as soon as the existing target table has some not-that-weird types (bit or guid, for example).
This should also greatly speed up loads to existing tables, since users no longer need to "stage" to an nvarchar(max) table and then copy over to the destination table.
I'm guessing it also helps even without counting the "stage-then-move" bit: loading a pre-existing table with the correct types should be faster than loading into an nvarchar(max) table.

Approach

It's long and winding, but I deliberately made the code redundant and verbose (hopefully readable and obvious as well) rather than short and speedy. My 2c here is that this bit of code is rather convoluted, so it's better to be explicit more than anything. The cost (as in milliseconds spent) of adding the proper machinery is minimal if you take into account loading even just 1k rows.

I'm unsure whether to "get the types from the table" using SMO (like this version) or whether it'd be better to run a quicker query directly against the system tables to get a "name, datatype" result set and use that for the mapping. The pro of the latter is that we don't open another connection: at that point of the function one has already been established and a transaction is pending, so re-using $server for Get-DbaDbTable ends up in errors.
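
To make the "name, datatype" mapping idea concrete, here is a minimal sketch, not the code in this PR. `ConvertTo-DotNetTypeSketch` is a hypothetical helper name, the list of types is deliberately incomplete, and the input rows are hard-coded stand-ins for whatever the metadata query returns.

```powershell
# Hypothetical helper (not part of dbatools): maps a SQL Server data type name
# to a .NET type. Illustrative and incomplete; unknown types fall back to string.
function ConvertTo-DotNetTypeSketch {
    param(
        [Parameter(Mandatory)]
        [string]$SqlDataType
    )
    switch ($SqlDataType.ToLowerInvariant()) {
        'bit'              { return [bool] }
        'tinyint'          { return [byte] }
        'smallint'         { return [int16] }
        'int'              { return [int] }
        'bigint'           { return [long] }
        'decimal'          { return [decimal] }
        'numeric'          { return [decimal] }
        'float'            { return [double] }
        'real'             { return [single] }
        'uniqueidentifier' { return [guid] }
        'datetime'         { return [datetime] }
        'datetime2'        { return [datetime] }
        'date'             { return [datetime] }
        default            { return [string] }
    }
}

# Assumed stand-in for a "name, datatype" result set.
$columnInfo = @(
    [pscustomobject]@{ Name = 'Guid';     DataType = 'uniqueidentifier' }
    [pscustomobject]@{ Name = 'IsActive'; DataType = 'bit' }
    [pscustomobject]@{ Name = 'Notes';    DataType = 'nvarchar' }
)

# Build a typed DataTable from the metadata instead of all-string columns.
$typedTable = New-Object System.Data.DataTable
foreach ($col in $columnInfo) {
    $null = $typedTable.Columns.Add($col.Name, (ConvertTo-DotNetTypeSketch $col.DataType))
}
```

An explicit switch is long, but it matches the "explicit rather than short" preference stated above, and any type that is missing from the list falls back to string rather than raising an error.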

There's another bit that's missing, which is enabling this on a CSV that has no headers. I'm not sure if users need it, but if so the code gets longer because we can't let lumen do "quite a bit of work" (the $reader.GetFieldHeaders() call).

That being said, @potatoqualitee is the original creator and MAY have an opinionated view on how to tackle that.
@petervandivier : I got it working for most of my tables, but you may want to give it a whirl and maybe test extensively with any type you can throw at it.

Commands to test

The one from the report is good: short and simple.

Invoke-DbaQuery `
    -SqlInstance localhost `
    -Database tempdb `
    -Query "create table foo ([Guid] uniqueidentifier);"

New-Guid | Export-Csv ./foo.csv 

Import-DbaCsv `
    -SqlInstance localhost `
    -Database tempdb `
    -Table foo `
    -Path ./foo.csv `
    -SingleColumn

NB: this is a first PROPOSAL. If this approach is "accepted" I'm going to add a few tests for it in new commits.

niphlod · 2024-09-26T22:54:12Z

public/Import-DbaCsv.ps1

+                            $tableDef = Get-DbaDbTable $instance -SqlCredential $SqlCredential -Database $Database -Table $table -Schema $schema
+
+                            if ($tableDef.Count -ne 1) {
+                                Stop-Function -Message "Could not fetch table definition for table $table in schema $schema" -ErrorRecord $_


I'm unsure if this is the right thing to do here ... at this point I don't see any branches in code that could allow reaching here and not having a table present, but ... "what if"?

I'd leave it cuz we encounter "what if" more than we'd expect.

potatoqualitee · 2024-09-27T08:01:33Z

My only opinion is that it's wonderful to have this addressed 😁 Thank you! I'll re-run whatever is failing to see if we can get it all green.

potatoqualitee · 2024-09-27T08:02:42Z

Oh, oops. it's a failure on Import-DbaCsv, @niphlod

niphlod · 2024-09-27T08:08:17Z

> Oh, oops. it's a failure on Import-DbaCsv, @niphlod

Yeah, I know. I needed some sleep yesterday and didn't have time to look into it.
If you like the implementation I can go ahead and iron out bits and pieces here and there to make it work as it is, and I'll try to do that even when headers are missing.

potatoqualitee · 2024-09-27T12:19:09Z

Yes! Thank you 🙏🏼

tests/Import-DbaCsv.Tests.ps1

potatoqualitee · 2024-09-30T07:53:19Z

This is so good, @niphlod! Thank you. Will merge when andreas approves.

andreasjordan

Looks good. added some minor suggestions.

andreasjordan · 2024-09-30T07:57:04Z

public/Import-DbaCsv.ps1

+                                # we do not use $server because the connection is active here
+                                $tableDef = Get-TableDefinitionFromInfoSchema -table $table -schema $schema -sqlconn $sqlconn
+                                if ($tableDef.Length -eq 0) {
+                                    Stop-Function -Message "Could not fetch table definition for table $table in schema $schema" -ErrorRecord $_


I think -ErrorRecord $_ should only be used in a catch block. Here we don't have an error.

I knew I was missing something
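
For anyone reading along, a small sketch of why `$_` only makes sense as an error record inside a catch block. It uses plain PowerShell only (no dbatools internals such as Stop-Function), and `Test-ErrorRecordSketch` is a hypothetical name used for illustration.

```powershell
# Hypothetical demo function: compares $_ outside and inside a catch block.
function Test-ErrorRecordSketch {
    # Outside a catch block there is no caught error, so $_ is not an ErrorRecord here.
    $outside = $_ -is [System.Management.Automation.ErrorRecord]

    try {
        throw 'simulated failure'
    } catch {
        # Inside catch, $_ is the ErrorRecord describing the caught error.
        $inside = $_ -is [System.Management.Automation.ErrorRecord]
        $message = $_.Exception.Message
    }

    [pscustomobject]@{
        OutsideCatch = $outside
        InsideCatch  = $inside
        Message      = $message
    }
}

$result = Test-ErrorRecordSketch
```

In the reviewed lines the check is a plain `if`, so there is no caught error to pass along and the message alone has to carry the context.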

andreasjordan · 2024-09-30T07:58:31Z

public/Import-DbaCsv.ps1

+                                # start by getting the table definition
+                                $tableDef = Get-TableDefinitionFromInfoSchema -table $table -schema $schema -sqlconn $sqlconn
+                                if ($tableDef.Length -eq 0) {
+                                    Stop-Function -Message "Could not fetch table definition for table $table in schema $schema" -ErrorRecord $_


Remove -ErrorRecord $_

andreasjordan · 2024-09-30T08:01:24Z

public/Import-DbaCsv.ps1

@@ -663,8 +830,8 @@ function Import-DbaCsv {
                        $bulkCopy.Add_SqlRowsCopied( {
                                $script:totalRowsCopied += (Get-AdjustedTotalRowsCopied -ReportedRowsCopied $args[1].RowsCopied -PreviousRowsCopied $script:prevRowsCopied).NewRowCountAdded

-                                $tstamp = $(Get-Date -Format 'yyyyMMddHHmmss')
-                                Write-Message -Level Verbose -Message "[$tstamp] The bulk copy library reported RowsCopied = $($args[1].RowsCopied). The previous RowsCopied = $($script:prevRowsCopied). The adjusted total rows copied = $($script:totalRowsCopied)"
+                                #Write-Message -Level Verbose -FunctionName "Import-DbaCsv" -Message " The bulk copy library reported RowsCopied = $($args[1].RowsCopied). The previous RowsCopied = $($script:prevRowsCopied). The adjusted total rows copied = $($script:totalRowsCopied)"


Should be removed.

yep. BTW, found instances of the same logging which results in a "bad-looking" message so once this goes in I'll replace other occurrences through our functions as well

public/Import-DbaCsv.ps1

wsmelton · 2024-10-02T12:41:09Z

public/Import-DbaCsv.ps1

+                $sqlconn
+            )
+
+            $query = "SELECT c.COLUMN_NAME, c.DATA_TYPE, c.ORDINAL_POSITION - 1 FROM INFORMATION_SCHEMA.COLUMNS AS c WHERE TABLE_SCHEMA = @schema AND TABLE_NAME = @table;"


Any chance this will ever hit a math error on the ordinal position?

The ExecuteReader() call should be in a try/catch just in case it can/does.

ORDINAL_POSITION always starts at 1. I'll wrap ExecuteReader() in a try/catch in a bit.
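
A quick sketch of the arithmetic, assuming the `- 1` is there to line the 1-based ORDINAL_POSITION up with zero-based .NET column ordinals. The rows below are hard-coded stand-ins for the query output, not real results.

```powershell
# Assumed stand-in rows for (COLUMN_NAME, ORDINAL_POSITION) from the query.
$infoSchemaRows = @(
    [pscustomobject]@{ ColumnName = 'Id';   OrdinalPosition = 1 }
    [pscustomobject]@{ ColumnName = 'Name'; OrdinalPosition = 2 }
    [pscustomobject]@{ ColumnName = 'Flag'; OrdinalPosition = 3 }
)

# A DataTable with the same columns, added in the same order.
$ordinalTable = New-Object System.Data.DataTable
foreach ($row in $infoSchemaRows) {
    $null = $ordinalTable.Columns.Add($row.ColumnName)
}

# Compare ORDINAL_POSITION - 1 with the zero-based DataColumn.Ordinal.
$comparison = foreach ($row in $infoSchemaRows) {
    [pscustomobject]@{
        ColumnName       = $row.ColumnName
        ZeroBased        = $row.OrdinalPosition - 1
        DataTableOrdinal = $ordinalTable.Columns[$row.ColumnName].Ordinal
    }
}
```

Since the smallest value is 1, the subtraction never goes below zero, so the try/catch is more about connection or query failures than about the math.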

Co-authored-by: Shawn Melton <11204251+wsmelton@users.noreply.github.com>

potatoqualitee · 2024-10-05T17:51:24Z

hell yeah, thanks everyone! 🔥

Import-DbaCsv, map correct types for BulkCopy

1f94011

niphlod mentioned this pull request Sep 26, 2024

DbaCsv needs support for GUID target columns #9433

Closed

niphlod added 2 commits September 27, 2024 00:51

removal of rogue Write-Host

8ccff6a

improved error message

0ac4109

niphlod added 3 commits September 28, 2024 00:16

Import-DbaCsv, fixed some cases even with no header (do Csv)

219fe34

Import-DbaCsv, fix tests (do Csv)

f44a3f1

Import-DbaCsv tests, maybe it's the path (do Csv)

dd5827f

andreasjordan reviewed Sep 28, 2024

Import-DbaCsv Tests, add -NoTypeInformation

f616f32

andreasjordan reviewed Sep 30, 2024

Import-DbaCsv, last fixes

901f603

andreasjordan approved these changes Sep 30, 2024

wsmelton reviewed Oct 2, 2024

niphlod and others added 3 commits October 2, 2024 14:44

Update public/Import-DbaCsv.ps1

738a2c1

Co-authored-by: Shawn Melton <11204251+wsmelton@users.noreply.github.com>

Import-DbaCsv, wrap in try/catch

287c2bf

Import-DbaCsv, rename function

601c211

potatoqualitee merged commit 6025137 into dataplat:development Oct 5, 2024
3 checks passed
