Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find dba orphaned file recurse #6846

Closed
wants to merge 27 commits into from
Closed

Find dba orphaned file recurse #6846

wants to merge 27 commits into from

Conversation

nmbell
Copy link

@nmbell nmbell commented Sep 16, 2020

Type of Change

  • Bug fix (non-breaking change, fixes # )
  • New feature (non-breaking change, adds functionality, fixes # )
  • Breaking change (effects multiple commands or functionality, fixes # )
  • Ran manual Pester test and has passed (`.\tests\manual.pester.ps1)
  • Adding code coverage to existing functionality
  • Pester test is included
  • If new file reference added for test, has is been added to github.com/sqlcollaborative/appveyor-lab ?
  • Nunit test is included
  • Documentation
  • Build system

Purpose

Add a new parameter, -Recurse, to Find-DbaOrphanedFile that will enable recursive search of
directories specified in the -Path parameter.

Approach

Update the logic to control the value of the 'depth' parameter passed to xp_dirtree when recursion is required (i.e. 0).
Also update the SQL query to construct the full paths from the results returned.

Commands to test

Find-DbaOrphanedFile

Add a new parameter, -Recurse, that will enable recursive search of
directories. This is useful in situations where an instance has a
number of databases with a common root directory e.g.:
    D:\DataFiles\MyFirstDb\data.mdf
    D:\DataFiles\MySecondDb\data.mdf
The user can then call the function with:
    ...-Path 'D:\Datafiles' -Recurse...
rather than having to explicitly derive the directory tree
under 'D:\Datafiles'.
- Move file type filtering into sql query:
      reduce the size of the returned data set;
      eliminate the $matching variable
- Move system file filtering into sql query:
      reduce the size of the returned data set;
      simplify later logic
- Rename variables for clarity:
      $paths > $sqlpaths;
      $filestructure > $sqlfiles;
      .Comparison > .ComparisonPath
- Add ComparisonPath property to results from Get-SqlFileStructure
  to hold the validated path:
      eliminate the $valid variable
- Use dotted notation in the final foreach loop:
      eliminate the $dirtreematcher variable
- Pipe SQL query results straight into $dirtreefiles:
      eliminate the $datatable variable
- Make more use of the pipeline
- Put LocalOnly and RemoteOnly into different parameter sets
  to discourage both being used at the same time (order of processing
  means only LocalOnly will be recognized)
- Add comments for clarity
Update tests to exercise -Path and -Recursive parameter functionality
Update logic to work with all paths in UNC format so that local and
UNC references to the same file will match successfully. Filename and
RemoteFilename values in output are unaffected. This commit includes
a new internal function, Split-AdminUnc, which splits a UNC file
path into server name and local path; i.e. the inverse of the
Join-AdminUnc function.
And add explicit references to the FullName property. Hopefully this
will resolve the build failure. Also replace the alias
Detach-DbaDatabase with Dismount-DbaDatabase, which is easier to
find on https://dbatools.io/commands/.
@potatoqualitee
Copy link
Member

Thanks so much for the PR, @nmbell ! Unfortunately, the test failed and I couldn't immediately identify the cause :/

@nmbell
Copy link
Author

nmbell commented Sep 18, 2020 via email

Assign a UNC path based on the machine name that the SQL instance is
running on rather than the build machine. Also, for consistency,
remove the orphaned files using the remote name.
In order to test that file matching occurs correctly between
UNC and local paths for the same file.
@nmbell
Copy link
Author

nmbell commented Sep 19, 2020

OK, I'm stumped. If anyone could provide any insight as to why the build is failing, it'd be much appreciated:

https://ci.appveyor.com/project/sqlcollaborative/dbatools/builds/35288490/job/nf9eu0kyhl3wi3la
https://ci.appveyor.com/project/sqlcollaborative/dbatools/builds/35288714/job/in5lk63ss48fsiea
Build is failing because either the database isn't getting created, or the function isn't finding the files. I've run the function on my laptop, my workstation, and the QA server at work, both as it would normally be used and by running the test line-by-line in the console, and it gives me exactly the expected results, no errors.

https://ci.appveyor.com/project/sqlcollaborative/dbatools/builds/35288714/job/mqux8564ukuiyrm2
"Find-DbaOrphanedFile.ps1 is not compliant with the OTBS formatting style. Please run Invoke-DbatoolsFormatter against the failing file and commit the changes."
I've run Invoke-DbatoolsFormatter against the file but I'm still getting this error and I'm not sure why. I've run Invoke-ScriptAnalyzer -Settings CodeFormattingOTBS but it doesn't return anything.

@wsmelton
Copy link
Member

Can the test be run against your test environment?

You can create a constants.local.ps1 in the root directory of the repo and set those instances to your environment and then our tests use those instances.

Generally if the test is successful locally it will run successful in appveyor...generally.

@wsmelton
Copy link
Member

According to the output in the link the tests database is trying to create the log file over the admin share...

Running C:\github\dbatools\tests\Find-DbaOrphanedFile.Tests.ps1

64CREATE DATABASE [dbatoolsci_orphanedfile] ON PRIMARY (NAME = [dbatoolsci_orphanedfile_DATA], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.SQL2016\MSSQL\DATA\dbatoolsci_orphanedfile_DATA.mdf') LOG ON (NAME = [dbatoolsci_orphanedfile_TLOG], FILENAME = '\\APPVYR-WIN\C$\Program Files\Microsoft SQL Server\MSSQL13.SQL2016\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf')

65Running C:\github\dbatools\tests\Find-DbaStoredProcedure.Tests.ps1

Add a line to the dbatools.Tests.ps1 file to expose the modules
that have been imported into the session. This is to aid in
troubleshooting and configuration of local test environments.
Revert integration test to its original form to see if the new logic
has caused a regression per the original test.
@niphlod
Copy link
Contributor

niphlod commented Sep 21, 2020

tests are a bit funky but let's go over the changes first .... why should the instance enum the admin shares ? there are shops with servers that have no admin shares available, and this will surely break.
So, IMHO:

  • no need for the whole comparisonpath logic
  • no need for split-adminunc at all
  • mssql should enum the dirs as the instance sees them, if they're local, they are local and they don't have to be translated to an unc path (also because all your translations are done "hoping" that admin shares are actually available)

PS: Join-AdminUNC has been created to make life to users easier if they DO have admin shares available, but it's not something our functions use outside of the realm of the PS session: in this case MSSQL can't be assured to be able to enumerate it and, sincerely, I don't find a reason why you changed the defaults from the fullpath to the admin-share-representation-of-it pretty much everywhere.

EDIT: also, using the CTE breaks usability with MSSQL 2000

@niphlod
Copy link
Contributor

niphlod commented Sep 22, 2020

@nmbell : give me a shout if you can't figure it out, I streamlined your code and it all seems to work as it should, tests too.

@nmbell
Copy link
Author

nmbell commented Sep 22, 2020

In SSMS, if I run:

USE [master]
GO
CREATE DATABASE [dbatoolsci_orphanedfile] ON
PRIMARY (NAME = [dbatoolsci_orphanedfile_DATA], FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_DATA.mdf')
LOG ON (NAME = [dbatoolsci_orphanedfile_TLOG], FILENAME = '\\MYCOMPUTER\C$\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf')

then:

USE [dbatoolsci_orphanedfile]
GO
SELECT [name], state_desc FROM sys.databases WHERE database_id = DB_ID()
SELECT physical_name FROM [dbatoolsci_orphanedfile].sys.database_files

I get:

image

Then running the original version of Find-DbaOrphanedFiles:

Find-DbaOrphanedFile -SqlInstance 'MYCOMPUTER\SQL2017DEVELOPER'

Returns the following:

ComputerName : MYCOMPUTER
InstanceName : SQL2017DEVELOPER
SqlInstance : MYCOMPUTER\SQL2017DEVELOPER
Filename : C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf
RemoteFilename : \\MYCOMPUTER\C$\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf

which is incorrect. Why is this? In the function, the calls to Get-SqlDefaultPaths returns only a local path. The comparison is only matching the paths as text strings rather than e.g. using Get-Item to compare the file objects (which I imagine would create a huge performance penalty). The file paths need to be translated to some common format (which I believe was also the point of the Comparison (now ComparisonPath) property). I suppose I could have gone the other way and converted everything to local paths, but UNC's are, well, universal.

When you say 'admin shares', are you thinking specifically of folders that have been setup as a 'network share'? I don't know about older servers, but on every machine I've worked on (Windows Server 2008 R2 I think is the oldest version for me), you can refer to any file or folder with its UNC path, regardless of whether it's available via an explicit 'shared folder' or not. Once the function has gathered the file paths (in whatever form they are returned), they are thereafter only treated as strings. There is no need to hope the shares are available, because it's irrelevant. The user has the choice of working with either Filename or RemoteFilename in the output as suits their needs.

It's a fair point if the function is no longer compatible with SQL 2000, however my goal is only to present working code for consideration. If the project administrators decide that backward compatibility is more valuable than the new functionality, then fair enough. I could only hope that at some point in the future the scales would tip the other way.

@nmbell
Copy link
Author

nmbell commented Sep 22, 2020

I fixed the issue with the build failing due to formatting. I needed to downgrade my version of PSScriptAnalyzer from 1.19.1 to 1.18.2. I tried running the tests locally with Pester 5.0.4 and PSScriptAnalyzer 1.19.1 (latest versions at time of writing) and got the following formatting errors (edited for clarity):

PS>& 'D:\Users\...\Documents\WindowsPowerShell\Modules\dbatools\tests\dbatools.Tests.ps1'
Running D:\Users\...\Documents\WindowsPowerShell\Modules\dbatools\tests\dbatools.Tests.ps1
Starting discovery in 1 files.
Running D:\Users\...\Documents\WindowsPowerShell\Modules\dbatools\tests\dbatools.Tests.ps1
Discovery finished in 217.94s.
[-] dbatools style.formatting.Get-DbaDbFile.ps1 is not compliant with the OTBS formatting style. Please run I...
[-] dbatools style.formatting.Get-DbaFilestream.ps1 is not compliant with the OTBS formatting style. Please r...
[-] dbatools style.formatting.Get-DbaPfAvailableCounter.ps1 is not compliant with the OTBS formatting style. ...
[-] dbatools style.formatting.Test-DbaDeprecatedFeature.ps1 is not compliant with the OTBS formatting style. ...
[-] dbatools style.formatting.Test-DbaDiskAlignment.ps1 is not compliant with the OTBS formatting style. Plea...
[-] dbatools style.formatting.Set-FileSystemSetting.ps1 is not compliant with the OTBS formatting style. Plea...
[-] dbatools style.formatting.Update-ServiceStatus.ps1 is not compliant with the OTBS formatting style. Pleas...

It seems like the newer version of PSScriptAnalyzer would break the build if it was upgraded on the build server. @potatoqualitee: just thinking, rather than failing the build because of incorrect formatting, what about applying formatting after a successful build? Could you trigger it off commits to the main development branch? That way contributors don't have to worry about whether a difference in their local version will break the build off their pull request commits, and the clonable/distributable code base always complies with whichever version is running on the build server.

@niphlod
Copy link
Contributor

niphlod commented Sep 22, 2020

When you say 'admin shares', are you thinking specifically of folders that have been setup as a 'network share'? I don't know about older servers, but on every machine I've worked on (Windows Server 2008 R2 I think is the oldest version for me), you can refer to any file or folder with its UNC path, regardless of whether it's available via an explicit 'shared folder' or not. Once the function has gathered the file paths (in whatever form they are returned), they are thereafter only treated as strings. There is no need to hope the shares are available, because it's irrelevant. The user has the choice of working with either Filename or RemoteFilename in the output as suits their needs.

Meddling with HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\AutoShareWks or simply not being an administrator on the server renders \\computername\diskname$ unusable (hello, hardening).

There is no need to hope the shares are available, because it's irrelevant

Huh oh... They absolutely need to be available (and that was the point of my previous comment) !

I get you wanted an universal representation of a path, but you are making MSSQL use \\computername\diskname$ to enumerate dirs, which it simply cannot do on all servers: let's suppose master is on D:\SQLData dir, you're invoking xp_dirtree \\computername\d$\SQLData. Not only it doesn't work always (for hardening but also in other cases, like disjoint domains.... a better implementation would enumerate on localhost)....it's really not needed at all (or doesn't seems so from the code you added).
Again, I really not see the added benefit of forcing all paths in the code to assume the \\computername\disk$ format over just local plain fullpaths .... if you can name one, this is the time :D

It's a fair point if the function is no longer compatible with SQL 2000, however my goal is only to present working code for consideration. If the project administrators decide that backward compatibility is more valuable than the new functionality, then fair enough. I could only hope that at some point in the future the scales would tip the other way.

If you look at the function, yes, we do support SQL2000. The new feature is cool, but doesn't strictly need a cte to work: it's perfectly achievable without.

@potatoqualitee
Copy link
Member

Thanks everyone for your participation 💯 ! I appreciate the attention to detail and agree with @niphlod's suggestions. I also like the idea of auto-formatting on behalf of the user, but I'm not sure leaving GitHub Actions/additional commits to fix it is ideal. @niphlod your thoughts?

@niphlod
Copy link
Contributor

niphlod commented Sep 22, 2020

about formatting: the point is always the same....assure a consistent formatting throughout the module. Everyone has their IDEs and not every IDE is settable to format-on-save on a single style.

We just leverage the one from Invoke-ScriptAnalyzer because THAT one is consistent :D
I'm not sure if a pre-commit-hook can be set on git to avoid pushing unformatted files.

Anywayyyyy....I'm going to investigate why scriptanalyzer changed the defaults (maybe it's just a @{} vs @{ } type of thing), but the point is that we DO have Invoke-Dbatoolsformatter .... and that we're one call of Invoke-Dbatoolsformatter away to consistently change one file style to a single one (and even the whole module if the new OTSB replaces all occurrences of @{} to @{ }).

As for "failing fast & providing a button to fix it" or "not failing and making sure who merges does it instead of the user", is just a matter of additional burden on the merger. We chose to fail fast to avoid the burden on mergers.
BTW, a few weeks ago on slack I was advocating to add a call to Invoke-Dbatoolsformatter to manual.pester.ps1, so at least users running tests will have the file formatted automatically before submitting.

@nmbell
Copy link
Author

nmbell commented Sep 24, 2020

The function was gathering a list of directory paths in $sqlpaths which is passed in to Get-SQLDirTreeQuery, but if any of those directories, which are passed to xp_dirtree, are in UNC format, the server won't return results if it doesn't support UNC paths. OK, now I see why it wouldn't work. I've updated the code to guarantee only local paths are passed in.

Without the Split-AdminUNC function, or some equivalent, I don't think there's any way for this:
image
to get to this:
Filename : C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf
I think if you accept that SQL allows for specifying UNC paths, then converting back to a local path format is a necessity to maintain the function output.

Regarding recursion not strictly needing a CTE, I'll leave that for someone else to tackle. For now I've updated to keep the original behavior on pre-2005 versions.

And finally, a successful build :-). Creating a database with a UNC file path obviously doesn't work on the build server, so this will just have to remain a testing gap. Test of the recursive logic is working good, though :-).

@nmbell
Copy link
Author

nmbell commented Sep 24, 2020

My point about the formatting wasn't about whether it should be the committer or the merger that applies formatting (the issue is just as likely to occur if the merger's version of PSScriptAnalyzer is not the same as the build server, isn't it?). Having the build machine apply the formatting after a successful build side-steps the issue entirely: it will always be correct and never a burden for anyone. It was a non-trival effort to figure out why the build was failing even after I'd applied the formatting locally; OK, not a huge deal, but still a frustration that could have been avoided. The build setup isn't detailed anywhere (or is it?), so any new contributor is likely to experience the same. I included an output in one of the builds to expose the module versions: https://ci.appveyor.com/project/sqlcollaborative/dbatools/builds/35296201/job/r612h2n02ueqy36e. If formatting can't be automated, maybe at least the module versions can be exposed somewhere appropriate as a permanent part of the build output...?

@niphlod
Copy link
Contributor

niphlod commented Sep 24, 2020

The function was gathering a list of directory paths in $sqlpaths which is passed in to Get-SQLDirTreeQuery, but if any of those directories, which are passed to xp_dirtree, are in UNC format, the server won't return results if it doesn't support UNC paths. OK, now I see why it wouldn't work. I've updated the code to guarantee only local paths are passed in.

Without the Split-AdminUNC function, or some equivalent, I don't think there's any way for this:
image
to get to this:
Filename : C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017DEVELOPER\MSSQL\DATA\dbatoolsci_orphanedfile_TLOG.ldf
I think if you accept that SQL allows for specifying UNC paths, then converting back to a local path format is a necessity to maintain the function output.

not really. if SQL is using local paths, you must enum local paths.
if SQL is using UNC paths, you must enum UNC paths... because if the database is mounted of course the server has access to the share (at least the base folder of it, subdirectories may break ACL inheritance and refuse to be enumerated, but that's nothing SQL or you can solve).
point being you enum what SQL has mounted, without translations (also because of the following comma ^_^)

Your split-adminunc just works for paths that are \servername\diskname$\path (admin shares) and not \servername\sharename\path (normal share).... which from MSSQL 2012 is fully supported, db files on any SMB share.
So, you're not fixing a problem, you're introducing one.

Regarding recursion not strictly needing a CTE, I'll leave that for someone else to tackle. For now I've updated to keep the original behavior on pre-2005 versions.

NP, I'll push a fixed version

@nmbell nmbell closed this Sep 25, 2020
@niphlod niphlod mentioned this pull request Sep 29, 2020
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants