The DB Schema Export Tool exports SQL Server or PostgreSQL database objects as schema files. Exported objects include tables, views, stored procedures, functions, and synonyms, along with database properties such as database roles and logins.
Uses integrated authentication or a named user to connect to a Microsoft SQL Server database.
Uses a named user, password, and port to connect to a PostgreSQL database.
The latest version of the application is available for one month on the AppVeyor CI server.
In addition to creating schema files for tables, views, stored procedures, functions, and synonyms, the program can also export table data to create SQL files with an Insert statement for each row.
The DB Schema Export Tool has both a command line interface and a GUI mode. Starting the program without any parameters will show the GUI.
The command line interface is useful for automatically exporting schema files on a regular basis. Example command (wrapped here for readability):
```
DB_Schema_Export_Tool.exe
    C:\Cached_DBSchema
    /Server:Proteinseqs
    /DBList:Manager_Control,Protein_Sequences
    /Sync:"F:\Projects\Database_Schema\DMS"
    /Git /Commit
    /L /LogDir:Logs
    /Data:ProteinSeqs_Data_Tables.txt
```
Full command line syntax:

```
DB_Schema_Export_Tool.exe
    SchemaFileDirectory /Server:ServerName
    /DB:Database /DBList:CommaSeparatedDatabaseName
    [/DBUser:username] [/DBPass:password]
    [/PgUser:username] [/PgPass:password] [/PgPort:5432]
    [/DirectoryPrefix:PrefixText] [/NoSubdirectory] [/CreateDBDirectories]
    [/DataTables:TableDataToExport.txt] [/Map:ColumnMapping.txt]
    [/DataSort:PgDumpTableDataSortOrder.txt]
    [/TableFilterList] [/SchemaSkipList]
    [/ColumnFilter:FileName] [/DateFilter:FileName]
    [/NameFilter:FilterSpec]
    [/DataExportOrder:FileName]
    [/Schema:SchemaName] [/ExistingDDL:SchemaFileName]
    [/NoAutoData] [/ExportAllData] [/MaxRows:1000] [/NoData]
    [/DisableStatementLogging:true] [/MinLogDuration:5000]
    [/DeleteExtraRows] [/ForceTruncate:TableList]
    [/DisableTriggers]
    [/ScriptLoad] [/ScriptUser:username] [/ScriptDB:Database]
    [/ScriptHost:HostName] [/ScriptPort:Port]
    [/SnakeCase] [/PgDump] [/KeepPgDumpFile]
    [/PgInsert] [/PgInsertChunkSize:50000]
    [/ServerInfo] [/NoSchema]
    [/Sync:TargetDirectoryPath] [/Git] [/Svn] [/Hg] [/Commit]
    [/L] [/LogFile:BaseName] [/LogDir:LogDirectoryPath]
    [/Preview] [/Stats] [/Trace]
    [/ParamFile:ParamFileName.conf] [/CreateParamFile]
```
SchemaFileDirectory is the path to the directory where the schema files will be saved (aka the output directory)
- Optionally use the named parameter /OutputDir:DirectoryPath

To process a single database, use /Server and /DB
- The server is assumed to be Microsoft SQL Server
- However, if /PgUser is provided (or PgUser is defined in the parameter file), the server is treated as a PostgreSQL server
Use /DBList to process several databases (separate names with commas)

Use /DBUser and /DBPass to specify a username and password for connecting to a SQL Server instance
- If not defined, will use Windows Integrated Authentication
Use /PgUser to define the username to use to connect to a PostgreSQL database

Use /PgPass to define the password for the PostgreSQL user
- As described below, you can create a file named pgpass.conf at %APPDATA%\postgresql to track passwords for PostgreSQL users
  - The format is server:port:database:username:password
  - pg_dump.exe will obtain the password from this file if it exists
  - On Linux, the file is stored at ~/.pgpass

Use /PgPort to specify the port to use for PostgreSQL
- Defaults to 5432
By default, a subdirectory named DBSchema__DatabaseName will be created below the SchemaFileDirectory
- Customize this prefix text using /DirectoryPrefix

Use /NoSubdirectory to disable automatically creating a subdirectory for the database being exported
- Note: subdirectories will always be created if you use /DBList and specify more than one database

Use /CreateDBDirectories:False to disable creating a subdirectory for the schema files for each database
Use /DataTables or /Data to define a text file with table names (one name per line) for which the data should be exported
- In addition to table names defined in /Data, there are default tables which will have their data exported; disable the defaults using /NoAutoData
- Also supports a multi-column, tab-delimited format
  - Put <skip> in the TargetTableName column to indicate that the table should not be included when using /ExportAllData
  - The file can also track renamed views
- File format:
| SourceTableName | TargetSchemaName | TargetTableName | PgInsert | KeyColumn(s) |
|---|---|---|---|---|
| T_Analysis_State_Name | public | t_analysis_job_state | true | job_state_id |
| T_DatasetRatingName | public | t_dataset_rating_name | true | dataset_rating_id |
| T_Log_Entries | public | t_log_entries | false | |
| T_Job_Events | cap | t_job_Events | false | |
| T_Job_State_Name | cap | t_job_state_name | true | job |
| T_Users | public | t_users | true | user_id |
| x_T_MgrState | public | <skip> | | |
| V_Cell_Culture | public | v_biomaterial | | |
| V_Cell_Culture_Date | public | v_biomaterial_date | | |
Tables with PgInsert set to true will have data insert commands formatted as PostgreSQL compatible INSERT INTO statements using the ON CONFLICT (key_column) DO UPDATE SET syntax
- This is only applicable when exporting data from SQL Server
- Use the KeyColumn(s) column to specify the primary key (or pair of unique keys) used for determining a conflict
  - Any columns that are not key columns will be updated when a conflict is found
- If PgInsert is true but the KeyColumn(s) column is empty, the program will look for an identity column on the source table
- Example generated SQL:
```sql
INSERT INTO mc.t_mgr_types (mt_type_id, mt_type_name, mt_active)
OVERRIDING SYSTEM VALUE
VALUES
    (1, 'Analysis', 1),
    (2, 'Archive', 0),
    (3, 'Archive Verify', 0),
    (4, 'Capture', 1),
    (5, 'Space', 1),
    (6, 'Data Import', 1)
ON CONFLICT (mt_type_id)
DO UPDATE SET
    mt_type_name = EXCLUDED.mt_type_name,
    mt_active = EXCLUDED.mt_active;

-- Set the sequence's current value to the maximum current ID
SELECT setval('mc.t_mgr_types_mt_type_id_seq', (SELECT MAX(mt_type_id) FROM mc.t_mgr_types));

-- View the current ID for the sequence
SELECT currval('mc.t_mgr_types_mt_type_id_seq');
```
Use /Map or /ColumnMap to define a tab-delimited text file mapping source column names to target column names, by table
- Use the keyword <skip> in the TargetColumnName column to indicate that a source column should not be included in the output file
- If /SnakeCase is used to auto-change column names, mappings in this file will override the snake case auto-conversion
- File format:
SourceTableName | SourceColumnName | TargetColumnName |
---|---|---|
T_Analysis_Job | AJ_jobID | job |
T_Analysis_Job | AJ_start | start |
T_Analysis_Job | AJ_finish | finish |
t_users | name_with_prn | <skip> |
Use /DataSort (or /PgDumpDataSort or /PgDumpTableDataSortOrder) to define a tab-delimited text file that defines the columns to use for sorting data exported using PgDump
- The export file will be re-written with data sorted by the specified column numbers (1-based)
- Table names should include the schema, e.g. "public.t_event_target" or "sw.t_job_state_name"
- Sort_Columns should list the column number (or comma-separated list of column numbers) to sort the data on
- "1" means to sort using the first column
- "2, 3" means to sort using the second column, then the third column
- Values in Sort_Numeric should be True, False, or an empty string
- True means to assume the data in the first sort column is numeric (integer or double)
- False is ignored; if any sort column has numbers for every row, it will be sorted as numeric
- File format:
Table_Name | Sort_Columns | Sort_Numeric |
---|---|---|
cap.t_automatic_jobs | 1, 2 | |
cap.t_scripts | 2 | |
cap.t_scripts_history | 3, 1 | |
cap.t_step_tools | 2, 1 | |
cap.t_task_state_name | 1 | True |
public.t_analysis_job_state | 1 | True |
public.t_default_psm_job_parameters | 2, 3, 4, 5, 6 | |
public.t_user_operations_permissions | 1, 2 | True |
sw.t_job_state_name | 1 | True |
timetable.task | 2, 1 | True |
Use /TableFilterList or /TableNameFilter to specify a table name (or comma separated list of names) to restrict table export operations
- This is useful for exporting the data from just a single table (or a few tables)
- This parameter does not support reading names from a file; it only supports actual table names
- If a file was defined via the /DataTables parameter, the list of tables loaded from that file will be filtered by this list (matching both SourceTableName and TargetTableName)
Use /SchemaSkipList to specify a schema name (or comma separated list of schema names) to skip
- This is useful if using partitioned tables
Use /ColumnFilter or /TableDataColumnFilter to define a tab-delimited text file that defines columns to skip when exporting data from tables
- The program auto-skips computed columns and timestamp columns when exporting data as PostgreSQL compatible INSERT INTO statements
- File format:
Table | Column | Comment |
---|---|---|
T_Cached_Dataset_Links | DS_RowVersion | In PostgreSQL, tracked via data type xid; comes from t_dataset.xmin |
T_Cached_Dataset_Links | SPath_RowVersion | In PostgreSQL, tracked via data type xid; comes from t_storage_path.xmin |
T_EUS_Proposals | Numeric_ID | Computed column (redundant here since the program auto-skips computed columns) |
Use /DateFilter or /TableDataDateFilter to define a tab-delimited text file that defines date filters to use when exporting data from tables
- The data file will include the start date in the name, for example: mc.t_log_entries_Data_Since_2020-01-01.sql
- File format (an illustrative query follows the table):
SourceTableName | DateColumnName | MinimumDate |
---|---|---|
T_Event_Log | Entered | 2020-01-01 |
T_Log_Entries | posting_time | 2020-01-01 |
T_ParamValue | last_affected | 2020-01-01 |
T_ParamValue_OldManagers | last_affected | 2016-01-07 |
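
Conceptually, each row limits the exported data to rows at or after the minimum date; for the first row in the table above, the exported rows would correspond to a query along these lines (illustrative only, not necessarily the exact query the program runs):

```sql
-- Illustrative filter corresponding to the first row of the example file
SELECT *
FROM T_Event_Log
WHERE Entered >= '2020-01-01';
```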
Use /NameFilter to define a filter to apply to exported tables, views, procedures, etc.
- Only objects whose names contain this text will be exported
- This is only applicable when exporting objects from SQL Server
- Use ^ to indicate that object names must start with the text, for example ^Auto to match names starting with Auto
  - This only works when using a parameter file, since the Windows command line parser removes ^ (see the example snippet below)
- Use $ to indicate that object names must end with the text, for example Report$ to match names ending with Report
- Other RegEx qualifiers are supported; matches are case insensitive
- Multiple filters can be provided by separating them with commas
  - At the command line, surround comma separated lists with double quotes
  - The filter will not be split on commas if the object name filter has square brackets
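
Since the ^ character only survives in a parameter file, a prefix filter would be specified there. The snippet below is hypothetical; the exact key name for this option is not listed in this document, so confirm it with /CreateParamFile:

```
# Hypothetical entry; confirm the actual key name using /CreateParamFile
# Matches object names starting with Auto or ending with Report
NameFilter=^Auto,Report$
```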
Use /DataExportOrder to define a text file with table names (one name per line) defining the order in which data should be exported from tables
- Table names should be listed without a schema
- Tables not listed in this file will have their data exported alphabetically by table name
- If the first line is Table_Name, it is assumed to be a header line (and is thus ignored)
- If /ScriptLoad is used to create a shell script for loading data (see below), the data will be loaded in the order specified by the DataExportOrder file, followed by tables not listed in the file
  - This is useful for assuring that data is loaded into the database in an order that guarantees that foreign keys are valid
- For example, given a multi-campus university that tracks graduate students and professors using four tables:

```mermaid
graph TD;
    T_Campus-->T_Department;
    T_Department-->T_Professor;
    T_Department-->T_Student;
```

the DataExportOrder file should have the following:
```
T_Campus
T_Department
T_Professor
T_Student
```
- Loading data in this order assures that when the department data is loaded, the Campus_ID for each department will refer to an existing row in the campus table
- Next, when the professor and student data is loaded, the Department_ID for each person will refer to an existing row in the department table
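
As a sketch, the dependencies in the graph above correspond to foreign keys along these lines (hypothetical table definitions, shown only to illustrate why load order matters):

```sql
-- Hypothetical tables illustrating the dependency order
CREATE TABLE T_Campus (
    Campus_ID int PRIMARY KEY,
    Campus_Name text NOT NULL
);

CREATE TABLE T_Department (
    Department_ID int PRIMARY KEY,
    Campus_ID int NOT NULL REFERENCES T_Campus (Campus_ID),
    Department_Name text NOT NULL
);

CREATE TABLE T_Professor (
    Professor_ID int PRIMARY KEY,
    Department_ID int NOT NULL REFERENCES T_Department (Department_ID),
    Professor_Name text NOT NULL
);

CREATE TABLE T_Student (
    Student_ID int PRIMARY KEY,
    Department_ID int NOT NULL REFERENCES T_Department (Department_ID),
    Student_Name text NOT NULL
);
```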
Use /Schema or /DefaultSchema to define the default schema name to use when exporting data from tables
- Entries in the /DataTables file will override this default schema
Use /ExistingDDL or /ExistingSchema to define a text file that should be parsed to rename columns, based on data loaded from the /ColumnMap file (see the before/after sketch below)
- Will create a new text file with CREATE TABLE blocks that include the new column names
- Will skip columns that have <skip> in the TargetColumnName column of the /ColumnMap file
- Will skip tables (and views, procedures, etc.) that are defined in the /DataTables file but have <skip> in the TargetTableName column
- Does not rename tables
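
For illustration, applying the column mappings from the /ColumnMap example above would rewrite a CREATE TABLE block in the existing DDL file roughly as follows (hypothetical column types):

```sql
-- Before (column names from the source database)
CREATE TABLE t_analysis_job (
    AJ_jobID int NOT NULL,
    AJ_start timestamp NULL,
    AJ_finish timestamp NULL
);

-- After (column names renamed using the /ColumnMap file)
CREATE TABLE t_analysis_job (
    job int NOT NULL,
    start timestamp NULL,
    finish timestamp NULL
);
```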
Use /ExportAllData or /ExportAllTables to export data from every table in the database
- Will skip tables (and views) that are defined in the /DataTables file but have <skip> in the TargetTableName column
- Ignored if /NoTableData is true
Use /MaxRows to define the maximum number of data rows to export
- Defaults to 1000
- Use 0 to export all rows
- If you use /ExportAllData but do not specify /MaxRows, the program will automatically set /MaxRows to 0
Use /NoTableData or /NoData to prevent any table data from being exported
- This parameter is useful when processing an existing DDL file with /ExistingDDL
By default, the table data load script will update the PostgreSQL system settings to set log_min_duration_statement to -1, which disables the logging of statements that run longer than a given threshold
- If your system has log_min_duration_statement defined, and the insert queries take longer than the threshold, the entire insert query (including all of the data values) will be written to the log file
  - If you are importing a large amount of data, this could lead to very large PostgreSQL log files
- By using /DisableStatementLogging:True (the default), you avoid the possibility of long-running INSERT INTO queries bloating the log files
  - The database user must be a superuser to update the log_min_duration_statement system setting, since that setting affects all current and future connections in all databases on the server
- After importing all of the data, the script will set log_min_duration_statement to 5000 (meaning statements that run longer than 5 seconds will be logged)
  - This can be customized using /MinLogDuration (or MinLogDurationAfterLoad= in a parameter file)
- Use /DisableStatementLogging:False to disable adjusting the value of log_min_duration_statement
Use /MinLogDuration to customize the value that log_min_duration_statement is set to once all of the data has been imported
- The default is /MinLogDuration:5000
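
For reference, changing this setting server-wide typically involves superuser commands like the following; this is only a sketch of the general approach, not the exact statements emitted by the generated load script:

```sql
-- Disable duration-based statement logging for all connections (requires superuser)
ALTER SYSTEM SET log_min_duration_statement = -1;
SELECT pg_reload_conf();

-- ... load the exported table data ...

-- Restore logging of statements that run longer than 5 seconds
ALTER SYSTEM SET log_min_duration_statement = 5000;
SELECT pg_reload_conf();
```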
Use /DeleteExtraRows to create files with commands to delete extra rows in database tables (provided the table has one or more primary keys)
- If /ScriptLoad is enabled (or the parameter file has ScriptLoad=True), the shell script will use these files to delete extra rows prior to importing new data
- If /DataExportOrder was used to specify the export order for table data, the shell script will delete extra rows from tables in the reverse order of that defined in the DataExportOrder file
  - The logic here is that the DataExportOrder file assures that parent table data is loaded before dependent table data (so that foreign keys are valid); deleting rows in the reverse order removes dependent rows before parent rows, avoiding foreign key violations
- If a table has a single-column primary key, the delete queries will be of the form:
DELETE FROM "t_user_status" target
WHERE status BETWEEN 'Active' AND 'Obsolete' AND
NOT EXISTS ( SELECT S.status
FROM "t_user_status" S
WHERE target.status = S.status AND
S.status in (
'Active','Inactive','Obsolete'
)
);
- If a table has a multi-column primary key, the delete queries will be of the form:
```sql
DELETE FROM t_internal_std_composition target
WHERE target.component_id BETWEEN 10 AND 72 AND
      NOT EXISTS ( SELECT S.component_id, S.mix_id
                   FROM t_internal_std_composition S
                   WHERE target.component_id = S.component_id AND
                         target.mix_id = S.mix_id AND
                         (
                           S.component_id = 10 AND S.mix_id = 1 OR
                           S.component_id = 10 AND S.mix_id = 3 OR
                           S.component_id = 11 AND S.mix_id = 1 OR
                           S.component_id = 11 AND S.mix_id = 3 OR
                           S.component_id = 13 AND S.mix_id = 1 OR
                           S.component_id = 13 AND S.mix_id = 3 OR
                           S.component_id = 21 AND S.mix_id = 5 OR
                           S.component_id = 22 AND S.mix_id = 5
                         )
                 );
```
Use /ForceTruncate to specify a table name (or comma separated list of names) for tables that have a primary key but that you would rather truncate instead of using either of the DELETE FROM queries shown above (see the sketch after this list)
- This is useful for large tables where the DELETE FROM queries run very slowly
- Names in this list can match either source table names or target table names
- Ignored if /DeleteExtraRows is false
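
For a table in this list, the generated file simply empties the target table before the new data is loaded; conceptually (illustrative only, not necessarily the exact command emitted):

```sql
-- Truncate instead of deleting extra rows with the large DELETE FROM queries shown above
TRUNCATE TABLE t_internal_std_composition;
```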
Use /DisableTriggers (or /IncludeDisableTriggerCommands) to include trigger disable / enable commands in data export files
- Ignored if PgInsert is true, since session_replication_role is set to "replica" prior to inserting/deleting data
Use /ScriptLoad or /Script to create a bash script file for loading exported table data into a PostgreSQL database

Use /ScriptUser to define the username to use when calling psql in the bash script
- If not defined, the username of the account running the DB Schema Export Tool will be used

Use /ScriptDB to define the target database name to use when calling psql in the bash script
- The default is /ScriptDB:dms

Use /ScriptHost to define the host name to use when calling psql in the bash script
- The default is /ScriptHost:localhost

Use /ScriptPort to define the port number to use when calling psql in the bash script
- The default is /ScriptPort:5432
- The port number will only be included in the bash script if it is not the default
Use /SnakeCase to auto change column names from Upper_Case and UpperCase to lower_case when exporting data from tables
- Also used for table names when exporting data from tables
- Entries in the /DataTables and /ColumnMap files will override auto-generated snake_case names
Use /PgDump or /PgDumpData to specify that exported data should use COPY commands instead of INSERT INTO statements (a COPY example is shown below)
- With SQL Server databases, table data will be exported using pg_dump compatible COPY commands when /PgDump is used
- With PostgreSQL databases, table data will be exported using the pg_dump application
- With both SQL Server and PostgreSQL databases, if /PgDump is not provided, data is exported with INSERT INTO statements
- With SQL Server data, if /PgDump is provided but the /DataTables file has true in the PgInsert column for a table, INSERT INTO statements will be used for that table
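
For comparison with the INSERT INTO example shown earlier, pg_dump style COPY output for the same hypothetical table looks roughly like the following (illustrative only; the values on each data row are separated by tab characters and the block is terminated by a backslash-period line):

```sql
COPY mc.t_mgr_types (mt_type_id, mt_type_name, mt_active) FROM stdin;
1	Analysis	1
2	Archive	0
3	Archive Verify	0
\.
```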
Use /KeepPgDumpFile to not delete the _AllObjects_.sql file created for each database
- By default, this is false, meaning the _AllObjects_.sql file will be deleted (since it is no longer needed once a separate .sql file has been created for each object)
  - If any unhandled scripting commands are encountered, the _AllObjects_.sql file will not be deleted, even if KeepPgDumpFile is false
- In addition, when this is true, the original PgDump table data files are stored in directory Replaced_PgDump_Files
The /PgInsert switch applies when exporting data from SQL Server database tables
- While reading the TableDataToExport.txt file, the PgInsert column defaults to true when /PgInsert is provided, meaning table data will be exported via INSERT INTO statements, as described above
- Ignored for PostgreSQL data
Use /PgInsertChunkSize to specify the number of values to insert at a time when PgInsert is true for a table
- Only applicable when exporting data from SQL Server
Use /ServerInfo to export server settings, logins, and SQL Server Agent jobs

Use /NoSchema to skip exporting schema (tables, views, functions, etc.)

Use /Sync to copy new/changed files from the output directory to an alternative directory
- This is useful for preventing file timestamps from being updated every time the schema is exported

Use /Git to auto-update any new or changed files using Git
Use /Svn to auto-update any new or changed files using Subversion
Use /Hg to auto-update any new or changed files using Mercurial
Use /Commit to commit any updates to the repository

Use /L to log messages to a file
Use /LogFile to specify the base name for the log file
- Log files will be named using the base name, followed by the current date
Use /LogDir or /LogDirectory to specify the directory to save the log file in
- By default, the log file is created in the current working directory

Use /Preview to count the number of database objects that would be exported
Use /Stats to show (but not log) export stats
Use /Trace to show additional debug messages at the console
The processing options can be specified in a parameter file using /ParamFile:Options.conf or /Conf:Options.conf or /P:Options.conf
- Define options using the format ArgumentName=Value
- Lines starting with # or ; will be treated as comments
- Additional arguments on the command line can supplement or override the arguments in the parameter file
Use /CreateParamFile to create an example parameter file
- By default, the example parameter file content is shown at the console
- To create a file named Options.conf, use /CreateParamFile:Options.conf
This program leverages the SQL Server Management Objects (SMO) Framework
- All of the required DLLs should be included in the DBSchemaExportTool.zip file
In order to use the /Git, /Svn, or /Hg switches, you need the following software installed, with the executables present at these locations:

Git: C:\Program Files\Git\bin\git.exe
- Installed with 64-bit Git for Windows, available at https://git-scm.com/download/win

Subversion: C:\Program Files\TortoiseSVN\bin\svn.exe
- Installed with 64-bit Tortoise SVN, available at https://tortoisesvn.net/downloads.html

Mercurial: C:\Program Files\TortoiseHg\hg.exe
- Installed with 64-bit Tortoise Hg, available at https://tortoisehg.bitbucket.io/download/index.html
Schema export from PostgreSQL databases utilizes the pg_dump utility
- pg_dump.exe is available as part of the PostgreSQL for Windows installer from the EDB website
  - Run the installer and de-select installing PostgreSQL Server
- Copy pg_dump.exe plus several DLLs from C:\Program Files\PostgreSQL\16\bin to the directory with the DB_Schema_Export_Tool executable
  - See batch file Copy_pg_dump_files.bat
File pgpass.conf at %APPDATA%\postgresql can be used to store the password for the PostgreSQL user
- On Linux, the file is stored at ~/.pgpass
- Example commands to create the file on Windows (entries use the format server:port:database:username:password):

```
mkdir %APPDATA%\postgresql
echo prismweb3:5432:dms:dmsreader:dmspasswordhere > %APPDATA%\postgresql\pgpass.conf
```
Example call to pg_dump:
```
pg_dump.exe -h prismweb3 -p 5432 -U dmsreader -d dms --schema-only --format=p --file=_AllObjects_.sql
```
Note that on Windows, the .sql file created by pg_dump.exe includes extra carriage returns, resulting in CR CR LF instead of CR LF
- DB_Schema_Export_Tool.exe removes the extra carriage returns while parsing the SQL dump file
Written by Matthew Monroe for the Department of Energy (PNNL, Richland, WA)
E-mail: matthew.monroe@pnnl.gov or proteomics@pnnl.gov
Website: https://github.com/PNNL-Comp-Mass-Spec/ or https://www.pnnl.gov/integrative-omics
Source code: https://github.com/PNNL-Comp-Mass-Spec/DB-Schema-Export-Tool
The DB Schema Export Tool is licensed under the 2-Clause BSD License; you may not use this program except in compliance with the License. You may obtain a copy of the License at https://opensource.org/licenses/BSD-2-Clause
Copyright 2019 Battelle Memorial Institute