Demo converting a DB2 database to Aurora PostgreSQL using DB2 on an EC2 instance with SCT and DMS. An additional option places Kinesis between DMS and Aurora.
- Overview
- AWS Services Used
- Technical Overview
- Instructions
- DB2 Setup
- Windows Steps
- DB2 with DMS
- Create DMS Resources
- Cleaning up
Initially used a CloudFormation template from the Data Migration Immersion Day. The template has been modified to include an IBM DB2 instance alongside the DMS components. This README documents using SCT/DMS to convert the DB2 sample database to Aurora PostgreSQL.
- AWS DMS Database Migration Service
- AWS SCT Schema Conversion Tool
- AWS Cloudformation
- AWS DMS Workshop
- Amazon Kinesis
- Bring up DMS/SCT environment using modified immersion days template
- Review Security Group Settings
- Setup VNC viewer on the Red Hat instance. VNC is needed for the IBM DB2 Setup Wizard
- A DB2 installation response file is available for a silent install to avoid the need for VNC
- Install DB2 and create sample DB2 database
- Use SCT and DMS to convert sample DB2 database to Aurora PostgreSQL
- Additional DMS setup to use kinesis
*IMPORTANT NOTE: Creating this demo application in your AWS account will create and consume AWS resources, which will cost money. Pricing information is available at AWS DMS Pricing. The other resources created by the template will incur charges as well.
ADDITIONAL NOTE: As of January 2021, the latest IBM DB2 version supported by DMS is 11.1 with all Fix Packs. I worked through the entire process using IBM DB2 v11.5 and everything works until the very last step, moving the data with DMS: the task fails to find the DB2 tables and no data is moved. DB2 11.5 is not supported by DMS and there is no ETA for that support. Free trial downloads of earlier DB2 versions are not available. I successfully used DB2 11.1 Fix Pack 5 for this exercise. I am unclear on the usage limitations of this downloadable content, so use it at your own discretion.
- Start with AWS Account instructions
- After reviewing "Introduction" and "Getting Started", follow the Regular AWS Account instructions.
- Complete the "Login to the AWS Console" and "Create an EC2 Key Pair" steps
- In the "Configure the Environment" step, use the provided ./templates/DB2Workshop.yaml file instead of the link. Choose SQL Server as the source database
Additional ports need to be opened to allow VNC connectivity to the Red Hat 8 instance for the DB2 install. Using additional DMS agents can also require more open ports.
- Find the security group. Two security groups are created by the template. Click on the InstanceSecurityGroup (not the DMSSecurityGroup)
- Tighten security on the RDP rule. Currently it is open to the public
- two choices: the ports can be opened to an Amazon IP while on the Amazon VPN, but VNC will not work while on the Amazon VPN, and VNC is needed for the DB2 install
- it is best to log out of the Amazon VPN and use the non-Amazon IP address
- the Amazon IP address can be found using the Amazon checkip URL
- the non-Amazon IP address can be found using a "get my ip" search
- a separate option is to use the TightVNC client on the Windows VM (VNC Viewer is flaky on Windows)
- Click "Edit Inbound Rules"
- on the RDP inbound rule, remove "0.0.0.0/0" and enter the address obtained from checkip with a /32, e.g. "1.2.3.4/32"
- open all internal communication on the private network. An easy way is to change the inbound rule with Access Type of Oracle-RDS to All TCP.
- Click Add Rule and add a new rule for VNC access from the address found in checkip. Use port range of 5900-6000
- Click Add Rule and add a new rule for SSH access from the address found in checkip. Use inbound rule type "SSH"
- Click "save rules"
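The console steps above can also be expressed as AWS CLI calls. The sketch below is a dry run that only prints the commands; `GROUP_ID` and `MY_IP` are placeholders to replace with your InstanceSecurityGroup id and the address returned by checkip.

```shell
# Dry run: print (do not execute) the aws CLI equivalents of the rules above.
# GROUP_ID and MY_IP are placeholders -- substitute your own values.
GROUP_ID="sg-0123456789abcdef0"
MY_IP="1.2.3.4"
CIDR="${MY_IP}/32"
# RDP (3389), VNC (5900-6000), and SSH (22), each restricted to your address
CMDS=$(for PORTS in 3389 5900-6000 22; do
  echo "aws ec2 authorize-security-group-ingress --group-id ${GROUP_ID} --protocol tcp --port ${PORTS} --cidr ${CIDR}"
done)
echo "$CMDS"
```

Copy the printed commands into a shell with the AWS CLI configured once the placeholders are filled in.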
VNC is needed to do the IBM install on the Red Hat instance unless the silent install with the response file is chosen. If using the silent install, this section on VNC setup can be skipped. I have provided a response file in this GitHub repository at db2 install response file, which could eliminate the need for VNC. Additionally, there is a db2_install script which may work without VNC. I have not verified either of these approaches, so these instructions assume VNC setup.
To login from client to redhat instance
ssh -i keypairFile ec2-user@redhatIPaddress
- Install needed packages on the Red Hat instance. NOTE: git is not needed for VNC but is needed later
sudo yum groupinstall 'Server with GUI'
sudo yum install pixman pixman-devel
sudo yum -y install tigervnc-server git
- change sshd to allow password login with ssh
sudo vi /etc/ssh/sshd_config
Set the ChallengeResponseAuthentication yes
Set the PasswordAuthentication yes
# save the file
# restart sshd
sudo systemctl restart sshd.service
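The same sshd_config change can be made non-interactively with sed. The snippet below runs against a scratch copy so it is safe to try anywhere; point `CONF` at /etc/ssh/sshd_config (as root) to apply it for real.

```shell
# Work on a scratch copy first; set CONF=/etc/ssh/sshd_config (as root) to apply.
CONF=$(mktemp)
printf 'ChallengeResponseAuthentication no\nPasswordAuthentication no\n' > "$CONF"
# flip both settings to yes, matching the vi edits above
sed -i -e 's/^ChallengeResponseAuthentication.*/ChallengeResponseAuthentication yes/' \
       -e 's/^PasswordAuthentication.*/PasswordAuthentication yes/' "$CONF"
grep Authentication "$CONF"
# remember to restart afterwards: sudo systemctl restart sshd.service
```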
- disable firewalld
sudo systemctl stop firewalld
sudo systemctl disable firewalld
- define port 1 for ec2-user
sudo bash
echo ":1=ec2-user" >> /etc/tigervnc/vncserver.users
- set up VNC password as ec2-user
vncpasswd
Password:
Verify:
Would you like to enter a view-only password (y/n)? n
- set up an OS password for ec2-user; it is probably easier to choose the same password as the VNC password (be logged in as ec2-user)
sudo passwd ec2-user
- add a vnc config file as ec2-user
echo "session=gnome" >> ~/.vnc/config
echo "alwaysshared" >> ~/.vnc/config
- ensure these two lines are in /etc/tigervnc/vncserver-config-mandatory
session=gnome
alwaysshared
- setup mandatory vncserver
- edit /etc/gdm/custom.conf and, under the [daemon] section, set WaylandEnable=false and DefaultSession=gnome-xorg.desktop
- start and enable tigervnc
sudo systemctl start vncserver@:1
sudo systemctl enable vncserver@:1
sudo systemctl set-default graphical
- follow the disable SELinux URL
- Check the vnc process to ensure it is fully running
sudo systemctl status vncserver@:1
● vncserver@:1.service - Remote desktop service (VNC)
Loaded: loaded (/usr/lib/systemd/system/vncserver@.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2021-01-05 02:37:52 UTC; 26s ago
Process: 2599 ExecStart=/usr/libexec/vncsession-start :1 (code=exited, status=0/SUCCESS)
Main PID: 2605 (vncsession)
Tasks: 1 (limit: 98872)
Memory: 1.8M
CGroup: /system.slice/system-vncserver.slice/vncserver@:1.service
‣ 2605 /usr/sbin/vncsession ec2-user :1
Jan 05 02:37:52 xxxxx.ec2.internal systemd[1]: Starting Remote desktop service (VNC)...
Jan 05 02:37:52 xxxxx.ec2.internal systemd[1]: Started Remote desktop service (VNC).
- Red Hat 8 PDF on GNOME and remote access
- the screen saver lock screen says "Cloud User"; this is actually ec2-user, so enter the OS password for ec2-user
- TightVNC on Windows is the easiest connection (VNC Viewer on Windows did not work)
- VNC Viewer on Mac seems to work, but not with the Amazon VPN, so it becomes inconvenient, especially with the Mac port used in the security group
- Install the DB2 prerequisite packages (32-bit libraries and build tools)
sudo bash
yum install -y libstdc++.i686
yum install -y pam.i686
yum install -y gcc-c++ cpp gcc kernel-devel make patch
yum install -y libaio
DB2 needs VNC to do the install with the Setup Wizard. There is also a silent option using a response file that does not need VNC. In the GitHub home the response file is db2response
- NOTE: At the time of writing, DB2 11.1 Fix Pack 5 is the most recent version supported by AWS DMS
- Using the VNC connection to the IBM server, go to download page for DB2 software DB2 software
- an IBMID will be needed for this (it is free)
- Scroll down to "DB2 Fix Packs" button and click on the version
- Scroll down and click the download link for
- Scroll down on this page to find the
- Download the file. The file name is: v11.1.4fp5_linuxx64_server_t.tar.gz
- create a holding directory for the DB2 software (not necessary, but it makes the later documentation easier)
sudo bash
cd /home
mkdir software
chmod 777 software
exit
- Move the downloaded DB2 installation file to /home/software and decompress it using tar -xvzf
- rename the directory so what was the server_t directory becomes ibm-db2
sudo bash
cd /home/software
tar -xvzf "tar file name"
mv server_t ibm-db2
There are two ways to do this; both use db2setup. If VNC is set up, use the Setup Wizard by starting db2setup without any parameters. If VNC is not available, do a silent install using the provided db2 response file in the GitHub repository
- Do a root based install using ./db2setup
- If VNC is available
sudo bash
cd /home/software/ibm-db2/
./db2setup
- if VNC is not available
- copy the db2server.rsp file to the Red Hat server from the GitHub repository
scp -i "path to ssh key file" db2server.rsp ec2-user@"ip address for node":/home/software/ibm-db2
- run db2setup with the response file
sudo bash
cd /home/software/ibm-db2/
./db2setup -r db2server.rsp
# must set a password for db2inst1 user
passwd db2inst1
- db2 validation
sudo bash
su - db2inst1
db2val
Obviously, skip this section if everything is fine; these steps are only for problems. If all is going well, skip forward to DB2 Sample Database
- this IBM link IBM troubleshoot is helpful. Oddly, it says to ignore the 32-bit libraries.
- here are some possibly overly complex install instructions: complex install instructions
- a DB2 install is hard to fix after a mistake; it is easier to re-install
- To re-install DB2, make sure everything is deleted. These delete steps help but are not complete.
sudo bash
su - db2inst1
# not all these steps may be needed but without drop can fail with sql1035N
db2 connect to sample
db2 unquiesce db
db2 connect reset
db2 drop database sample
exit
# now back as root
# get DB2 scripts in the profile
. /home/db2inst1/sqllib/db2profile
cd /opt/ibm/db2/V11.1
db2stop force
db2 terminate
/opt/ibm/db2/V11.1/instance/db2idrop db2inst1
/opt/ibm/db2/V11.1/install/db2_deinstall -a
# if you don't delete this user, db2 will create db2inst2
# if you don't remove the home directory, the files are left owned by previous UID causing problems
# this will fail if any processes are running like vnc for db2inst1
# to make sure, stop the db2inst1 vncserver if it is running
systemctl stop vncserver@:2
userdel db2inst1
userdel db2fenc1
# check for any other db2 users and delete if found
grep db2 /etc/passwd
rm -rf /home/db2inst1
rm -rf /home/db2fenc1
rm -rf /opt/ibm/db2/V11.1
- create sample database. For more information look at documentation sample database db2sampl
sudo bash
su - db2inst1
db2sampl
- get familiar with DB2 and the sample database using this document. The install directions are dated but the rest is very good. Be careful which steps you run, as some of these commands are "impactful"
This is a very optional step, but it is handy for using VNC-type tools as db2inst1. Only do this if VNC was configured above
- define port 2 for db2inst1
sudo bash
echo ":2=db2inst1" >> /etc/tigervnc/vncserver.users
- set up VNC password as db2inst1 (be logged in as db2inst1)
vncpasswd
Password:
Verify:
Would you like to enter a view-only password (y/n)? n
- start and enable tigervnc second session
sudo systemctl start vncserver@:2
sudo systemctl enable vncserver@:2
sudo systemctl set-default graphical
- download free trial download db2 trials
- click on download for free
- select "Microsoft Windows(x64) Download"
- expand the software zip file
- expand in software tree to find "setup.exe" and double click
- find "Install Data Server Client". follow prompts to install
- On Windows machine, download IBM Data Studio client software Download Data Studio
- Click on download link for Windows
- After download completes, expand the zip file
- Within the ibm_ds4130_win folder, Double-click the launchpad.exe and follow the prompts to install
Git is needed to pull this repository onto the windows machine. The repository has some needed scripts.
- Go to the git download
- Download the latest git release for windows
- Double-click on the git exe file and follow the prompts to install
- clone the git project using a Windows DOS command line
cd "Desktop\DMS Workshop"
git clone https://github.com/jphaugla/awsDB2ToAuroraPostgres.git
Return to the DMS and SCT steps using the SQL Server to Amazon Aurora PostgreSQL guide
- Start back at this point in the guide
- Perform the following Part 1 Schema Conversion Steps: "Connect to the EC2 Instance", "Install the AWS Schema Conversion Tool (AWS SCT)"
- Restart the Windows Server (this seems to be important after the SCT install)
- Create a New Project using New Project Wizard
- Connect to DB2
- if prompted for the DB2 LUW driver, point at the db2jcc4.jar file (usually there are several of them after installing the drivers and the client drivers)
- Accept the risks
- Click on the "DB2INST1" instance on the left panel and click "Next" to generate the assessment report
- Click Next and enter parameters for Aurora PostgreSQL connection
- To find the password, look back in the original templates/DMSWorkshop.yaml in the repository home
- Click "Finish"
- Right click on the "DB2INST1" database in the left panel and select Convert Schema to generate the data definition language (DDL) statements for the target database.
- Right click on the db2inst1 schema in the right-hand panel, and click Apply to database. click Yes
- The tables can now be seen in the target database. Use the pgAdmin tool to verify the new schema exists but is not populated.
- if you can't connect through the ports
- Disable Windows Defender Disable Defender
- Restart the windows machine (yep, it is not 1995 but restarting a windows machine never hurts!)
Several links for background on changes needed for DB2 setup with DMS
- Link for using IBM DB2 as a source for DMS DB2 DMS
- Link for changing logarchmeth1 on db2 11.5 Turn on Log
- Link to change Configuration Parameters. Configuration Parameter Change
- NOTE: LOGARCHMETH1 and LOGARCHMETH2 are for newer versions of DB2. Older versions use LOGRETAIN
Choose the most comfortable method for making the configuration parameter change. The following steps use a command-line unix connection
- Turn on replication from a unix connection to the Red Hat DB2 instance
su - db2inst1
db2 connect to sample
# turn replication on
db2 update db cfg for sample using logarchmeth1 logretain
# the database must be deactivated and reactivated. This is the easiest method, but less aggressive methods may also work
db2stop
# if this does not work and this message appears "SQL1025N The database manager was not stopped because databases are still active."
db2stop force
# restart
db2start
# backup must be taken once logretain is enabled or following error will be returned "SQL1116N A connection to or activation of the database cannot be made because of BACKUP PENDING."
db2 backup database sample
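To confirm the setting took effect, run `db2 get db cfg for sample` on the DB2 host as db2inst1 and look for the LOGARCHMETH1 line. Since that requires a live DB2 session, the snippet below only greps a sample output line; the exact label wording is illustrative and may differ slightly by DB2 version.

```shell
# Sample line from `db2 get db cfg for sample` output after the change;
# the label text is illustrative, the (LOGARCHMETH1) = LOGRETAIN part is what matters.
CFG_LINE=' First log archive method                 (LOGARCHMETH1) = LOGRETAIN'
echo "$CFG_LINE" | grep -q 'LOGRETAIN' && echo "log retention enabled"
```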
Open the pgAdmin tool by clicking on the elephant icon at the bottom of the Windows server. This is very similar to the immersion day Configure the Target DataBase step (Configure Target), but using the drop_constraints script from the git repo
- Using the tree structure on the left, find the AuroraDB database and then the db2inst1 schema and click on the schema
- Use Tools->Query Tool to open the query tool
- click the "open folder" icon at the far left of the pgAdmin query tool icons and open the drop_constraints script
- the script is at C:\Users\Administrator\Desktop\DMS Workshop\awsDB2ToAuroraPostgres\scripts\drop_constraints.sql
These resources are automatically created if the flag is set in the script but the task must be started manually
There are some choices here. In addition, a separate Migration Task can be added using the same DB2 source tables but with a target Kinesis endpoint. In this scenario there are no mapping rules, but a separate Kinesis stream is needed for each table. The streams are also defined in the CloudFormation template.
- Follow steps from immersion training
If the parameter "CreateDMSComponents" in the initial CloudFormation template was set to true, these components will already be created and only need validation. Test the source DB2 endpoint, and verify the Aurora and Kinesis endpoint tests were successful.
- Follow these Steps
- Use these parameters for the source
- The specific PostgreSQL endpoints, Kinesis endpoints, and TargetKinesis roles are output by the CloudFormation stack
Similar to this: create DMS Migration Task. If the parameter "CreateDMSComponents" in the initial CloudFormation template was set to true, the Task will already be created; just start it.
- add a selection rule where schema name is like "DB2INS%"
- DB2 uses upper case schema, table, and column names so these must all be converted in mapping rules
- add 3 separate mapping rules for columns, tables and schema
- using the wizard, you need the following (alternatively, the JSON below can be pasted in directly)
where schema name is like '%' and table name is like '%' and column name is like '%', convert-lowercase
where schema name is like '%' and table name is like '%', convert-lowercase
where schema name is like '%' convert-lowercase
- JSON to use directly
{
  "rules": [
    {
      "rule-type": "transformation",
      "rule-id": "1",
      "rule-name": "1",
      "rule-target": "column",
      "object-locator": { "schema-name": "%", "table-name": "%", "column-name": "%" },
      "rule-action": "convert-lowercase",
      "value": null,
      "old-value": null
    },
    {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "table",
      "object-locator": { "schema-name": "%", "table-name": "%" },
      "rule-action": "convert-lowercase",
      "value": null,
      "old-value": null
    },
    {
      "rule-type": "transformation",
      "rule-id": "3",
      "rule-name": "3",
      "rule-target": "schema",
      "object-locator": { "schema-name": "%" },
      "rule-action": "convert-lowercase",
      "value": null,
      "old-value": null
    },
    {
      "rule-type": "selection",
      "rule-id": "4",
      "rule-name": "4",
      "object-locator": { "schema-name": "DB2INST%", "table-name": "%" },
      "rule-action": "include",
      "filters": []
    }
  ]
}
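A paste error in the mapping JSON is easy to make. One way to guard against it is to keep the mapping in a file and confirm it parses before loading it into the DMS console; the file path and the python3 check below are just one option.

```shell
# Write the lowercase-conversion mapping to a file and confirm it parses.
cat > /tmp/db2-mapping.json <<'EOF'
{"rules":[{"rule-type":"transformation","rule-id":"1","rule-name":"1","rule-target":"column","object-locator":{"schema-name":"%","table-name":"%","column-name":"%"},"rule-action":"convert-lowercase","value":null,"old-value":null},{"rule-type":"transformation","rule-id":"2","rule-name":"2","rule-target":"table","object-locator":{"schema-name":"%","table-name":"%"},"rule-action":"convert-lowercase","value":null,"old-value":null},{"rule-type":"transformation","rule-id":"3","rule-name":"3","rule-target":"schema","object-locator":{"schema-name":"%"},"rule-action":"convert-lowercase","value":null,"old-value":null},{"rule-type":"selection","rule-id":"4","rule-name":"4","object-locator":{"schema-name":"DB2INST%","table-name":"%"},"rule-action":"include","filters":[]}]}
EOF
python3 -m json.tool /tmp/db2-mapping.json >/dev/null && echo "mapping JSON parses"
```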
- Upgrading to the current version of DMS
- This is the error if 11.5 is used
2021-01-08T01:36:14 [TASK_MANAGER ]E: No tables were found at task initialization. Either the selected table(s) no longer exist or no match was found for the table selection pattern(s). [1021707] (replicationtask.c:2107)
- Add a Kinesis stream with a shard for each table planned to be moved to Kinesis
- Add a kinesis endpoint with these settings:
- MessageFormat: json
- ServiceAccessRoleArn "Kinesis Role ARN" see this link for role requirements
- StreamArn: use kinesis stream ARN from above
- Could not get these to work in the CloudFormation template, so add these two parameters manually
- IncludePartitionValue: true
- PartitionIncludeSchemaTable: true
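With MessageFormat json, each record on the stream is a JSON document carrying the row change plus metadata identifying its source table. The record below is illustrative only (the exact field set varies by DMS version; field names follow the DMS Kinesis target documentation), and the grep shows one quick way to see which table a record belongs to.

```shell
# Illustrative record shape only -- the real field set varies by DMS version.
RECORD='{"data":{"EMPNO":"000010"},"metadata":{"record-type":"data","operation":"insert","schema-name":"DB2INST1","table-name":"EMPLOYEE"}}'
# pull out which table the change belongs to
TABLE=$(echo "$RECORD" | grep -o '"table-name":"[^"]*"')
echo "$TABLE"
```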
- Create DMS Migration Task
- SourceEndpoint is DMSDB2Endpoint
- TargetEndpoint is DMSKinesisEndpoint
- JSON mapping string
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": { "schema-name": "DB2INST%", "table-name": "EMPLOYEE" },
      "rule-action": "include",
      "filters": []
    },
    {
      "rule-type": "selection",
      "rule-id": "2",
      "rule-name": "2",
      "object-locator": { "schema-name": "DB2INST%", "table-name": "DEPARTMENT" },
      "rule-action": "include"
    }
  ]
}
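Since the Kinesis scenario needs one selection rule (and one stream) per table, the mapping can be generated with a loop instead of hand-edited; then adding a table is a one-word change. This is a sketch: the table list below is just the two sample-database tables used above.

```shell
# Build one DMS selection rule per table; extend TABLES (and create the
# matching Kinesis stream) to move more tables.
TABLES="EMPLOYEE DEPARTMENT"
ID=0
RULES=""
for T in $TABLES; do
  ID=$((ID+1))
  if [ -n "$RULES" ]; then RULES="${RULES},"; fi
  RULES="${RULES}{\"rule-type\":\"selection\",\"rule-id\":\"${ID}\",\"rule-name\":\"${ID}\",\"object-locator\":{\"schema-name\":\"DB2INST%\",\"table-name\":\"${T}\"},\"rule-action\":\"include\",\"filters\":[]}"
done
MAPPING="{\"rules\":[${RULES}]}"
echo "$MAPPING"
```

Paste the printed JSON into the task's table mapping.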
Remove all files from S3 to avoid accruing unnecessary charges