====== DB2 High Availability Data Replication ======
===== Single standby setup =====
==== Primary node ====
**1. Identify or create database for HADR**
db2 create database ...
**2. Configure DB for log archiving**
db2 update db cfg for using LOGARCHMETH1 LOGRETAIN
Above command will enable to database for log archiving and keep the logs in the same active log directory. This will also place the db in backup pending state.
**3. Configure HADR DB parameters on Primary database**
db2 update db cfg for using HADR_LOCAL_HOST
db2 update db cfg for using HADR_LOCAL_SVC
db2 update db cfg for using HADR_REMOTE_HOST
db2 update db cfg for using HADR_REMOTE_SVC
db2 update db cfg for using HADR_REMOTE_INST
db2 update db cfg for using LOGINDEXBUILD ON
db2 update db cfg for using HADR_SYNCMODE SYNC
**4. Take an offline backup to be used for HADR configuration**
db2 "backup database [to '/path/to/backup'|use TSM]"
**5. Transfer the backup image to the Standby node or grant access to Standby TSM node**
==== Standby node (single standby node) ====
Ensure both the servers are on the same db2level so that a mismatch situation does not occur. Run "db2level" command on both the servers to check whether they are on the same DB2 Version and Fix Pack.
**6. Restore the database from backup image (disk or TSM)**
db2 "restore database DBNAME [from '/path/to/backup'|use TSM] taken at "
**7. Configure HADR DB parameters on Standby DB**
db2 update db cfg for using HADR_LOCAL_HOST
db2 update db cfg for using HADR_LOCAL_SVC
db2 update db cfg for using HADR_REMOTE_HOST
db2 update db cfg for using HADR_REMOTE_SVC
db2 update db cfg for using HADR_REMOTE_INST
**8. Start HADR on Standby DB**
db2 start hadr on database as standby
==== Primary node ====
**9. Start HADR on Primary DB**
db2 start hadr on database as primary # Standby node must be already started
**10. Check HADR is up and running**
db2pd -db -hadr
Hostname of the HADR pair cannot be the same on both the servers.
The instance name and the underlying userid on UNIX systems can be different. Make sure to update the correct name of the instance for the db cfg parameter HADR_REMOTE_INST to the correct value.
===== Multinode Standby setup =====
===== Start, Stop, Takeover =====
==== Start ====
db2 start hadr on database as primary by force # Doesn't wait for Standby
db2 start hadr on database as primary # Wait's for Standby
db2 start hadr on database as standby
=== Stop ===
db2 stop hadr on database
=== Takeover ===
db2 takeover hadr on database # graceful takeover
db2 takeover hadr on database by force # forceful after disaster
===== Automatic Client Reroute =====
**Update alternate server location :**
db2 update alternate server for db using hostname port
**JDBC URL for Automatic client reroute :**
url="jdbc:db2://t_db2test01:60000/DBATOOLS:clientRerouteAlternateServerName=t_db2test02;clientRerouteAlternatePortNumber=60000;"
**DB2CLI configuration :**
[HADRDB2]
uid=dbatools
pwd=
database=DBATOOLS
protocol=tcpip
hostname=t_db2test01
servicename=60000
Althostname=t_db2test02
AltPort=60000
===== Monitoring =====
**Monitor HADR :**
db2pd -d -hadr
#or
db2top -d #then A
**HADR Congestion monitoring :**
db2fodc -hadr -db -detect
# options
db2fodc -hadr -db -detect iteration=1 sleeptime=0 triggercount=10 interval=30
# disable the monitoring
db2fodc -detect off
**Sustained trap :**
db2pdcfg -trapresilience
=== HADR Synchronization modes : ===
**SYNC (synchronous)**
This mode provides the greatest protection against transaction loss. However using it may result in the longest transaction response time among the four modes.
In this mode, log writes are considered successful only when logs have been written to log files on the primary database and when the primary database has received acknowledgement from the standby database that the logs have also been written to log files on the standby database. The log data is guaranteed to be stored at both sites.
**NEARSYNC (near synchronous)**
While this mode has a shorter transaction response time than synchronous mode, it also provides slightly less protection against transaction loss.
In this mode, log writes are considered successful only when the log records have been written to the log files on the primary database and when the primary database has received acknowledgement from the standby system that the logs have also been written to main memory on the standby system. Loss of data occurs only if both sites fail simultaneously and if the target site has not transferred to nonvolatile storage all of the log data that it has received.
**ASYNC (asynchronous)**
Compared with the SYNC and NEARSYNC modes, the ASYNC mode results in shorter transaction response times but may cause greater transaction losses if the primary database fails
In ASYNC mode, log writes are considered successful only when the log records have been written to the log files on the primary database and have been delivered to the TCP layer of the primary system's host machine. Because the primary system does not wait for an acknowledgement from the standby system, transactions may be considered committed when they are still on their way to the standby database.
**SUPERASYNC (super asynchronous)**
This mode has the shortest transaction response time but also has the highest probability of transaction losses if the primary system fails. This mode is useful when one does not want transactions to be blocked or experience elongated response times due to network interruptions or congestion.
In this mode, the HADR pair will never be in peer state or disconnected peer state. The log writes are considered successful as soon as the log records have been written to the log files on the primary database. Because the primary database does not wait for acknowledgement from the standby database, transactions are considered committed irrespective of the state of the replication of that transaction.
Since the transaction commit operations on the primary database are not affected by the relative slowness of the HADR network or the standby HADR server, the log gap between the primary database and the standby database may continue to increase. It is important to monitor the log gap as it is an indirect measure of the potential number of transactions that could be lost should a true disaster occur on the primary system.
In disaster recovery scenarios, any transactions committed during the log gap would not be available to the standby database. Therefore, monitor the log gap by using the hadr_log_gap monitor element. If it occurs that the log gap is not acceptable, investigate the network interruptions or the relative speed of the standby database node and take corrective measures to reduce the log gap.