db2_troubleshooting

Troubleshooting

Create $profile :

New-item –type file –force $profile

Setup DB2 profile :

notepad $profile

Set-ExecutionPolicy remoteSigned

set-item -path env:DB2CLP -value "**$$**"
Set-Alias vi notepad
Set-Alias ll dir

Take trace :

db2trc on -l 512m -t

Format trace dump :

db2trc perfrep trc.dmp trc.perf -g -sort timeelapsed -order desc

Extract / Update current configuration policy XML to ~/sqllib/tmp :

* MAINTENANCE_WINDOW
* AUTO_BACKUP
* AUTO_REORG
* AUTO_RUNSTATS

db2 "call AUTOMAINT_GET_POLICYFILE('MAINTENANCE_WINDOW','MaintenanceWindow_current.xml')"

db2 "call AUTOMAINT_SET_POLICYFILE('MAINTENANCE_WINDOW','MaintenanceWindow_NEW.xml')"

db2aic - Asynchronous Index cleaner db2taskd - Task distribution daemon

Workaround for slow rollback - disable logging in flight recorder (> 10.5 FP6):

db2pd -db <DB> -dmpevrec comp=SQLRA name=sqlraMED mask=0
db2pd -db <DB> -dmpevrec comp=SQLRA name=sqlraLOW mask=0

Automatic forcing of idle applications :

db2 "create threshold <NAME> for database activities enforcement database enable
when connectionidletime > <TIME> collect activity data none stop execution"

Terminate FMP (hung sub-agent) :

db2pd -fmp (search for active threads)
FMP Process:
Address FmpPid Bit Flags ActiveThrd PooledThrd ForcedThrd Active IPCList
0x07800000077BD000 38732016 64 0x00000002 2 1 0 Yes 0x0780000006FD1600 
Active Threads:
Address FmpPid EduPid ThreadId
0x0780000006D3FDC0 38732016 13369 1029 
0x0780000006FD0A80 38732016 20027 1286 

db2fmpterm 38732016 

find package name for bind file :

ddcspkgn db2_adminotm.bnd 

Check CPU utilization of DB2 processes :

ps -ef | grep db2sysc
top -Hp <DB2SYSCPID>
ps -Lp <DB2SYSCPID> -o pid,ppid,pcpu,pmem,lwp
db2pd -edus

db2 get snapshot for applications on <DBNAME> | grep -ie "application handle" -e "coordinator agent"

OS Memory :

svmon | awk '/memory/ { fm=$4; ms=$2; } /pg space/ {print "percent available: " ((fm+$3-$4)/(ms+$3)*100) "%"}'

Memory set usage :

db2 "select substr(HOST_NAME,1,25) as HOST_NAME, substr(DB_NAME,1,10) as DBNAME, MEMORY_SET_TYPE, MEMORY_SET_ID, MEMORY_SET_SIZE, MEMORY_SET_COMMITTED, MEMORY_SET_USED, MEMORY_SET_USED_HWM from table(MON_GET_MEMORY_SET(null,null,-2)) as t with ur"

Memory pools usage :

db2 "select substr(HOST_NAME,1,25) as HOST_NAME, substr(DB_NAME,1,10) as DBNAME, MEMORY_SET_TYPE, MEMORY_POOL_TYPE, MEMORY_POOL_ID, APPLICATION_HANDLE, EDU_ID, MEMORY_POOL_USED, MEMORY_POOL_USED_HWM from table(MON_GET_MEMORY_POOL(null,null,-2)) as t with ur"

Locks (LOCKLIST/MAXLOCKS) :

db2 "select name, substr(value,1,16) value, value_flags, substr(deferred_value,1,16) deferred_value, deferred_value_flags, substr(datatype,1,16) datatype from SYSIBMADM.DBCFG where name='locklist' with ur"

How to Interpret DB2 Hex codes in db2diag.log :

    Verify the calculator is set to View->Scientific
    Select Hex
    Enter 8 F's
    Enter '-'
    Enter 'FFFFE60A'
    Enter '=' will result in 19F5
    Enter '+'
    Enter '1'
    Enter '=' will result in 19F6
    Select Dec will result in 6646

Cleaning all IPC semaphores :

ipcs | grep db2inst1 | awk '{print " ipcrm -"$1" "$2 }' > ipc_remove.ksh
chmod u+x ipc_remove.ksh;./ipc-remove.ksh

Collecting support data :

db2support . -d <database name> -c -cl 1 -f -sf MYSQL.sql

Taking (wrapped) trace of statement execution :

db2 -x "values application_id()"
<*LOCAL.smkivell.141129222232>
db2trc on -l 1G -appid <*LOCAL.smkivell.141129222232>
db2 "<STATEMENT>"
db2trc dump trace.dmp
db2trc off
db2trc flw trace.dmp trace.flw
db2trc fmt trace.dmp trace.fmt

Taking full dump to file for current statement :

db2 -x "values application_id()"
<*LOCAL.smkivell.141129222232>
db2trc on -f trace.dmp -appid <*LOCAL.smkivell.141129222232>
db2 "<STATEMENT>"
db2trc off
db2trc flw trace.dmp trace.flw
db2trc fmt trace.dmp trace.fmt

Catching SQL error in db2diag.log :

db2pdcfg -catch <-802>
db2 "<STATEMENT>
db2pdcfg -catch clear

Callout script for SQL error :

  • Create script db2cos and move it into ~/sqllib

#!/bin/sh

# DB2 Callout script to collect information for SQL1224

echo "Collecting information for SQL1224">>db2cos.rpt
echo "db2cos script executed at `date`" >>db2cos.rpt
db2pd -agents >>db2cos.rpt

  • Start catching of specified error

db2pd -catch clear all
db2pd -catch -1224

Collect table's PD (Packed Descriptor) area :

DB2SVCPW=FXMPTRMT; export DB2SVCPW
DB2CAT=SERVICE;export DB2CAT
db2cat -d <DB> -n <TABLE> -s <SCHEMA> -f current_<TABLENAME>.pd -o db2cat_c_<TABLENAME>.out

Flush bufferpoools to disk :

db2pdcfg -flushbp

Configure TCP Keep Alive :

Let's say for example you do 'db2set DB2TCP_SERVER_KEEPALIVE_TIMEOUT=12000'.

TCP_KEEPIDLE will be set to '12000 / 3 = 4000' (in seconds).
This means that we will start sending 'keepalive' probes after the
connection has been idle for 4000 seconds.

TCP_KEEPINTVL will be set to '(12000 - 4000) / 10 = 800' meaning that
a 'keepalive' probe will be sent every 800 seconds.

TCP_KEEPCNT will be set to 10 meaning that a connection will be dropped
after 10 unresponsive 'keepalive' probes.
Analyze DB structure with db2dart - DB MUST BE DEACTIVATED ! :
db2dart <DATABASE> /T /TSI <TBSP_ID> /OI <OBJECTID> /RPT . /RPTN <OUTPUTNAME>

Trace specific query :

QUERYFILE=$1

# first connect to the db
db2 connect to DBDWHR
db2 "set current degree = '4' "

# figure out what application handle I am
MON_OUTPUT=$(db2 "values(mon_get_application_handle)")
echo $MON_OUTPUT
APPHDL=$(echo $MON_OUTPUT | awk '{print $3}')
echo "My application handle is $APPHDL"

# turn on trace for ri, dms and ixm
echo
TRCCMD="db2trc on -f trace.dmp -t -m \"SQLRI, SQLD, SQLI\" -apphdl $APPHDL"
echo "Turning on trace:"
echo "$TRCCMD"
echo
eval $TRCCMD

echo
echo "Executing query now:"
db2 -tvf $QUERYFILE
echo

echo "turning trace off"
db2trc off

echo
echo "formatting trace flw"
db2trc flw -rds -t trace.dmp trace.flw
echo
echo "formatting trace fmt"
db2trc fmt trace.dmp trace.fmt

echo
echo "please grab trace.dmp, trace.flw, and trace.fmt"
echo
db2 terminate

Sort heap monitoring :

The easiest way to monitor for this condition is by using the
snap_get_agent_memory_pool routine as follows :
db2 "SELECT AGENT_ID, AGENT_PID, DBPARTITIONNUM, POOL_ID,
POOL_CUR_SIZE FROM
TABLE(SNAP_GET_AGENT_MEMORY_POOL('SAMPLE',-2)) A WHERE
POOL_ID='SORT' AND POOL_CUR_SIZE > 20000". Replacing 'SAMPLE'
with the database name and '20000' with a value larger than
SORTHEAP, eg. 2 * SORTHEAP.  When using STMM, the value may have
to be larger - monitor the db2diag.log for normal SORTHEAP
settings and use a value above that.

Show me users which source DB2 profiles :

for user in `awk -F: '{print $1}' /etc/passwd`; do grep -i db2 ~${user}/.profile 2>/dev/null && echo $user ; done

Checking RID/RID_BIT of a specific rows + delete duplicates with it :

db2 "select int(rid()/65536) as page,mod(rid(),65536) as slot,key,data from ridtest" 

db2 "delete from ridtest where rid() = (0*65536+4)"
db2 "select int(rid()/65536) as page,mod(rid(),65536) as slot,key,data from ridtest"

DB2 Global registry is in /var/db2 (or show it with db2greg -g)

Dump Global registry :

db2greg -dump
db2greg -dump -v

Add new instance into DB2 Global registry :

db2greg -addinstrec service=DB2,instancename=db2inst2

Change instance path :

db2greg -addinstrec service=DB2,instancename=db2inst2

Index corruption :

db2 "select TBSPACEID, IID, INDEX_OBJECTID from syscat.indexes where tabname='<TABNAME>'"

db2dart <DBNAME> /DI /TSI <TBSPACEID> /OI <INDEX_OBJECTID> /PS 0 /NP 1 /V Y

Table corruption - SQL1477N (nonrecoverable load) :

db2dart <DBNAME> /DDEL 
 -> insert <TABLEID>,<TBSPACEID>,<FIRSTPAGE>,<NUMBER OF PAGES>

Inspect table :

db2 "inspect check table name TEST results keep inspect.txt"

db2inspf inspect.txt inspect_TEST.txt

Datapage corruption : type:0 corruption → IBM Support type:1 corruption → db2dart : type:>128 → index corruption:

Inspect DB for corruption with db2dart :

For the entire database: db2dart <dbname> /d
For the tablespace: db2dart <dbname> /ts /tsi <tablespaceid>
For the table: db2dart <dbname> /t /oi <objectid> /tsi <tablespaceid> 

Terminate fenced process :

db2fmpterm <pid_of_db2acd>

IOCP (AIX):

smitty iocp
lsdev
mkdev ...

See AIX environment variables of the process :

ps ewww PID | tr ' ' '\012' | grep = | sort

Tracing opened files :

db2trc on -f trc.raw -Madd sqloopenp -Madd sqloclose

List opened files :

AIX:

# procfiles -n <pid>

SOLARIS:

# pfiles <pid>

LINUX:

# lsof -p <pid>

Rename database :

  • Prepare input file for db2relocatedb command :

DB_NAME=REPLICA1,REPLICA2
DB_PATH=/db2/db2inst1/data_01
INSTANCE=db2inst1

  • Rename the DB directory to new name
  • Execute db2relocatedb

db2relocatedb -f relocate_rename.txt

DB2 Fault monitor registry :

fm.<machinename>.reg.

db2fm -i <instancename> -I /60

DB2 SQL Error: SQLCODE=-805, SQLSTATE=51002, SQLERRMC=NULLID.SYSLH203 0X5359534C564C3031, DRIVER=4.12.55 :

db2 "select varchar(PKGSCHEMA,20) SCHEMA, PKGNAME FROM SYSCAT.PACKAGES WHERE PKGSCHEMA = 'NULLID' and PKGNAME like 'SYSLH20%' with ur"

db2 bind ~/sqllib/bnd/@db2cli.lst blocking all grant public sqlerror continue CLIPKG 5

  • db2_troubleshooting.txt
  • Last modified: 2019/10/18 20:04
  • by 127.0.0.1