oracle_goldengate_hadoop_hdfs



Create HDFS Properties File (TRG_GGHOME/dirprm/hdfs.properties)

gg.handlerlist=hdfs
gg.handler.hdfs.type=hdfs
gg.handler.hdfs.includeTokens=false
gg.handler.hdfs.maxFileSize=1g
gg.handler.hdfs.rootFilePath=/user/oracle
gg.handler.hdfs.fileRollInterval=0
gg.handler.hdfs.inactivityRollInterval=0
gg.handler.hdfs.fileSuffix=.json
gg.handler.hdfs.partitionByTable=true
gg.handler.hdfs.rollOnMetadataChange=true
gg.handler.hdfs.authType=none
gg.handler.hdfs.format=json_row
gg.handler.hdfs.format.prettyPrint=true
gg.handler.hdfs.format.jsonDelimiter=CDATA[]
gg.handler.hdfs.format.pkUpdateHandling=delete-insert
gg.handler.hdfs.format.generateSchema=false
gg.handler.hdfs.mode=tx
goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec
gg.classpath=/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/etc/hadoop/:
jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
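
A typo in a handler name (for example `hdsf` instead of `hdfs`) is easy to miss in a file like the one above. The following is a minimal sketch of a pre-flight check, not part of GoldenGate itself: `parse_properties` and `check_handlers` are hypothetical helper names, and the only rule assumed is the one visible in the file, namely that every name in `gg.handlerlist` has a matching `gg.handler.<name>.type` entry.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as
        # -Djava.class.path=ggjava/ggjava.jar stay intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_handlers(props: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the handler configuration."""
    problems: List[str] = []
    names = [n.strip() for n in props.get("gg.handlerlist", "").split(",") if n.strip()]
    if not names:
        problems.append("gg.handlerlist is missing or empty")
    for name in names:
        if f"gg.handler.{name}.type" not in props:
            problems.append(f"handler '{name}' has no gg.handler.{name}.type")
    # Flag handler keys whose name is not listed in gg.handlerlist (likely a typo).
    for key in props:
        parts = key.split(".")
        if key.startswith("gg.handler.") and len(parts) > 3 and parts[2] not in names:
            problems.append(f"{key} refers to a handler not in gg.handlerlist")
    return problems


# A shortened copy of the properties file, used only for illustration.
sample = """\
gg.handlerlist=hdfs
gg.handler.hdfs.type=hdfs
gg.handler.hdfs.rootFilePath=/user/oracle
gg.handler.hdfs.format=json_row
jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
"""

props = parse_properties(sample)
print(check_handlers(props))  # an empty list means no problems were found
```

In practice the text would be read from TRG_GGHOME/dirprm/hdfs.properties before starting the Replicat.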

Create the Replicat

GGSCI (edvmr1p0) 3> edit param hdfsrp

REPLICAT hdfsrp
TARGETDB LIBFILE libggjava.so SET property=dirprm/hdfs.properties
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP oggsrc.*, TARGET oggsrc.*;


GGSCI (edvmr1p0) 4> add replicat hdfsrp, exttrail ./dirdat/hd
REPLICAT added.
GGSCI (edvmr1p0) 5> start hdfsrp

Sending START request to MANAGER ...
REPLICAT HDFSRP starting


GGSCI (edvmr1p0) 6> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     HDFSRP      00:00:00      00:00:08    
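
When this check is scripted, the `info all` output above can be reduced to a status per process. The sketch below is only an illustration: `parse_info_all` is a hypothetical helper, it expects the text of `info all` alone (not the surrounding GGSCI session), and it assumes the column layout shown above.

```python
from typing import Dict


def parse_info_all(output: str) -> Dict[str, str]:
    """Map each process to its status; the manager is keyed as 'MANAGER'."""
    statuses: Dict[str, str] = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ("MANAGER", "EXTRACT", "REPLICAT"):
            continue
        # Manager rows have no group name; other rows carry it in column 3.
        name = fields[2] if len(fields) > 2 else fields[0]
        statuses[name] = fields[1]
    return statuses


# Output copied from the session above.
info_all = """\
Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     HDFSRP      00:00:00      00:00:08
"""

statuses = parse_info_all(info_all)
not_running = [name for name, status in statuses.items() if status != "RUNNING"]
print(statuses)
print(not_running)
```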

To test, connect to the source database and insert some rows:

Insert Data

[oracle@hostname ~]$ sqlplus oggsrc/<pwd>@orcl
SQL*Plus: Release 12.1.0.2.0 Production on Sat Feb 4 16:04:27 2017
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Last Successful login time: Sat Feb 04 2017 16:01:11 +11:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 -
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> insert into customer_prod select * from customer where customer_id < 1001;
1000 rows created.
SQL> commit;
Commit complete.
SQL>

Check the replication:

Check source and destination statistics

--Check Extract
GGSCI (hostname) 2> send priex, stats
Sending STATS request to EXTRACT PRIEX ...
Start of Statistics at 2017-02-04 16:08:10.
Output to ./dirdat/in:
Extracting from OGGSRC.CUSTOMER_PROD to OGGSRC.CUSTOMER_PROD:
*** Total statistics since 2017-02-04 15:46:18 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
GGSCI (hostname) 7>

--Check Replicat
GGSCI (hostname) 6> send hdfsrp, stats
Sending STATS request to REPLICAT HDFSRP ...
Start of Statistics at 2017-02-04 16:11:27.
Replicating from OGGSRC.CUSTOMER_PROD to oggsrc.CUSTOMER_PROD:
… Many lines omitted for brevity…
*** Total statistics since 2017-02-04 16:05:11 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
GGSCI (hostname) 7>
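
The comparison above (1000 inserts on the Extract side, 1000 on the Replicat side) can also be done programmatically. This is a minimal sketch under stated assumptions: `parse_totals` is a hypothetical helper, it reads only the first `Total ...` value of each kind (the "Total statistics since" block in the output shown here), and it does not separate multiple tables.

```python
import re
from typing import Dict

# Matches lines such as "Total inserts 1000.00"; the "*** Total statistics ..."
# banner does not match because it starts with asterisks.
TOTAL_LINE = re.compile(r"^Total (\w+)\s+([\d.]+)\s*$")


def parse_totals(output: str) -> Dict[str, float]:
    """Collect the first 'Total <kind> <count>' value for each kind."""
    totals: Dict[str, float] = {}
    for line in output.splitlines():
        match = TOTAL_LINE.match(line.strip())
        if match and match.group(1) not in totals:
            totals[match.group(1)] = float(match.group(2))
    return totals


# Excerpts copied from the two GGSCI sessions above.
extract_stats = """\
*** Total statistics since 2017-02-04 15:46:18 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
"""

replicat_stats = """\
*** Total statistics since 2017-02-04 16:05:11 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
"""

source = parse_totals(extract_stats)
target = parse_totals(replicat_stats)
print(source == target)
```

Matching totals only show that the Replicat processed the same number of operations; the files under `gg.handler.hdfs.rootFilePath` still need to be inspected in HDFS to confirm the JSON output.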
  • Last modified: 2020/11/11 14:23
  • by andonovj