This is an old revision of the document!


In this section we will replicate from Oracle → Golden Gate → HDFS

Create HDFS Properties File (TRG_GGHOME/dirprm/hdfs.properties)

gg.handlerlist=hdfs
gg.handler.hdfs.type=hdfs
gg.handler.hdfs.includeTokens=false
gg.handler.hdfs.maxFileSize=1g
gg.handler.hdfs.rootFilePath=/user/oracle
gg.handler.hdfs.fileRollInterval=0
gg.handler.hdfs.inactivityRollInterval=0
gg.handler.hdfs.fileSuffix=.json
gg.handler.hdfs.partitionByTable=true
gg.handler.hdfs.rollOnMetadataChange=true
gg.handler.hdfs.authType=none
gg.handler.hdfs.format=json_row
gg.handler.hdfs.format.prettyPrint=true
gg.handler.hdfs.format.jsonDelimiter=CDATA[]
gg.handler.hdfs.format.pkUpdateHandling=delete-insert
gg.handler.hdfs.format.generateSchema=false
gg.handler.hdfs.mode=tx
goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec
gg.classpath=/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/etc/hadoop/:
jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

Create Replication

GGSCI (edvmr1p0) 3> edit param hdfsrp

REPLICAT hdfsrp
TARGETDB LIBFILE libggjava.so SET property=dirprm/hdfs.properties
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP oggsrc.*, target oggsrc.*;


GGSCI (edvmr1p0) 4> add replicat hdfsrp, exttrail ./dirdat/hd
REPLICAT added.
GGSCI (edvmr1p0) 5> start hdfsrp

Sending START request to MANAGER ...
REPLICAT HDFSRP starting


GGSCI (edvmr1p0) 6> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     HDFSRP      00:00:00      00:00:08    

To test, we can connect to the Source database and issue insert:

Insert Data

[oracle@hostname ~]$ sqlplus oggsrc/<pwd>@orcl
SQL*Plus: Release 12.1.0.2.0 Production on Sat Feb 4 16:04:27
2017
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Last Successful login time: Sat Feb 04 2017 16:01:11 +11:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 -
With the Partitioning, OLAP, Advanced Analytics and Real
Application Testing options
SQL> insert into customer_prod select * from customer where
customer_id < 1001;
1000 rows created.
SQL> commit;
Commit complete.
SQL>

Check the replication:

Check source and dest statistics

--Check Extract
GGSCI (hostname) 2> send priex, stats
Sending STATS request to EXTRACT PRIEX ...
Start of Statistics at 2017-02-04 16:08:10.
Output to ./dirdat/in:
Extracting from OGGSRC.CUSTOMER_PROD to OGGSRC.CUSTOMER_PROD:
*** Total statistics since 2017-02-04 15:46:18 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
GGSCI (hostname) 7>

--Check Replicat
GGSCI (hostname) 6> send hdfsrp, stats
Sending STATS request to REPLICAT HDFSRP ...
Start of Statistics at 2017-02-04 16:11:27.
Replicating from OGGSRC.CUSTOMER_PROD to oggsrc.CUSTOMER_PROD:
… Many lines omitted for brevity…
*** Total statistics since 2017-02-04 16:05:11 ***
Total inserts 1000.00
Total updates 0.00
Total deletes 0.00
Total discards 0.00
Total operations 1000.00
GGSCI (hostname) 7>

We can check in the HDFS as well:

Check in the HDFS

[oracle@edvmr1p0 oggsrc]$ hdfs dfs ls /user/oracle
Found 1 items
drwxr-xr-x   - oracle supergroup          0 2020-11-11 14:06 /user/oracle/oggsrc.customer_prod
[oracle@edvmr1p0 oggsrc]$ hdfs dfs -ls /user/oracle/oggsrc.customer_prod
Found 1 items
-rw-r--r--   1 oracle supergroup     449586 2020-11-11 14:06 /user/oracle/oggsrc.customer_prod/oggsrc.customer_prod_2020-11-11_14-06-11.144.json
[oracle@edvmr1p0 oggsrc]$ hdfs dfs -cat /user/oracle/oggsrc.customer_prod/oggsrc.customer_prod_2020-11-11_14-06-11.144.json | head -100
{
    "table":"OGGSRC.CUSTOMER_PROD",
    "op_type":"I",
    "op_ts":"2020-11-11 14:06:05.000168",
    "current_ts":"2020-11-11T14:06:11.134000",
    "pos":"00000000000000002372",
    "CUSTOMER_ID":1,
    "FIRST_NAME":"Jaden",
    "LAST_NAME":"Poole",
    "EMAIL_ADDRESS":"[email protected]",
    "SSN":"16060407-7404",
    "ADDRESS":"3640 Ac St.",
    "CITY":"San Juan de Dios",
    "ZIP_CODE":"31014",
.............................................................
  • oracle_goldengate_hadoop_hdfs.1605104851.txt.gz
  • Last modified: 2020/11/11 14:27
  • by andonovj