oracle_goldengate_apache_flume

In the target OGG for Big Data, create the following files:
  
<Code:bash|Create Flume.properties file (TRG_OGGHOME/dirprm/flume.properties)>
gg.handlerlist = flume
gg.handler.flume.type=flume
gg.handler.flume.RpcClientPropertiesFile=custom-flume-rpc.properties
gg.handler.flume.format=avro_op
gg.handler.flume.mode=op
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

gg.classpath=dirprm/:/opt/flume/lib/*
jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar
</Code>
  
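The gg.handler.flume.RpcClientPropertiesFile property points to an Avro RPC client properties file in dirprm, which tells the handler where the Flume agent's Avro source is listening. A minimal sketch of such a custom-flume-rpc.properties, assuming the agent listens on localhost port 41414 (host and port are assumptions and must match your flume.conf):

<Code:bash|Sketch: custom-flume-rpc.properties (TRG_OGGHOME/dirprm/custom-flume-rpc.properties)>
# Standard Flume RPC client settings; localhost:41414 is an assumption
client.type=default
hosts=h1
hosts.h1=localhost:41414
batch-size=100
connect-timeout=20000
request-timeout=20000
</Code>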
  
====Start Flume====
<Code:bash|Start Flume>
[oracle@edvmr1p0 conf]$ hdfs dfs -mkdir /user/oracle/flume
[oracle@edvmr1p0 conf]$ flume-ng agent --conf /opt/flume/conf -f /opt/flume/conf/flume.conf -Dflume.root.logger=DEBUG,LOGFILE -n agent1 -Dorg.apache.flume.log.rawdata=true
Info: Sourcing environment configuration script /opt/flume/conf/flume-env.sh
Info: Including Hadoop libraries found via (/opt/hadoop/bin/hadoop) for HDFS access
Info: Including HBASE libraries found via (/opt/hbase/bin/hbase) for HBASE access
+ exec /usr/java/latest/bin/java -Xms100m -Xmx2000m -Dcom.sun.management.jmxremote -Dflume.root.logger=DEBUG,LOGFILE -Dorg.apache.flume.log.rawdata=true -cp '/opt/flume/conf:/opt/flume/lib/*:/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/contrib/capacity-scheduler/*.jar:/opt/hbase/conf:/usr/java/latest/lib/tools.jar:/opt/hbase:/opt/hbase/lib/activation-1.1.jar:/opt/hbase/lib/aopalliance-1.0.jar:/opt/hbase/lib/apacheds-i18n-2.0.0-M15.jar:/opt/hbase/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/hbase/lib/api-asn1-api-1.0.0-M2
</Code>
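The agent itself is driven by /opt/flume/conf/flume.conf (the -f and -n arguments above). A minimal sketch of such a configuration, assuming an Avro source on port 41414, a memory channel and an HDFS sink writing under /user/oracle/flume; the source, channel and sink names, the port and the roll interval are assumptions:

<Code:bash|Sketch: /opt/flume/conf/flume.conf>
# Agent name matches -n agent1; component names and the port are assumptions
agent1.sources = avro-source1
agent1.channels = ch1
agent1.sinks = hdfs-sink1

# Avro source that receives the events from the GoldenGate Flume handler
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 41414
agent1.sources.avro-source1.channels = ch1

# In-memory channel between source and sink
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000

# HDFS sink writing the events as a plain data stream
agent1.sinks.hdfs-sink1.type = hdfs
agent1.sinks.hdfs-sink1.channel = ch1
agent1.sinks.hdfs-sink1.hdfs.path = /user/oracle/flume
agent1.sinks.hdfs-sink1.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink1.hdfs.rollInterval = 30
</Code>

The log output later on this page also shows a logger sink (org.apache.flume.sink.LoggerSink), which can be wired to the same source through a second channel in the same way.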
  
====Configure the GG Replicat====
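The replicat on the OGG for Big Data side hands the trail data delivered by the extract to the Java adapter configured in flume.properties. A minimal sketch of the parameter file, assuming the replicat group is called RFLUME (as in the stats below) and the source schema OGGSRC is mapped to OGGTRG; the group name and the mapping are assumptions based on the examples on this page:

<Code:bash|Sketch: rflume.prm (TRG_OGGHOME/dirprm/rflume.prm)>
REPLICAT rflume
-- Hand every record to the Java adapter / Flume handler defined in dirprm/flume.properties
TARGETDB LIBFILE libggjava.so SET property=dirprm/flume.properties
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 1000
MAP OGGSRC.*, TARGET OGGTRG.*;
</Code>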

=====Test=====
To test the replication, we will simply insert some records on the source and check whether they show up downstream: in the GoldenGate statistics, in the Flume log and on HDFS.

<Code:bash|Insert rows>
[oracle@edvmr1p0 oggsrc]$ sqlplus oggsrc/oracle@orcl

SQL*Plus: Release 12.1.0.2.0 Production on Thu Nov 12 12:12:27 2020

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Thu Nov 12 2020 11:40:44 +00:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> insert into customer_prod select * from customer where customer_id < 21;

20 rows created.

SQL> commit;

Commit complete.

SQL> 
</Code>

Then we can check the statistics in GoldenGate, first on the extract and then on the replicat:

<Code:bash|Check GoldenGate Stats>
--Source (Extract)
GGSCI (edvmr1p0) 5> send priex, stats

Sending STATS request to EXTRACT PRIEX ...

Start of Statistics at 2020-11-12 12:13:16.

DDL replication statistics (for all trails):

*** Total statistics since extract started     ***
 Operations                                  6.00

Output to ./dirdat/in:

Extracting from OGGSRC.CUSTOMER_PROD to OGGSRC.CUSTOMER_PROD:

*** Total statistics since 2020-11-12 12:13:12 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Daily statistics since 2020-11-12 12:13:12 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Hourly statistics since 2020-11-12 12:13:12 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Latest statistics since 2020-11-12 12:13:12 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

End of Statistics.


GGSCI (edvmr1p0) 6> 


--Target (Replicat)
GGSCI (edvmr1p0) 23> send rflume, stats

Sending STATS request to REPLICAT RFLUME ...

Start of Statistics at 2020-11-12 12:13:26.

Replicating from OGGSRC.CUSTOMER_PROD to OGGTRG.CUSTOMER_PROD:

*** Total statistics since 2020-11-12 12:13:14 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Daily statistics since 2020-11-12 12:13:14 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Hourly statistics since 2020-11-12 12:13:14 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

*** Latest statistics since 2020-11-12 12:13:14 ***
 Total inserts                              20.00
 Total updates                               0.00
 Total deletes                               0.00
 Total discards                              0.00
 Total operations                           20.00

End of Statistics.


GGSCI (edvmr1p0) 24> 
</Code>

Now we can also check the Flume agent log; it shows the Avro events arriving from the GoldenGate handler:

<Code:bash|Check Apache Flume log>
[New I/O worker #1]
(org.apache.flume.source.AvroSource.appendBatch:378) - Avro source avro-source1: Received avro event batch of 1 events. [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.LoggerSink.process:95) - Event: {headers:{SCHEMA_NAME=OGGTRG, TABLE_NAME=CUSTOMER_PROD,SCHEMA_FINGERPRINT=1668461282719043198} body: 28 4F 47 47 54 52 47 2E 43 55 53 54 4F 4D 45 52 (OGGTRG.CUSTOMER 
</Code>
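With -Dflume.root.logger=DEBUG,LOGFILE the agent writes to the file appender defined in conf/log4j.properties; the exact location depends on that file, so the path below is an assumption (the default is flume.log under the agent's logs directory):

<Code:bash|Sketch: follow the Flume agent log>
# Log file location is an assumption - check log4j.appender.LOGFILE.File in /opt/flume/conf/log4j.properties
[oracle@edvmr1p0 oggsrc]$ tail -f /opt/flume/logs/flume.log | grep CUSTOMER_PROD
</Code>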

We can of course check the files on HDFS as well:

<Code:bash|Check HDFS>
[oracle@edvmr1p0 oggsrc]$ hdfs dfs -ls /user/oracle/flume
Found 1 items
-rw-r--r--   1 oracle supergroup       2460 2020-11-12 12:13 /user/oracle/flume/FlumeData.1605183195022
[oracle@edvmr1p0 oggsrc]$ 
[oracle@edvmr1p0 oggsrc]$ hdfs dfs -cat /user/oracle/flume/FlumeData.1605183195022 | head -50
{
  "type" : "record",
  "name" : "CUSTOMER_PROD",
  "namespace" : "OGGTRG",
  "fields" : [ {
    "name" : "table",
    "type" : "string"
  }, {
    "name" : "op_type",
    "type" : "string"
  }, {
    "name" : "op_ts",
    "type" : "string"
  }, {
    "name" : "current_ts",
    "type" : "string"
  }, {
    "name" : "pos",
    "type" : "string"
  }, {
    "name" : "primary_keys",

</Code>
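The readable part above is the propagated Avro schema; the row data itself is binary Avro. If an avro-tools jar is available, it can dump the records as JSON; this is only a sketch, the jar location is an assumption and it requires the sink to write regular Avro container files:

<Code:bash|Sketch: decode the Avro data with avro-tools>
# Copy the file out of HDFS and dump it as JSON; the avro-tools jar path is an assumption
[oracle@edvmr1p0 oggsrc]$ hdfs dfs -get /user/oracle/flume/FlumeData.1605183195022 /tmp/FlumeData.avro
[oracle@edvmr1p0 oggsrc]$ java -jar /tmp/avro-tools-1.8.2.jar tojson /tmp/FlumeData.avro | head -5
</Code>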

Nice, so we have replicated 20 rows all the way through: Oracle RDBMS -> GoldenGate (Extract) -> GoldenGate (Replicat) -> Apache Flume -> HDFS.

  
=====Appendix=====