Differences
This shows you the differences between two versions of the page.
oracle_goldengate_apache_flume [2020/11/12 11:49] – [Appendix] andonovj | oracle_goldengate_apache_flume [2020/11/12 12:24] (current) – [Test] andonovj | ||
---|---|---|---|
Line 2: | Line 2: | ||
Flume is a distributed, | Flume is a distributed, | ||
+ | |||
+ | We have to configure: | ||
+ | |||
+ | * Flume.properties file | ||
+ | * Flume RPC client properties file | ||
+ | |||
+ | ====Configure Flume Properties==== | ||
+ | In the target OGG for Big Data installation, create the following file: | ||
+ | |||
+ | < | ||
+ | gg.handlerlist = flume | ||
+ | gg.handler.flume.type=flume | ||
+ | gg.handler.flume.RpcClientPropertiesFile=custom-flume-rpc.properties | ||
+ | gg.handler.flume.format=avro_op | ||
+ | gg.handler.flume.mode=op | ||
+ | gg.handler.flume.EventMapsTo=op | ||
+ | gg.handler.flume.PropagateSchema=true | ||
+ | gg.handler.flume.includeTokens=false | ||
+ | goldengate.userexit.timestamp=utc | ||
+ | goldengate.userexit.writers=javawriter | ||
+ | javawriter.stats.display=TRUE | ||
+ | javawriter.stats.full=TRUE | ||
+ | |||
+ | gg.log=log4j | ||
+ | gg.log.level=INFO | ||
+ | |||
+ | gg.report.time=30sec | ||
+ | |||
+ | gg.classpath=dirprm/:/ | ||
+ | jvm.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ | ||
+ | |||
+ | </ | ||
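Before starting the replicat, it can help to cross-check the handler settings above. The following is a minimal Python sketch, not part of GoldenGate: the helper names `parse_properties` and `check_handlers` are hypothetical, the choice of which keys to treat as required is an assumption made for illustration, and the sample text reuses only handler keys shown above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_handlers(props, required=("type", "RpcClientPropertiesFile")):
    """Return the keys that gg.handlerlist implies but that are missing.

    The `required` suffixes are an assumption for this sketch, not an
    official list of mandatory handler properties.
    """
    missing = []
    names = [n.strip() for n in props.get("gg.handlerlist", "").split(",")]
    for name in filter(None, names):
        for suffix in required:
            key = f"gg.handler.{name}.{suffix}"
            if key not in props:
                missing.append(key)
    return missing


# Sample content copied from the handler settings shown above.
SAMPLE = """
gg.handlerlist = flume
gg.handler.flume.type=flume
gg.handler.flume.RpcClientPropertiesFile=custom-flume-rpc.properties
gg.handler.flume.format=avro_op
gg.handler.flume.mode=op
"""

props = parse_properties(SAMPLE)
missing = check_handlers(props)
print("missing keys:", missing)
```

An empty list means every handler named in `gg.handlerlist` has the keys the sketch looks for; it says nothing about whether the values themselves are valid.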
+ | |||
+ | ====Configure Flume RPC Client==== | ||
+ | To configure the client, we have to create custom-flume-rpc.properties in the same directory: | ||
+ | |||
+ | < | ||
+ | client.type=default | ||
+ | hosts=h1 | ||
+ | hosts.h1=localhost: | ||
+ | batch-size=100 | ||
+ | connect-timeout=20000 | ||
+ | request-timeout=20000 | ||
+ | </ | ||
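In the client settings above, `hosts` lists one or more aliases and each `hosts.<alias>` entry gives that alias a `host:port` address. The sketch below resolves the aliases and fails early on a missing or malformed entry. It is an illustration only: `resolve_hosts` is a hypothetical helper, and the port 41414 is a placeholder, since the real port is not shown in the listing above.

```python
def resolve_hosts(props):
    """Map each alias in `hosts` to a (host, port) tuple."""
    resolved = {}
    for alias in props.get("hosts", "").split():
        address = props.get(f"hosts.{alias}")
        if address is None:
            raise KeyError(f"no hosts.{alias} entry for alias {alias!r}")
        host, sep, port = address.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port for {alias!r}, got {address!r}")
        resolved[alias] = (host, int(port))
    return resolved


# Values mirror the client settings above; the port is a placeholder.
client_props = {
    "client.type": "default",
    "hosts": "h1",
    "hosts.h1": "localhost:41414",  # hypothetical port, not from the source
    "batch-size": "100",
    "connect-timeout": "20000",
    "request-timeout": "20000",
}

resolved = resolve_hosts(client_props)
print(resolved)
```

Whatever port is actually configured here has to match the port the Flume agent's Avro source listens on, otherwise the replicat cannot deliver events.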
+ | |||
+ | ====Start Flume==== | ||
+ | < | ||
+ | [oracle@edvmr1p0 conf]$ hdfs dfs -mkdir / | ||
+ | [oracle@edvmr1p0 conf]$ flume-ng agent --conf / | ||
+ | Info: Sourcing environment configuration script / | ||
+ | Info: Including Hadoop libraries found via (/ | ||
+ | Info: Including HBASE libraries found via (/ | ||
+ | + exec / | ||
+ | </ | ||
+ | |||
+ | ====Configure the GG Replicat==== | ||
+ | Next, we have to configure the GoldenGate replicat: | ||
+ | |||
+ | < | ||
+ | [oracle@edvmr1p0 dirprm]$ trg | ||
+ | [oracle@edvmr1p0 oggtrg]$ ggsci | ||
+ | |||
+ | Oracle GoldenGate Command Interpreter | ||
+ | Version 12.2.0.1.160823 OGGCORE_OGGADP.12.2.0.1.0_PLATFORMS_161019.1437 | ||
+ | Linux, x64, 64bit (optimized), | ||
+ | Operating system character set identified as UTF-8. | ||
+ | |||
+ | Copyright (C) 1995, 2016, Oracle and/or its affiliates. All rights reserved. | ||
+ | |||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 1> info all | ||
+ | |||
+ | Program | ||
+ | |||
+ | MANAGER | ||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 2> edit param rflume | ||
+ | |||
+ | REPLICAT rflume | ||
+ | TARGETDB LIBFILE libggjava.so SET property=dirprm/ | ||
+ | REPORTCOUNT EVERY 1 MINUTES, RATE | ||
+ | GROUPTRANSOPS 10000 | ||
+ | MAP OGGSRC.*, TARGET OGGTRG.*; | ||
+ | :wq | ||
+ | |||
+ | GGSCI (edvmr1p0) 3> add replicat rflume, exttrail ./dirdat/fl | ||
+ | REPLICAT added. | ||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 4> start rflume | ||
+ | |||
+ | Sending START request to MANAGER ... | ||
+ | REPLICAT RFLUME starting | ||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 5> info all | ||
+ | |||
+ | Program | ||
+ | |||
+ | MANAGER | ||
+ | REPLICAT | ||
+ | |||
+ | </ | ||
+ | |||
+ | =====Test===== | ||
+ | To test replication, | ||
+ | |||
+ | < | ||
+ | [oracle@edvmr1p0 oggsrc]$ sqlplus oggsrc/ | ||
+ | |||
+ | SQL*Plus: Release 12.1.0.2.0 Production on Thu Nov 12 12:12:27 2020 | ||
+ | |||
+ | Copyright (c) 1982, 2014, Oracle. | ||
+ | |||
+ | Last Successful login time: Thu Nov 12 2020 11:40:44 +00:00 | ||
+ | |||
+ | Connected to: | ||
+ | Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production | ||
+ | With the Partitioning, | ||
+ | |||
+ | SQL> insert into customer_prod select * from customer where customer_id < 21; | ||
+ | |||
+ | 20 rows created. | ||
+ | |||
+ | SQL> commit; | ||
+ | |||
+ | Commit complete. | ||
+ | |||
+ | SQL> | ||
+ | </ | ||
+ | |||
+ | Then, we can check the statistics in GoldenGate: | ||
+ | |||
+ | < | ||
+ | --Source (Extract) | ||
+ | GGSCI (edvmr1p0) 5> send priex, stats | ||
+ | |||
+ | Sending STATS request to EXTRACT PRIEX ... | ||
+ | |||
+ | Start of Statistics at 2020-11-12 12:13:16. | ||
+ | |||
+ | DDL replication statistics (for all trails): | ||
+ | |||
+ | *** Total statistics since extract started | ||
+ | Operations | ||
+ | |||
+ | Output to ./ | ||
+ | |||
+ | Extracting from OGGSRC.CUSTOMER_PROD to OGGSRC.CUSTOMER_PROD: | ||
+ | |||
+ | *** Total statistics since 2020-11-12 12:13:12 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Daily statistics since 2020-11-12 12:13:12 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Hourly statistics since 2020-11-12 12:13:12 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Latest statistics since 2020-11-12 12:13:12 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | End of Statistics. | ||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 6> | ||
+ | |||
+ | |||
+ | --Target (Replicat) | ||
+ | GGSCI (edvmr1p0) 23> send rflume, stats | ||
+ | |||
+ | Sending STATS request to REPLICAT RFLUME ... | ||
+ | |||
+ | Start of Statistics at 2020-11-12 12:13:26. | ||
+ | |||
+ | Replicating from OGGSRC.CUSTOMER_PROD to OGGTRG.CUSTOMER_PROD: | ||
+ | |||
+ | *** Total statistics since 2020-11-12 12:13:14 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Daily statistics since 2020-11-12 12:13:14 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Hourly statistics since 2020-11-12 12:13:14 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | *** Latest statistics since 2020-11-12 12:13:14 *** | ||
+ | Total inserts | ||
+ | Total updates | ||
+ | Total deletes | ||
+ | Total discards | ||
+ | Total operations | ||
+ | |||
+ | End of Statistics. | ||
+ | |||
+ | |||
+ | GGSCI (edvmr1p0) 24> | ||
+ | </ | ||
+ | |||
+ | Now, we can also check the Flume agent output: | ||
+ | |||
+ | < | ||
+ | [New I/O worker #1] | ||
+ | (org.apache.flume.source.AvroSource.appendBatch: | ||
+ | </ | ||
+ | |||
+ | We can, of course, check the files on HDFS as well: | ||
+ | |||
+ | < | ||
+ | [oracle@edvmr1p0 oggsrc]$ hdfs dfs -ls / | ||
+ | Found 1 items | ||
+ | -rw-r--r-- | ||
+ | [oracle@edvmr1p0 oggsrc]$ | ||
+ | [oracle@edvmr1p0 oggsrc]$ hdfs dfs -cat / | ||
+ | { | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | }, { | ||
+ | " | ||
+ | " | ||
+ | }, { | ||
+ | " | ||
+ | " | ||
+ | }, { | ||
+ | " | ||
+ | " | ||
+ | }, { | ||
+ | " | ||
+ | " | ||
+ | }, { | ||
+ | " | ||
+ | |||
+ | </ | ||
+ | |||
+ | Nice, so we have replicated 20 rows along the path: Oracle RDBMS -> GoldenGate (Extract) -> GoldenGate (Replicat) -> Apache Flume -> HDFS | ||
+ | |||
=====Appendix===== | =====Appendix===== |