GoldenGate for Big Data — Replication to Oracle Object Storage

Eloi Lopes
5 min readMay 31, 2022

--

Oracle Object Storage allow customers to store any type of files in any format. It can be used as a central source of data for Data Science, ETLs and Data Warehouses projects.

For this article we want to store data from one source (MySQL) into Object Storage. To have data in real time we need to deploy 2 GoldenGate Instances. One for GoldenGate for non-oracle (MySQL) and other one GoldenGate for Big Data.

I’m using GoldenGate Marketplace images. If you don’t have access to Oracle Cloud Infrastructure (OCI) you can create free account and explore the different services.

If your source is not MySQL, it’s not a problem you can follow this tutorial and do exactly the same configuration.

Pre-requisites

  • GoldenGate for MySQL up and running and with one Extract configured
  • GoldenGate for Big Data up and running

Preparing GoldenGate for Big Data

You have 2 main requisites to have GoldenGate working with Oracle Object Storage:

  1. Credentials to OCI
  2. Install Java SDK

In case you have already OCI credentials created, you just need to copy and paste on Config file and skip this step. If you don’t have, keep reading.

First we need to generate an API signing key. On GoldenGate Big Data instance:

mkdir .oci#Generate SSH key
openssl genrsa -out ~/.oci/oci_api_key.pem
chmod go-rwx ~/.oci/oci_api_key.pemopenssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pemcat ~/.oci/oci_api_key_public.pem
#copy the public key

Go to your OCI account and on top right corner click on profile image and after on your user:

Click on Add API Keys and paste the public key, when you add it will show a text like this one:

[DEFAULT]
user=ocid1.user.oc1..aaa....
fingerprint=4d:4a:...
tenancy=ocid1.tenancy.oc1..aaaa...
region=eu-frankfurt-1
key_file=#TO DO

Copy the text and go back to GoldenGate for Big Data and add it to a new file (Config):

-bash-4.2$ pwd
/home/opc/.oci
vi config
#paste
[DEFAULT]
user=ocid1.user.oc1..aaaaa..
fingerprint=4d:4a:...
tenancy=ocid1.tenancy.oc1..aa...
region=eu-frankfurt-1
key_file=/home/opc/.oci/oci_api_key_public.pem

Next step is to install Java SDK:

sudo yum-config-manager --enable ol7_developer
sudo yum install java-oci-sdk

Done.

Sending Data from MySQL to Object Storage

GoldenGate Big Data Certificates

In order to give access to GoldenGate MySQL to send data to GoldenGate for Big Data we need to configure the certificates on MySQL. First, let’s collect the certificates from Big Data instance. Access to GoldenGate Big Data and click on the lock next to “https”,more information and “View Certificate” (I’m using Firefox, could be different for others browsers). More detailed info about this configuration, on this great article of Sydney Nurse — Replication Paths between Oracle GoldenGate MSA Marketplace

Download these 2 certificates:

Go to GG Big Data instance and copy the certificate from /etc/nginx/ogg.pem.

Still on GG Big Data instance, create a new user for the distribution path.

Configuration GoldenGate for MySQL

Start by editing the /etc/hosts and add an entry for GG Big Data instance like this one:

#you can copy and past from GG Big Data instance hosts file:<private ip> ogg21cbigdata.sub12151446000.vcneloi.oraclevcn.com ogg21cbigdata

Go to you GG MySQL instance and click on Certificate Management:

Add a new Client certificate:

Client Certificates

  1. Certificate PEM — Certificate ending in “chain”
  2. Private-key — ogg.pem
  3. CA Certificates — without chain

You must upload all.

For CA Certificate, add the one ending in “chain”.

Now, we need to add the credential on GG MySQL to be used by distribution path. Add the same username and password as you did on GG Big Data.

Go to Distribution Service and add a new distribution path like this one:

Now, it should be created and running:

Creating the Replicat to Object Storage

Go to GG for Big Data and click on add replicat, choose “Classic Replicat”:

On parameters file, pick the schema from your source and as a target it can be whatever you. It will create a folder on Object Storage with that name:

On properties add this configuration:

# Properties file for Replicat OBJ
#File Writer Handler Template
gg.handlerlist=filewriter
gg.handler.filewriter.type=filewriter
gg.handler.filewriter.mode=op
gg.handler.filewriter.pathMappingTemplate=./dirout
gg.handler.filewriter.stateFileDirectory=./dirsta
gg.handler.filewriter.fileNameMappingTemplate=${fullyQualifiedTableName}_${currentTimestamp}.json
gg.handler.filewriter.inactivityRollInterval=10s
gg.handler.filewriter.finalizeAction=none
gg.handler.filewriter.format=json
gg.handler.filewriter.format.metaColumnsTemplate=${objectname[table]},${optype[op_type]},${timestamp[op_ts]},${currenttimestamp[current_ts]},${position[pos]}
gg.handler.filewriter.partitionByTable=true
gg.handler.filewriter.rollOnShutdown=true
gg.handler.filewriter.eventHandler=oci
#The OCI Event handler
gg.eventhandler.oci.type=oci
#TODO: Set the OCI properties
gg.eventhandler.oci.configFilePath=~/.oci/config
gg.eventhandler.oci.profile=DEFAULT
gg.eventhandler.oci.namespace=<namespace of Object Storage>
gg.eventhandler.oci.region=eu-frankfurt-1 #your region
gg.eventhandler.oci.compartmentID=<compartmen ID where of Bucket>
gg.eventhandler.oci.pathMappingTemplate=${schemaName}
gg.eventhandler.oci.bucketMappingTemplate=<The bucket name>
gg.eventhandler.oci.fileNameMappingTemplate=${tableName}_${currentTimestamp}.txt
gg.eventhandler.oci.finalizeAction=NONE
goldengate.userexit.writers=javawriter
#TODO: Uncomment and set the proxy settings if required.
#gg.eventhandler.oci.proxyServer=
#gg.eventhandler.oci.proxyPort=
#TODO: Set the OCI client libraries in the classpath.
#VERY IMPORTANT - include the JAR files of the Oracle Cloud #Infrastructure Java SDK
gg.classpath=/usr/lib64/java-oci-sdk/third-party/lib/*:/usr/lib64/java-oci-sdk/lib/*
jvm.bootoptions=-Xmx512m -Xms32m

The parameters “gg.handler.filewriter.inactivityRollInterval” it’s very important since it will create a new file 10 seconds after the last record inserted. You can find more information on this blog post written by Deniz Sendil.

As you can see the files are being created on Object Storage:

I hope this can help you. If you have any doubt, please reach out to me through LinkedIn or Medium.

--

--

Eloi Lopes
Eloi Lopes

Written by Eloi Lopes

Opinions expressed are solely my own and do not express the views or opinions of my employer Oracle. https://www.linkedin.com/in/eloilopes/

No responses yet