GoldenGate for Big Data — Replication to Oracle Object Storage
Oracle Object Storage allow customers to store any type of files in any format. It can be used as a central source of data for Data Science, ETLs and Data Warehouses projects.
For this article we want to store data from one source (MySQL) into Object Storage. To have data in real time we need to deploy 2 GoldenGate Instances. One for GoldenGate for non-oracle (MySQL) and other one GoldenGate for Big Data.
I’m using GoldenGate Marketplace images. If you don’t have access to Oracle Cloud Infrastructure (OCI) you can create free account and explore the different services.
If your source is not MySQL, it’s not a problem you can follow this tutorial and do exactly the same configuration.
Pre-requisites
- GoldenGate for MySQL up and running and with one Extract configured
- GoldenGate for Big Data up and running
Preparing GoldenGate for Big Data
You have 2 main requisites to have GoldenGate working with Oracle Object Storage:
- Credentials to OCI
- Install Java SDK
In case you have already OCI credentials created, you just need to copy and paste on Config file and skip this step. If you don’t have, keep reading.
First we need to generate an API signing key. On GoldenGate Big Data instance:
mkdir .oci#Generate SSH key
openssl genrsa -out ~/.oci/oci_api_key.pemchmod go-rwx ~/.oci/oci_api_key.pemopenssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pemcat ~/.oci/oci_api_key_public.pem
#copy the public key
Go to your OCI account and on top right corner click on profile image and after on your user:
Click on Add API Keys and paste the public key, when you add it will show a text like this one:
[DEFAULT]
user=ocid1.user.oc1..aaa....
fingerprint=4d:4a:...
tenancy=ocid1.tenancy.oc1..aaaa...
region=eu-frankfurt-1
key_file=#TO DO
Copy the text and go back to GoldenGate for Big Data and add it to a new file (Config):
-bash-4.2$ pwd
/home/opc/.ocivi config
#paste
[DEFAULT]
user=ocid1.user.oc1..aaaaa..
fingerprint=4d:4a:...
tenancy=ocid1.tenancy.oc1..aa...
region=eu-frankfurt-1
key_file=/home/opc/.oci/oci_api_key_public.pem
Next step is to install Java SDK:
sudo yum-config-manager --enable ol7_developer
sudo yum install java-oci-sdk
Done.
Sending Data from MySQL to Object Storage
GoldenGate Big Data Certificates
In order to give access to GoldenGate MySQL to send data to GoldenGate for Big Data we need to configure the certificates on MySQL. First, let’s collect the certificates from Big Data instance. Access to GoldenGate Big Data and click on the lock next to “https”,more information and “View Certificate” (I’m using Firefox, could be different for others browsers). More detailed info about this configuration, on this great article of Sydney Nurse — Replication Paths between Oracle GoldenGate MSA Marketplace
Download these 2 certificates:
Go to GG Big Data instance and copy the certificate from /etc/nginx/ogg.pem.
Still on GG Big Data instance, create a new user for the distribution path.
Configuration GoldenGate for MySQL
Start by editing the /etc/hosts and add an entry for GG Big Data instance like this one:
#you can copy and past from GG Big Data instance hosts file:<private ip> ogg21cbigdata.sub12151446000.vcneloi.oraclevcn.com ogg21cbigdata
Go to you GG MySQL instance and click on Certificate Management:
Add a new Client certificate:
Client Certificates
- Certificate PEM — Certificate ending in “chain”
- Private-key — ogg.pem
- CA Certificates — without chain
You must upload all.
For CA Certificate, add the one ending in “chain”.
Now, we need to add the credential on GG MySQL to be used by distribution path. Add the same username and password as you did on GG Big Data.
Go to Distribution Service and add a new distribution path like this one:
Now, it should be created and running:
Creating the Replicat to Object Storage
Go to GG for Big Data and click on add replicat, choose “Classic Replicat”:
On parameters file, pick the schema from your source and as a target it can be whatever you. It will create a folder on Object Storage with that name:
On properties add this configuration:
# Properties file for Replicat OBJ
#File Writer Handler Template
gg.handlerlist=filewriter
gg.handler.filewriter.type=filewriter
gg.handler.filewriter.mode=op
gg.handler.filewriter.pathMappingTemplate=./dirout
gg.handler.filewriter.stateFileDirectory=./dirsta
gg.handler.filewriter.fileNameMappingTemplate=${fullyQualifiedTableName}_${currentTimestamp}.json
gg.handler.filewriter.inactivityRollInterval=10s
gg.handler.filewriter.finalizeAction=none
gg.handler.filewriter.format=json
gg.handler.filewriter.format.metaColumnsTemplate=${objectname[table]},${optype[op_type]},${timestamp[op_ts]},${currenttimestamp[current_ts]},${position[pos]}
gg.handler.filewriter.partitionByTable=true
gg.handler.filewriter.rollOnShutdown=true
gg.handler.filewriter.eventHandler=oci#The OCI Event handler
gg.eventhandler.oci.type=oci
#TODO: Set the OCI properties
gg.eventhandler.oci.configFilePath=~/.oci/config
gg.eventhandler.oci.profile=DEFAULT
gg.eventhandler.oci.namespace=<namespace of Object Storage>
gg.eventhandler.oci.region=eu-frankfurt-1 #your region
gg.eventhandler.oci.compartmentID=<compartmen ID where of Bucket>
gg.eventhandler.oci.pathMappingTemplate=${schemaName}
gg.eventhandler.oci.bucketMappingTemplate=<The bucket name>
gg.eventhandler.oci.fileNameMappingTemplate=${tableName}_${currentTimestamp}.txt
gg.eventhandler.oci.finalizeAction=NONE
goldengate.userexit.writers=javawriter
#TODO: Uncomment and set the proxy settings if required.
#gg.eventhandler.oci.proxyServer=
#gg.eventhandler.oci.proxyPort=
#TODO: Set the OCI client libraries in the classpath.#VERY IMPORTANT - include the JAR files of the Oracle Cloud #Infrastructure Java SDK
gg.classpath=/usr/lib64/java-oci-sdk/third-party/lib/*:/usr/lib64/java-oci-sdk/lib/*
jvm.bootoptions=-Xmx512m -Xms32m
The parameters “gg.handler.filewriter.inactivityRollInterval” it’s very important since it will create a new file 10 seconds after the last record inserted. You can find more information on this blog post written by Deniz Sendil.
As you can see the files are being created on Object Storage:
I hope this can help you. If you have any doubt, please reach out to me through LinkedIn or Medium.