Connecting Kafka to Oracle Data Catalog

Eloi Lopes
Dec 9, 2020

Today, I'm going to show how to connect Kafka to Oracle Data Catalog. In this setup, Kafka is running on Oracle Big Data Service.

What is Oracle Data Catalog?

Oracle Cloud Infrastructure (OCI) Data Catalog is a metadata management service that helps data professionals discover data and support data governance.

To connect to Data Catalog, I'm using the public IPs of the Big Data Service cluster. These public IPs have to be created manually; check the documentation if you haven't created them yet.
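Before moving on, it can help to confirm that the broker port answers on each public IP. The following is a minimal sketch, not part of the original setup: the function name `broker_reachable` is hypothetical, port 9092 is assumed from the listener configuration used later in this post, and a successful check from your own machine only suggests (it does not prove) that Data Catalog will be able to reach the broker as well.

```python
import socket
import sys


def broker_reachable(host: str, port: int = 9092, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only checks that something accepts connections on the port;
    it does not speak the Kafka protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Pass the public IPs of your brokers as command-line arguments.
    for ip in sys.argv[1:]:
        status = "reachable" if broker_reachable(ip) else "NOT reachable"
        print(f"{ip}:9092 is {status}")
```

Run it with the public IPs of your brokers as arguments. If a broker is reported as not reachable, the security rules of the cluster's network are a likely place to look.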

Configuring Kafka

Go to your Cloudera Manager and click on Kafka:

Go to Instances, where you will see two or more Kafka brokers. Add this configuration to each broker instance:

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<Private BROKER IP>:9092
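Since the same two lines have to be repeated for every broker with only the IP changing, a small helper can generate them and reduce copy-paste mistakes. This is just a sketch: the function name `listener_config` and the example addresses are hypothetical, and the output still has to be pasted into each broker's configuration in Cloudera Manager by hand.

```python
import ipaddress


def listener_config(private_ip: str, port: int = 9092) -> str:
    """Build the two listener lines for one broker.

    Raises ValueError if private_ip is not a valid IP address.
    """
    ipaddress.ip_address(private_ip)  # validation only
    return (
        f"listeners=PLAINTEXT://0.0.0.0:{port}\n"
        f"advertised.listeners=PLAINTEXT://{private_ip}:{port}"
    )


if __name__ == "__main__":
    # Hypothetical private IPs; replace them with those of your brokers.
    for ip in ["10.0.0.11", "10.0.0.12"]:
        print(f"# broker {ip}")
        print(listener_config(ip))
        print()
```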

Creating a Kafka connection to Data Catalog

On OCI Data Catalog, a data source is registered as a Data Asset, so let's create one for Kafka:

Once the Data Asset is created, the next step is to create the connection:

Give your connection a name.

Harvest

Open the Kafka Data Asset and click on Harvest:

In the first option, select the connection you created.
Add the topics that you want to harvest.

In the next step, create the job and check its log. The job should finish successfully, after which you can start using your Kafka topics in OCI Data Catalog.

If you have any questions, reach out to me on LinkedIn or Medium.

Opinions expressed are solely my own and do not express the views or opinions of my employer Oracle. https://www.linkedin.com/in/eloilopes/