4.Talend Data Integration Database Operations-Overview

Talend DI offers connectors to Databases like SQL, Oracle, PostgreSQL, etc In this Talend tutorial, we will see the basic connectors that are required for database operations. All the examples are based on the Oracle database.

The complete documentation of Talend can be found here.

Wait a moment! If you are a newbie to Talend, then I will strongly recommend you first to go through the other tutorial posts about the Introduction and Installation of the Talend Open Studio.

Create a Oracle DB Connection[in Repository] and Retrieving the Schemas

Since the same connection will be used further to demonstrate, the connection is created in the Repository as part of Metadata as below.

Oracle Database is already installed as part of initial installation of Talend here.

STEP1:Create a connection under Metadata->Db Connections

STEP2:Provide the name of the database connection , Purpose and Description can be optional.

STEP3:Provide the Login,password details of the database. Also mention the port name ,Service name, Schema name and Test the connection and Save it

You can expand the Db Connections from the Repository and now you can see the ORADB connections created above.

STEP4-Now our connection is ready , we need to bring the table structures into Talend that are required further for building our jobs. Let’s see how to bring our table structures.

Click on Retrive Schema as below .

STEP5:A separate window will open as below. By default, the filter type is ‘%’ which means it brings all the tables from the schema, or you can provide filters here like table name in order to retrieve only that table.

STEP6:Provide what table structures you need to bring inside Talend. I have selected all the tables.

STEP7:If you need to edit the schema properties before bringing inside Talend then you can do it in the next step as below.

You can now view all the tables that we imported in the Talend as below.

Create a Oracle DB Connection and Closing the connection[inside a Job]

In this section, we will see how to use the components to create and close DB connections inside a job. Below two components are used to demonstrate.

  1. tDBConnection
  2. tDBClose

STEP1:Create a job and drag and drop tDBConnection and tDBClose. When you click on Connection Type, you will see Built-in or Repository type in the drop-down section, From the repository, you can use the connection that we have created above or you can also manually define the details using Built-in. I am using here Repository option.

STEP2:Click on component properties of tDBClose and select the component list as below

STEP3:Run the job and check the log.

Extract data from DB using a query

Data can be extracted from the DB table using a query by using tDBInput component. In this section we will see a job to do the above.

STEP1:Drag and drop the components tDBInput and tLogRow on to the Designer from the Palette. After that click on Database from the component section of tDBInput. You can see a list of databases that are available to connect. Select Oracle and click Apply.

STEP2:Click on Property type as Repository.

STEP3: You can see all the fileds are populated , In the same way click on Schema type as Repository and give Repository ,and then Click on Table name and select the table name .

STEP4: For Query Type also, we have the option to select Built-in or Repository , Sine I am going to write the query, I am going to give Built-in only. Click on Guess Query. Default query will be populated in the Query as below. I have modified the query using where clause and you can also run the query and check the result in the SQL builder itsef as below.

STEP5: Join the tDBInput with tLogRow and with Row Main and update the tLogRow properties to Table and execute the job.

Inserting/Updating Data in DB

In this section,you will learn how to insert/update the data using tDBOutput component. The test data that needs to be inserted/updated is generated using tRowGenerator component .

STEP1-Drag and drop tRowGenerator and tDBOutput on to the Designer window from Palette. Click on Rowgenerator Editor as below. The output structure of the table is also shown below.

STEP2-Specify the below functions, that will generate random data. You can also specify number of rows that needs to be generated.

STEP3-Specify the properties for tDBOutput similarly that we specified for tDBInput as earlier. We can also mention whether we need to Insert or Update the records. Run the job

Execute SQL queries on DB:

SQL queries can be executed on the DB by using tDBRow component. We will see this with a simple scenario as below.

We have an employee table as below and I will be increasing the salary of the department 30 with to 50%.

Table Before Execution:

STEP1-Drag and drop the tDBrow component and select the connection from the Repository and copy the following code into the Query window as below.

"MERGE INTO EMPLOYEE_TEST E
   USING (SELECT employee_id, salary, department_id FROM EMPLOYEE_TEST
   WHERE department_id = 30) S
   ON (E.employee_id = S.employee_id)
   WHEN MATCHED THEN UPDATE SET E.salary = E.salary + S.salary*.5"

STEP2-Run the job and check the output. You can see that salary has been increased to 50% for Department 30

This ends the overview of basic connectors in DB