7. Talend Data Integration Orchestration Operations: Overview

Talend orchestration components handle operations such as looping, iterating over values or files, waiting between runs, and triggering tasks before and after a job.

Iterating on a list of values [tForeach]

The tForeach component works like a for-each loop over a list of values.

STEP1: Drag and drop the tForeach and tJava components from the palette. Join them using an Iterate link as shown below.

STEP2: Enter the values in the basic settings of the tForeach component as below.

STEP3: Click on the tJava component and enter the command shown below to print the current value of the tForeach component.

System.out.println(((String)globalMap.get("tForeach_1_CURRENT_VALUE")));

STEP4: Run the job. It runs 3 times, and the current value is printed at each iteration.
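
For readers who think in plain Java, the tForeach to tJava Iterate link behaves roughly like the loop below. This is only a sketch, not the code Talend generates: the values "A", "B" and "C" are placeholders (the tutorial's values are in the screenshot), and globalMap here is an ordinary HashMap standing in for the map Talend provides inside a job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ForeachSketch {
    // Runs the "tJava" body once per value and returns what it printed.
    static List<String> iterate(List<String> values) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String> printed = new ArrayList<>();
        for (String value : values) {
            // tForeach publishes the current value under this key before each iteration.
            globalMap.put("tForeach_1_CURRENT_VALUE", value);
            // Same expression as the one entered in the tJava component.
            String current = (String) globalMap.get("tForeach_1_CURRENT_VALUE");
            System.out.println(current);
            printed.add(current);
        }
        return printed;
    }

    public static void main(String[] args) {
        iterate(List.of("A", "B", "C")); // placeholder values
    }
}
```

The point of the sketch is that the tJava code never receives the value directly; it always reads it back from globalMap using the key named after the iterating component.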

Iterating over a set of files based on a file mask [tFileList]

Let us say we have a set of files in a folder, and we need to read and display the content of each file. Note that all the input files have the same structure. We can achieve this using the tFileList component as shown below.

The input files are shown below.

STEP1: Drag and drop tFileList and tFileInputDelimited from the palette and join them using an Iterate link. Then drag a tLogRow component from the palette and join tFileInputDelimited to it using a Main row link, as shown below.

STEP2: Edit the basic settings of the tFileList component, including the Directory path and FileMask, as below.

STEP3: Edit the tFileInputDelimited component and enter the code below in the File name/Stream field. Set Header to 1, as the input files have one header line. Also define the schema, and make sure the schema is passed on to tLogRow.

((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))

STEP4: Edit tLogRow and enable Table view as below.

STEP5: Run the job and check the output.
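
The same flow can be sketched in plain Java to show what each component contributes. This is an illustration under assumptions, not Talend's generated code: the class and method names are made up, the file mask is treated as a glob pattern such as "*.csv", and the ";" field separator is assumed because the tutorial does not state which one its files use.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

class FileListSketch {
    // Reads every file in dir that matches fileMask, skipping headerLines lines in each file.
    static List<String[]> readAll(Path dir, String fileMask, int headerLines, String separator)
            throws IOException {
        List<Path> files = new ArrayList<>();
        // tFileList step: collect the regular files that match the mask.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, fileMask)) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) {
                    files.add(file);
                }
            }
        }
        Collections.sort(files); // fixed order so that runs are repeatable

        List<String[]> rows = new ArrayList<>();
        for (Path file : files) {
            // tFileInputDelimited step: read the current file and skip its header.
            List<String> lines = Files.readAllLines(file);
            for (int i = headerLines; i < lines.size(); i++) {
                String[] row = lines.get(i).split(Pattern.quote(separator), -1);
                System.out.println(String.join(" | ", row)); // tLogRow step
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: FileListSketch <directory> <fileMask>");
            return;
        }
        readAll(Paths.get(args[0]), args[1], 1, ";"); // 1 header line, as in STEP3
    }
}
```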

Transforming a list of files into a data flow [tIterateToFlow]

The tIterateToFlow component collects the values produced by each iteration and passes them on to the next component as rows of a data flow. Let us see this with the scenario below: read the files in a folder and display the names of all the files along with the date on which they were read.

STEP1: Drag and drop tFileList, tIterateToFlow and tLogRow from the palette onto the design workspace. Join tFileList to tIterateToFlow with an Iterate link, and join tIterateToFlow to tLogRow with a Main link.

STEP2: Edit the tFileList, tIterateToFlow and tLogRow properties as below.

((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))
TalendDate.getDate("DD-MM-CCYY hh:mm:ss")

STEP3: Run the job and check the output. We can see that every file is listed with the same date and time.
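
As a rough plain-Java picture of this job, each iteration contributes one row made of the current file path and the current date. The sketch below is an assumption-laden illustration: the class and method names are invented, and the SimpleDateFormat pattern "dd-MM-yyyy HH:mm:ss" is only an approximation of the Talend pattern "DD-MM-CCYY hh:mm:ss" used above.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class IterateToFlowSketch {
    // Turns a list of iterated file paths into rows of (file path, read date).
    static List<String[]> toFlow(List<String> filePaths) {
        SimpleDateFormat format = new SimpleDateFormat("dd-MM-yyyy HH:mm:ss"); // assumed equivalent pattern
        List<String[]> rows = new ArrayList<>();
        for (String path : filePaths) {
            // One row per iteration: the current file path plus the current timestamp.
            rows.add(new String[] { path, format.format(new Date()) });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical file names, for illustration only.
        for (String[] row : toFlow(List.of("/tmp/in/file1.txt", "/tmp/in/file2.txt"))) {
            System.out.println(row[0] + " | " + row[1]); // tLogRow step
        }
    }
}
```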

Transforming a data flow into a list [tFlowToIterate]

In this scenario, tFlowToIterate reads the file names that are stored in a file and processes each of those files one by one, as shown below. Let us first take a look at the files used in this scenario.

STEP1: Drag and drop tFileInputDelimited, tFlowToIterate, a second tFileInputDelimited and tLogRow from the palette onto the design workspace. Join tFileInputDelimited_1 to tFlowToIterate with a Main link, join tFlowToIterate to tFileInputDelimited_2 with an Iterate link, and join tFileInputDelimited_2 to tLogRow with a Main link.

STEP2: Edit the tFileInputDelimited_1, tFlowToIterate, tFileInputDelimited_2 and tLogRow properties as below. Note that no changes are made to tFlowToIterate.

STEP3: Run the job and check the output. We can see that the content of each file named in the list is read and displayed, one file at a time.
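
In plain Java terms, every row of the first file becomes one iteration, and the second reader picks up the file name from globalMap. The sketch below assumes that the list file holds one file path per line and that the value is stored under the key "row1.filename"; the row name, the column name and the class name are all assumptions, since the actual settings are only visible in the screenshots.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlowToIterateSketch {
    // Reads file paths from listFile and returns the lines of each listed file, in order.
    static List<String> process(Path listFile) throws IOException {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String> output = new ArrayList<>();
        for (String row : Files.readAllLines(listFile)) { // tFileInputDelimited_1
            if (row.trim().isEmpty()) {
                continue; // ignore blank lines in the list file
            }
            // tFlowToIterate step: expose the current row's column as a global variable.
            globalMap.put("row1.filename", row.trim()); // assumed key
            // tFileInputDelimited_2 step: open the file named by the global variable.
            Path current = Paths.get((String) globalMap.get("row1.filename"));
            for (String line : Files.readAllLines(current)) {
                System.out.println(line); // tLogRow step
                output.add(line);
            }
        }
        return output;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: FlowToIterateSketch <listFile>");
            return;
        }
        process(Paths.get(args[0]));
    }
}
```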

Iterating on files and merging the file content [tUnite]

In this scenario, we will read each file, combine the content of all the files and display it.

The input file structure is given below.

STEP1: Drag and drop tFileList, tFileInputDelimited, tUnite and tLogRow from the palette onto the design workspace. Join tFileList to tFileInputDelimited with an Iterate link, join tFileInputDelimited to tUnite with a Merge link, and join tUnite to tLogRow with a Main link, as shown below.

STEP2: Edit the tFileList, tFileInputDelimited_1, tUnite_1 and tLogRow properties as below.

STEP3: Run the job and check the output. We can see that the content of all the files has been combined.
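
Conceptually, the merge step appends the rows of each iteration to a single output flow. A minimal plain-Java sketch of that idea, with invented names and sample rows, could look like this:

```java
import java.util.ArrayList;
import java.util.List;

class UniteSketch {
    // Appends the rows of every input flow, in order, into one merged flow.
    static List<String> unite(List<List<String>> flows) {
        List<String> merged = new ArrayList<>();
        for (List<String> flow : flows) {
            merged.addAll(flow);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical rows standing in for the content of two input files.
        List<String> file1 = List.of("1;Alice", "2;Bob");
        List<String> file2 = List.of("3;Carol");
        for (String row : unite(List.of(file1, file2))) {
            System.out.println(row); // tLogRow step
        }
    }
}
```

Because every input has the same structure, the merge needs no column mapping; rows are simply concatenated.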

Executing a job multiple times [tLoop]

This example introduces tLoop, tSleep and a child job named tChild. tLoop is useful when we need to run a loop, tSleep pauses execution for a certain time, and tChild is the child job that is run from the parent job.

First, let us see how the child job is designed. The child job generates random rows and prints them along with a timestamp, as below.

Child Job :

STEP1: Drag and drop tRowGenerator, tLogRow and tJava as shown below. Connect tRowGenerator to tLogRow with Main, and connect tLogRow to tJava with Main.

STEP2: Edit the tRowGenerator, tLogRow and tJava_1 properties as below. Note that we have limited the rows to 5 in tRowGenerator.

Parent Job :

STEP1: Now create a new job with the name tLoop, and drag and drop the tLoop and tSleep components from the palette. Drag and drop the tChild job that we developed above onto the design workspace (a job dropped onto the workspace is added as a tRunJob component), and join the components as below.

STEP2: Edit the properties of tLoop and tSleep as below. Note that no changes are made to tChild.

STEP3: Save the job, run it and check the output. Note that the child job is called during the execution, and the sleep step pauses for 10 seconds.
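
The parent job can be pictured as an ordinary counted loop that calls the child job and then pauses. The sketch below is only an illustration: the loop bounds, the order of the child call and the pause, and all names are assumptions (the real settings are in the screenshots), and the demo pause is shortened to 1 second instead of the tutorial's 10 seconds.

```java
import java.time.LocalDateTime;
import java.util.Random;

class LoopSketch {
    // Runs childJob once per loop value from 'from' to 'to' (inclusive), pausing after each run.
    static int runLoop(int from, int to, int step, long pauseMillis, Runnable childJob) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        int runs = 0;
        for (int i = from; i <= to; i += step) { // tLoop step
            childJob.run();                      // child job call
            runs++;
            try {
                Thread.sleep(pauseMillis);       // tSleep step
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop looping if the thread is interrupted
            }
        }
        return runs;
    }

    // Stand-in for the child job: prints 5 random rows with a timestamp.
    static void childJob() {
        Random random = new Random();
        for (int row = 1; row <= 5; row++) {
            System.out.println(row + " | " + random.nextInt(1000) + " | " + LocalDateTime.now());
        }
    }

    public static void main(String[] args) {
        int runs = runLoop(1, 3, 1, 1000L, LoopSketch::childJob); // assumed bounds, shortened pause
        System.out.println("child job executed " + runs + " times");
    }
}
```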

Triggering a task before a job [tPrejob] and after a job [tPostjob]

tPrejob is used to trigger the steps that execute before the main flow runs. Similarly, tPostjob is used to trigger the steps that execute after the main flow.

Here we will use tPrejob to take a backup of the file that will be modified in the main flow, and tPostjob to send an email after the main flow is completed.

STEP1: Drag and drop tPrejob, tPostjob, tFileCopy, tSendMail, tRowGenerator and tFileOutputDelimited as shown below. Connect tPrejob to tFileCopy and tPostjob to tSendMail using OnComponentOk. Connect tRowGenerator to tFileOutputDelimited using Row Main.

STEP2: Edit the properties of tRowGenerator, tFileOutputDelimited, tFileCopy and tSendMail as below.

STEP3: Save the job, run it and check the output. Note that the backup of the file has been taken and the mail has been sent.
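
The order of execution can be sketched in plain Java as three phases: backup, main flow, notification. This is a simplified illustration with invented names and file paths; the email step is replaced by a log entry because no mail server is assumed, and error handling is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class PrePostJobSketch {
    // Backs up target, overwrites it with newRows, then records a notification.
    static List<String> run(Path target, Path backup, List<String> newRows) throws IOException {
        List<String> log = new ArrayList<>();

        // tPrejob -> tFileCopy: back up the file before the main flow touches it.
        if (Files.exists(target)) {
            Files.copy(target, backup, StandardCopyOption.REPLACE_EXISTING);
            log.add("prejob: backup written to " + backup);
        }

        // Main flow: tRowGenerator -> tFileOutputDelimited.
        Files.write(target, newRows);
        log.add("main: " + newRows.size() + " rows written to " + target);

        // tPostjob -> tSendMail: placeholder for the email notification.
        log.add("postjob: notification sent");
        return log;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical files in a temporary directory, for illustration only.
        Path dir = Files.createTempDirectory("prepost-demo");
        Path target = dir.resolve("output.csv");
        Files.write(target, List.of("old content"));
        for (String entry : run(target, dir.resolve("output.csv.bak"), List.of("1;a", "2;b"))) {
            System.out.println(entry);
        }
    }
}
```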

This completes the tutorial on orchestration components.
