Step 1: Create the ETL pipeline
The first step is to create an ETL pipeline.
For this exercise, use a CSV file with the following fields: Customer_ID, Join_Date

Open Filezilla (or any FTP/SFTP) utility tool and connect to the remote site.
Place the CSV file in the connecting remote site.
To go to the ETL pipeline template, click on βETL pipelineβ under the βData managementβ menu on the left side panel.

To create a new ETL pipeline, click the [+ ETL pipeline] button under the βETL pipelineβ tab.

To label the ETL pipeline and execution log entity, add the following information:
Enter the βETL pipelineβ name as βUserName_Acc_customersβ.
Optionally add a description for this ETL pipeline.
Enter a name for the execution log for this ETL pipeline as βcsvtoentityβ. This log will be updated each time the ETL pipeline is executed and can be viewed by clicking βGo to recordsβ.

To connect to the data source, add the necessary information in the βData Sourceβ function. To add an existing connector:
Select the βConnectorβ tab.
Select the relevant connector source. In this exercise, βTutorial_SFTPβ is selected. (The source can be selected from the drop-down menu or a new source can be added by clicking the [+] button and adding relevant details.)
To edit the settings, click the βEdit the settingsβ arrow. In the settings, to add or update the details for the FTP/SFTP Connector, update the required fields (click here for the steps). If a connector does not exist, to create a new connector In the Data Source in the ETL pipeline, click the [+] button and add the required fields.

To add the necessary details to connect with the data destination, in the βData Destinationβ section:
Select the βEntityβ tab.
Select destination as βUserName_Acc_skipCustomersβ.

To disallow multiple simultaneous runs of the ETL pipeline leave the toggle button enabled for βskip execution while in progressβ. Enabling this toggle button defines that the execution of this ETL pipeline will be skipped when there is one already in progress.

The default value for ETL pipeline execution is selected as βImmediateβ. For this exercise, keep it as is.

To define the shared variables, go to the βVariablesβ tab. The βVariablesβ tab consists of the shared variable βStartupParametersβ with the data type as βListβ. To add the variables for Actions to use, create the following additional three shared variables by clicking the [+ Variable] button.

These variables will cover the scope of all the sections in the ETL pipeline.
To store the the value for the joining month, define the following variable: Variable name: βdefJoinMonthβ, Data Type: βintegerβ, Initialized value: β3β

To store the the value for the joining day, define the following variable: Variable name: βdefJoinDayβ, Data Type: integer, Not initialized

To store the the value for the complete joining date, define the following variable: Variable name: βjoinedDateβ, Data Type: DateTime, Not initialized

To define the Reader (reading from the source) and Writer (writing on the destination) settings, go to the βData Formatβ tab. To define the stream, select βReader Streamβ as βCSV Streamβ.

To define the settings for the βCSV Streamβ, click on the βEdit the settingsβ arrow to add details for the βReader streamβ.

βFTP CSV Format Detailsβ box will be displayed. To define the βFTP CSV format detailsβ settings, add the following details:
Add the exact file path. This is the path to the file from which the CSV data will be read.
Select βCharacter Setβ as βUniversal (UTF-8)β.
Select βLanguageβ as βEnglishβ.
To define that the data is read from the first row, in the βStart reading CSV from lineβ field, enter the digit β1β. The field βHeader present in the first rowβ is selected by default.
To define the separator based on which the fields are distinguished, select Separator as βCommaβ.
The [Sample Data] button gets activated. To define the sample entity fields, click the [Sample Data] button.

When the [Sample Data]button is clicked, a box will be displayed to add the header file sample that is in the CSV file. Add the header file sample as follows:
Customer_ID,Join_DateThe header sample will contain the details of the fields to be read from the target Entity. To save the sample, click the [OK] button.

The fields of the entity will be displayed (based on the sample data entered).

To copy the CSV fields to Source fields in the entity, click the activated [Copy Sample to Source Fields] button.

The fields will be populated with string data types in the Source Fields section.

Update the data type for the βJoin_Dateβ field to DateTime (as per the field's data type in the destination entity).

To accept the Ftp CSV format details settings, click the [Accept & Collapse] button.

After the βReaderβ settings are defined, to define βWriterβ settings, select the Data Format>Writer tab. To define that the data is written to the destination as mentioned in reader source fields without transforming it, select βWriter Modeβ as βAppendβ.

To map the reader source fields to the entity destination fields, select the βField Mappingβ tab. To add the Mapped Fields, click the [+ Field] button and add relevant information.

In case of CSV files, the sequence of fields top-down must be exactly the same as the sequence specified in the Reader.
Mapping Sources: Select the field specified in the Reader. Select Variables>Reader>(field name). It displays as βreader.(Field Name)β.
Select field from Entity: Select mapping field from the Entity.
Data Type: Select the data type for the mapping fields.

Add both the required fields in the βField Mappingβ section.
The Customer_ID field contains the odd numbers from 1 to 17 and is mapped to the target entity field βCustomer_IDβ.
The βJoin_Dateβ contains only the value for the day in the date and is mapped to a Shared variable βjoinedDateβ of data type DateTime that is not initialized. (When a DateTime variable is not initialized to any value, it picks up the default date of β1970β01-01 00:00:00β.)

To add a target entity, go to the Target Entities tab and click [+ Target entity].

Select the entity βcsvtoentityβ.

Last updated
