Tutorial - Talend Component Kit #3: Building an Input Component (Dataset, Datastore, Partition Mapper & Source)

This part of the tutorial explains the structure of an input component of the "Talend Component Kit". The focus is on the structure and interplay of the component's individual parts, not on their implementation, which is covered in Parts 4 and 5 of this tutorial.

After the third part you will know:

the task of datasets and datastores, as well as how they relate to each other
how a Partition Mapper works
the properties of a Source

1. Dataset & Datastore

As shown in the graphic, the datastore is part of the dataset, and the dataset is part of the input or output component. Together, the dataset and datastore tell the component where the data is located, how to access it and which data to process.

The responsibilities of dataset and datastore are clearly separated.

Datastore: Contains the data required to connect to the backend
Dataset: Contains the datastore and the information required to process the data
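
The nesting described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the class and field names (url, username, password, projectKey) are hypothetical, and the three annotations are declared locally as stand-ins so the sketch compiles without the kit on the classpath. In a real component they would be imported from org.talend.sdk.component.api.configuration (Option) and org.talend.sdk.component.api.configuration.type (DataStore, DataSet).

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Local stand-ins for the kit's annotations (assumption: used here only so
// that the sketch compiles without the Talend Component Kit dependency).
@Retention(RetentionPolicy.RUNTIME)
@interface DataStore { String value(); }

@Retention(RetentionPolicy.RUNTIME)
@interface DataSet { String value(); }

@Retention(RetentionPolicy.RUNTIME)
@interface Option { }

// Datastore: only what is needed to connect to the backend.
@DataStore("JiraDatastoreSketch")
class JiraDatastoreSketch {
    @Option String url;       // hypothetical field: base URL of the Jira instance
    @Option String username;  // hypothetical field
    @Option String password;  // hypothetical field
}

// Dataset: contains the datastore plus what is needed to select the data.
@DataSet("JiraDatasetSketch")
class JiraDatasetSketch {
    @Option JiraDatastoreSketch datastore; // the datastore is part of the dataset
    @Option String projectKey;             // hypothetical field: which project to read
}
```

Keeping the connection details in their own class means the same datastore could be reused by several datasets that read different data from the same backend.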

2. Partition Mapper

The "Partition Mapper" class must implement three methods, each marked with the corresponding annotation:

Assessor
Split
Emitter

The idea behind the "Partition Mapper" is to first estimate the effort of the data processing and to break it down into parts before execution, which allows more efficient execution. In the case of simple queries, such as a query against a RESTful API like Jira, this division is not necessary and the "Partition Mapper" is created only once.

2.1 Assessor

The assessor's task is to estimate how many parts the task should ideally be divided into. This number only needs to be an estimate; it does not have to be exact.

In the case of the Jira component, the assessor returns 1, since it does not make sense to split an HTTPS query.
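
A minimal sketch of such an assessor could look as follows. The class and method names are hypothetical, the return type long is an assumption, and the annotation is a local stand-in for org.talend.sdk.component.api.input.Assessor so that the example compiles on its own.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Local stand-in for the kit's Assessor annotation (assumption: declared here
// only to keep the sketch free of external dependencies).
@Retention(RetentionPolicy.RUNTIME)
@interface Assessor { }

class AssessorSketchMapper {
    // An estimate is enough. A single HTTPS query is not worth splitting,
    // so the sketch always reports 1.
    @Assessor
    public long estimateSize() {
        return 1L;
    }
}
```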

2.2 Split

The split method divides the work of the mapper: it returns a list of partition mappers, each of which only has to complete part of the task.

The split method of the Jira component returns itself in the form of a singleton list, since it will remain the only partition mapper.
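
The "return itself as a one-element list" behaviour can be sketched like this. The class name is hypothetical and the annotation is a local stand-in for org.talend.sdk.component.api.input.Split; the list itself comes from the standard library's Collections.singletonList.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Collections;
import java.util.List;

// Local stand-in for the kit's Split annotation (assumption: declared here
// only so the sketch compiles without the kit).
@Retention(RetentionPolicy.RUNTIME)
@interface Split { }

class SplitSketchMapper {
    // Nothing to divide: the mapper hands back itself as the only partition.
    @Split
    public List<SplitSketchMapper> split() {
        return Collections.singletonList(this);
    }
}
```

A mapper for a backend that can be read in parallel would instead return several mapper instances here, each configured for its own share of the data.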

2.3 Emitter

The emitter is responsible for finally instantiating the source class and executing its producer method. It does not receive any data, but uses the configuration, which is defined as an attribute in the class.

The emitter should return the instantiated source object as its result.
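
As a rough sketch of this hand-over, the mapper below keeps its configuration as an attribute and passes it to the source it creates. All class names and the projectKey field are hypothetical, and the annotation is a local stand-in for org.talend.sdk.component.api.input.Emitter.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Local stand-in for the kit's Emitter annotation (assumption: declared here
// only so the sketch compiles without the kit).
@Retention(RetentionPolicy.RUNTIME)
@interface Emitter { }

// Hypothetical configuration, reduced to a single value.
class EmitterSketchConfig {
    final String projectKey;

    EmitterSketchConfig(String projectKey) {
        this.projectKey = projectKey;
    }
}

// Hypothetical source class; its producer method is left out here.
class EmitterSketchSource {
    final EmitterSketchConfig configuration;

    EmitterSketchSource(EmitterSketchConfig configuration) {
        this.configuration = configuration;
    }
}

class EmitterSketchMapper {
    // The configuration is held as an attribute of the mapper class.
    private final EmitterSketchConfig configuration;

    EmitterSketchMapper(EmitterSketchConfig configuration) {
        this.configuration = configuration;
    }

    // Takes no arguments: builds the source from the stored configuration
    // and returns the new instance.
    @Emitter
    public EmitterSketchSource createSource() {
        return new EmitterSketchSource(configuration);
    }
}
```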

3. Producer Method in the Source Class

The producer method in the source class does the actual work of the task. The result is returned in a "Record", which can be created using the "RecordBuilderFactory".

The method is called repeatedly for each emitted source, so each call must return a different result. Once the processing of the task is complete, a null value is returned.
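
The "next result or nothing" contract can be sketched as follows. This is a simplified illustration, not the Jira implementation: the class name and the list of issue keys are hypothetical, a plain Map stands in for the kit's Record, and the annotation is a local stand-in for org.talend.sdk.component.api.input.Producer.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Local stand-in for the kit's Producer annotation (assumption: declared here
// only so the sketch compiles without the kit).
@Retention(RetentionPolicy.RUNTIME)
@interface Producer { }

class ProducerSketchSource {
    // Hypothetical, already fetched results; a real source would query the backend.
    private final Iterator<String> issueKeys;

    ProducerSketchSource(List<String> issueKeys) {
        this.issueKeys = issueKeys.iterator();
    }

    // Each call returns the next result; null signals that nothing is left.
    // The Map is a stand-in for a Record, which a real component would
    // build through the RecordBuilderFactory.
    @Producer
    public Map<String, String> next() {
        if (!issueKeys.hasNext()) {
            return null;
        }
        return Collections.singletonMap("key", issueKeys.next());
    }
}
```

Called on a source created with two issue keys, next() yields two single-entry maps and then null.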

Overview of the whole tutorial:

Tutorial - Talend Component Kit #1: Setting Up the Development Environment
Tutorial - Talend Component Kit #2: Testing your own component
Tutorial - Talend Component Kit #3: Building an Input Component (Dataset, Datastore, Partition Mapper & Source)
Tutorial - Talend Component Kit #4: Implementation of the Dataset and Datastore of a Jira Input
Tutorial - Talend Component Kit #5: Implementation of the Partition Mapper and the Source of a Jira Input
Tutorial - Talend Component Kit #6: Implementing "List" Input and further Output-Processing
Tutorial - Talend Component Kit #7: Implement "Advanced Settings" of a Jira input component
