Tutorial Talend Component Kit 3: Building an Input Component (Dataset, Datastore, Partition Mapper & Source)
This part of the tutorial explains the structure of an input component of the "Talend Component Kit". The focus is on the structure of the component and on how its individual parts relate to each other, not on the implementation. The implementation is covered in Parts 4 and 5 of this tutorial.
After this third part, you will know:
- the roles of the dataset and the datastore, and how they relate to each other
- how a Partition Mapper works
- the properties of a Source
As shown in the graphic, the datastore is part of the dataset, and the dataset is part of the input or output component. Together, the dataset and the datastore tell the component where, how, and which data is to be processed.
The responsibilities of the dataset and the datastore are clearly separated:
- Datastore: Contains the data required to connect to the backend
- Dataset: Contains the datastore and the data required to process the data
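The nesting described above can be sketched in plain Java. This is only an illustration: the class names and the fields `baseUrl`, `user`, `apiToken` and `jql` are assumptions made for this example and do not come from the tutorial. In a real component these classes would additionally carry the kit's `@DataStore` and `@DataSet` annotations, which are left out here so that the sketch compiles without the kit on the classpath.

```java
import java.util.Objects;

// Hypothetical datastore: holds only what is needed to reach the backend.
// (In a real component this class would be annotated with @DataStore.)
class JiraDatastoreSketch {
    final String baseUrl;  // assumed field: where the backend lives
    final String user;     // assumed field: who connects
    final String apiToken; // assumed field: how the user authenticates

    JiraDatastoreSketch(String baseUrl, String user, String apiToken) {
        this.baseUrl = Objects.requireNonNull(baseUrl);
        this.user = Objects.requireNonNull(user);
        this.apiToken = Objects.requireNonNull(apiToken);
    }
}

// Hypothetical dataset: contains the datastore plus what to read from it.
// (In a real component this class would be annotated with @DataSet.)
class JiraDatasetSketch {
    final JiraDatastoreSketch datastore; // the connection part
    final String jql;                    // assumed field: which data to fetch

    JiraDatasetSketch(JiraDatastoreSketch datastore, String jql) {
        this.datastore = Objects.requireNonNull(datastore);
        this.jql = Objects.requireNonNull(jql);
    }

    // Small helper that shows both levels being used together.
    String describe() {
        return "query '" + jql + "' against " + datastore.baseUrl;
    }
}
```

Keeping the connection data in its own class means that several datasets can share one datastore and differ only in what they read.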
The "Partition Mapper" class must implement three methods, each of which must be marked with the corresponding annotation:
- @Assessor
- @Split
- @Emitter
The idea behind the "Partition Mapper" is to estimate the effort of the data processing first and to break the work down into parts before execution, so that it can be executed more efficiently. For simple queries, such as a request to a RESTful API like Jira, this division is not necessary and the "Partition Mapper" is created only once.
The assessor's task is to estimate how many parts the task should ideally be divided into. This number only needs to be an estimate; it does not have to be exact.
In the case of the Jira component, the assessor returns 1, since it does not make sense to split an HTTPS query.
The split method divides the work of the mapper. It returns a list of partition mappers, each of which handles only part of the work.
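For a task that can be divided, estimating and splitting could look roughly like the following sketch. It is a plain-Java illustration and not the kit's API: the class `RangeMapperSketch`, its fields and the fixed `chunkSize` are assumptions made for this example. The method names mirror the roles of the assessor and the split method as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mapper over a range of item ids [from, to).
class RangeMapperSketch {
    final int from;      // inclusive start of the range
    final int to;        // exclusive end of the range
    final int chunkSize; // assumed: how many items one part should handle

    RangeMapperSketch(int from, int to, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        if (to < from) throw new IllegalArgumentException("to must not be smaller than from");
        this.from = from;
        this.to = to;
        this.chunkSize = chunkSize;
    }

    // Role of the assessor: estimate the number of parts (rounded up).
    long estimatePartitions() {
        int size = to - from;
        return (size + chunkSize - 1) / chunkSize;
    }

    // Role of the split method: return mappers that each cover one chunk.
    List<RangeMapperSketch> split() {
        List<RangeMapperSketch> parts = new ArrayList<>();
        for (int start = from; start < to; start += chunkSize) {
            int end = Math.min(start + chunkSize, to);
            parts.add(new RangeMapperSketch(start, end, chunkSize));
        }
        return parts;
    }
}
```

With a range of 0 to 10 and a chunk size of 4, the sketch estimates three parts and splits into the ranges 0 to 4, 4 to 8 and 8 to 10.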
The split method of the Jira component returns the mapper itself as a singleton list, since it remains the only partition mapper.
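The Jira case can be sketched in a few lines of plain Java. The class name `SingleRequestMapperSketch` is an assumption for this example. In a real component the two methods would carry the `@Assessor` and `@Split` annotations, which are omitted here so that the sketch compiles on its own.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical mapper for a task that cannot be divided,
// such as a single HTTPS request.
class SingleRequestMapperSketch {

    // Role of the assessor: one part is enough.
    long estimatePartitions() {
        return 1L;
    }

    // Role of the split method: the mapper stays the only one,
    // so it returns itself wrapped in an immutable one-element list.
    List<SingleRequestMapperSketch> split() {
        return Collections.singletonList(this);
    }
}
```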
The emitter is responsible for finally instantiating the source class, whose producer method is then executed. It does not receive any data, but uses the configuration, which is defined as an attribute of the mapper class.
The emitter should return the instantiated class as the result.
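As a rough plain-Java sketch, an emitter could hand its configuration on to a new source instance as shown below. All class and field names are assumptions made for this example. In a real component the method would be annotated with `@Emitter`.

```java
// Hypothetical configuration held by the mapper.
class QueryConfigSketch {
    final String jql; // assumed field: the query to run

    QueryConfigSketch(String jql) {
        this.jql = jql;
    }
}

// Hypothetical source that will later do the actual work.
class IssueSourceSketch {
    final QueryConfigSketch config;

    IssueSourceSketch(QueryConfigSketch config) {
        this.config = config;
    }
}

// Hypothetical mapper: the configuration is an attribute of the class,
// so the emitter needs no parameters.
class EmittingMapperSketch {
    private final QueryConfigSketch config;

    EmittingMapperSketch(QueryConfigSketch config) {
        this.config = config;
    }

    // Role of the emitter: create the source and return it.
    IssueSourceSketch createSource() {
        return new IssueSourceSketch(config);
    }
}
```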
The producer method in the source class does the actual work. The result is returned as a "Record", which can be created using the "RecordBuilderFactory".
The producer method is called repeatedly, so each call must return the next result. Once the task has been processed completely, a null value is returned.
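The "return the next result, then null" behaviour can be sketched in plain Java as follows. This is an illustration under assumptions: the class `ListSourceSketch` is hypothetical, and a `Map<String, String>` stands in for the kit's "Record" so that the example runs without the kit. In a real component the method would be annotated with `@Producer`.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical source that hands out pre-fetched items one by one.
class ListSourceSketch {
    // Remembers the position between calls, so every call
    // continues where the previous one stopped.
    private final Iterator<Map<String, String>> remaining;

    ListSourceSketch(List<Map<String, String>> items) {
        this.remaining = items.iterator();
    }

    // Role of the producer: return the next item,
    // or null once everything has been delivered.
    Map<String, String> next() {
        return remaining.hasNext() ? remaining.next() : null;
    }
}
```

A caller would keep invoking `next()` until it receives null, which signals that the source is exhausted.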
- Tutorial - Talend Component Kit #1: Setting Up the Development Environment
- Tutorial - Talend Component Kit #2: Testing your own component
- Tutorial - Talend Component Kit #4: Implementation of the Dataset and -store of a Jira-Input
- Tutorial - Talend Component Kit #6: Implementing "List" Input and further Output-Processing
- Tutorial - Talend Component Kit #7: Implement "Advanced Settings" of a Jira input component