filters
Mahmoud Ben Hassine edited this page Feb 7, 2020 · 3 revisions
You can filter records using a `RecordFilter`. This interface lets you skip the remaining stages of the pipeline for any record that satisfies a given predicate.
Typical examples are:
- filter comment records (those beginning with # for example) in a flat file
- filter log files (with extension .log) when processing a set of files in a directory
- etc
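The idea of filtering by predicate can be sketched without the library. The block below is a minimal, self-contained illustration and not Easy Batch's actual API: `SketchFilter`, `FilterExamples`, `dropIf` and `apply` are hypothetical names, and the convention that a `null` result means "filtered" is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a record filter contract (not the Easy Batch interface).
// Assumption for this sketch: returning null means the record is filtered out.
interface SketchFilter<T> {
    T filter(T record);
}

class FilterExamples {

    // Builds a filter that drops every record matching the given predicate.
    static <T> SketchFilter<T> dropIf(Predicate<T> predicate) {
        return record -> predicate.test(record) ? null : record;
    }

    // Applies the filter to each record and collects the records that were not filtered.
    static <T> List<T> apply(SketchFilter<T> filter, List<T> records) {
        List<T> kept = new ArrayList<>();
        for (T record : records) {
            T result = filter.filter(record);
            if (result != null) {
                kept.add(result);
            }
        }
        return kept;
    }
}
```

With this sketch, `FilterExamples.<String>dropIf(line -> line.startsWith("#"))` would drop comment lines, and `FilterExamples.<String>dropIf(name -> name.endsWith(".log"))` would drop log file names.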
To register a record filter, you can use the `JobBuilder` API as follows:
    Job job = new JobBuilder()
        .filter(new MyRecordFilter())
        .build();
You can register as many filters as you want, anywhere in the pipeline. For each filtered record, the remaining stages of the pipeline are skipped. There are several built-in implementations of commonly used filters:
Filter | Record type | Module | Description |
---|---|---|---|
EmptyStringRecordFilter | StringRecord | easy-batch-core | Filter String records with empty payload |
StartsWithStringRecordFilter | StringRecord | easy-batch-core | Filter String records starting with a given prefix |
EndsWithStringRecordFilter | StringRecord | easy-batch-core | Filter String records ending with a given suffix |
GrepFilter | StringRecord | easy-batch-core | Keep String records containing the given pattern |
HeaderRecordFilter | Record | easy-batch-core | Filter the header record (first record in the data source) |
FilteredRecordsCollector | Record | easy-batch-core | Saves filtered records for later use |
FileExtensionFilter | FileRecord | easy-batch-core | Filter File records having a file name ending with a given extension |
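
To illustrate how several filters placed at different points of a pipeline behave, here is a minimal, library-free sketch. It does not use the Easy Batch classes: `PipelineSketch`, `run` and `demo` are hypothetical names, stages are modelled as plain `UnaryOperator<String>` functions, and a `null` result is assumed to mean "filtered".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class PipelineSketch {

    // Runs each record through the stages in order.
    // Assumption: a stage returning null marks the record as filtered,
    // and the remaining stages are skipped for that record.
    static List<String> run(List<String> records, List<UnaryOperator<String>> stages) {
        List<String> output = new ArrayList<>();
        for (String record : records) {
            String current = record;
            for (UnaryOperator<String> stage : stages) {
                current = stage.apply(current);
                if (current == null) {
                    break; // filtered: skip the remaining stages
                }
            }
            if (current != null) {
                output.add(current);
            }
        }
        return output;
    }

    // Two filters with a processing stage in between.
    static List<String> demo() {
        UnaryOperator<String> skipEmpty = r -> r.isEmpty() ? null : r;
        UnaryOperator<String> upperCase = String::toUpperCase;
        UnaryOperator<String> skipComments = r -> r.startsWith("#") ? null : r;
        return run(Arrays.asList("", "# note", "foo", "bar"),
                   Arrays.asList(skipEmpty, upperCase, skipComments));
    }
}
```

In this sketch the empty record never reaches the upper-casing stage, while the comment record is upper-cased first and only then dropped by the second filter, so the position of a filter determines how much work is done on records that end up filtered.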
Easy Batch is created by Mahmoud Ben Hassine with the help of some awesome contributors