module__org.bibliome.alvisnlp.modules.tika.TikaReader
Robert Bossy edited this page Jul 27, 2017 · 1 revision
# org.bibliome.alvisnlp.modules.tika.TikaReader
Reads PDF or DOC files and adds one document to the corpus for each file.
This module is experimental.
Optional
Type: SourceStream
Path to the source directory or source file.
Optional
Type: Mapping
UNDOCUMENTED
Optional
Type: Mapping
Constant features to add to each document created by this module.
Optional
Type: Mapping
Constant features to add to each section created by this module.
Default value: html
Type: String
Default value: text
Type: String
Name of the single section containing the whole contents of a file.
Default value: tag
Type: String