module__WebOfKnowledgeReader
#org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader
Reads Web of Knowledge search result import files.
WARNING: WoK delivers files with an incorrect byte order mark (BOM). Remove it with a text editor before feeding the file to org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader.
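If editing each file by hand is impractical, the BOM can also be removed with a short script. The following is a minimal sketch, not part of AlvisNLP: the function names `strip_bom` and `strip_bom_in_file` are hypothetical, and since this page does not say which BOM WoK writes, the sketch simply drops any leading UTF-8, UTF-16 or UTF-32 BOM bytes without re-encoding the rest of the file.

```python
import codecs

# Known BOM byte sequences. UTF-32 comes first because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading byte order mark, if one is present."""
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data


def strip_bom_in_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, dropping a leading BOM (hypothetical helper)."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_bom(data))
```

The sketch only removes the mark. If the file content is in an encoding the reader does not expect, it still has to be converted separately.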
The PT (Publication Type) field is used as a document marker: org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader creates a new document each time it reads a PT field.
The VR field is read; if its value is anything other than "1.0", org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader fails.
The following fields will be read and stored as document features, one feature per line: AU, AF, BA, BF, CA, GP, BE, SO, SE, BS, LA, CT, CY, CL, SP, HO, C1, RP, EM, RI, OI, FU, CR, TC, Z9, PU, PI, PA, SN, BN, J9, JI, PD, PY, VL, IS, PN, SU, MA, BP, EP, AR, DI, D2, PG, P2, GA, UT, SI, NR.
The following fields will be read and stored as document features, several features per line split with semicolons: DE, DT, ID, WC, SC.
The following fields will be read and stored as sections, all lines concatenated for the contents: TI, AB, FX.
The following fields will be ignored: ER, EF, FN.
The feature and section names are the 2-character field code. For an interpretation of field codes, see WoK format documentation.
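The field handling described above can be illustrated with a small Python sketch. It is not the module's implementation, and it rests on several assumptions that this page does not state: each field line starts with the 2-character code followed by a space, continuation lines are indented, section lines are joined with a single space, the PT value itself is not stored, and unknown codes are skipped. The function name `parse_wok` and the sample record are made up for illustration.

```python
# Field groups copied from the description above.
ONE_PER_LINE = set(
    "AU AF BA BF CA GP BE SO SE BS LA CT CY CL SP HO C1 RP EM RI OI FU CR "
    "TC Z9 PU PI PA SN BN J9 JI PD PY VL IS PN SU MA BP EP AR DI D2 PG P2 "
    "GA UT SI NR".split()
)
SPLIT_ON_SEMICOLON = {"DE", "DT", "ID", "WC", "SC"}
SECTIONS = {"TI", "AB", "FX"}
IGNORED = {"ER", "EF", "FN"}


def parse_wok(lines):
    """Turn tagged WoK lines into a list of documents (illustrative only)."""
    documents = []
    current = None  # document being filled, None until the first PT field
    tag = None      # code of the field the current line belongs to
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        if line[:2].strip():
            # Assumption: a non-blank 2-character prefix starts a new field.
            tag, value = line[:2], line[3:].strip()
        else:
            # Assumption: an indented line continues the previous field.
            if tag is None:
                continue
            value = line.strip()
        if tag in IGNORED:
            continue
        if tag == "VR":
            if value != "1.0":
                raise ValueError("unsupported WoK version: " + value)
            continue
        if tag == "PT":
            # Each PT field starts a new document; its value is not stored here.
            current = {"features": [], "sections": {}}
            documents.append(current)
            continue
        if current is None:
            continue
        if tag in ONE_PER_LINE:
            # One feature per line.
            current["features"].append((tag, value))
        elif tag in SPLIT_ON_SEMICOLON:
            # Several features per line, split on semicolons.
            for part in value.split(";"):
                if part.strip():
                    current["features"].append((tag, part.strip()))
        elif tag in SECTIONS:
            # All lines of the field are concatenated into one section.
            previous = current["sections"].get(tag)
            current["sections"][tag] = (
                previous + " " + value if previous else value
            )
    return documents


# Made-up sample input with two records.
sample = [
    "FN Example export",
    "VR 1.0",
    "PT J",
    "AU Smith, J",
    "   Doe, A",
    "TI A study of",
    "   something",
    "DE parsing; text mining",
    "ER",
    "",
    "PT J",
    "AU Roe, R",
    "ER",
    "EF",
]
docs = parse_wok(sample)
```

With this sample, `docs` holds two documents. The first has two AU features (one per line), two DE features (split on the semicolon), and a single TI section built from both title lines.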
Optional
Type: SourceStream
Location of the WoK file(s).
Optional
Type: Mapping
Constant features to add to each document created by this module.
Optional
Type: Mapping
Constant features to add to each section created by this module.
Default value: false
Type: Boolean
Read files in tabular export format.