tei2slob

This is a tool to convert TEI P5 dictionaries to slob format. Some free TEI P5 dictionaries are available at http://freedict.org/

Installation

Create a Python 3 virtual environment and install slob.py as described at http://github.com/itkach/slob/.

In this virtual environment, run:

pip install git+https://github.com/itkach/tei2slob.git

Usage

Download a dictionary archive and unpack it. For example:

wget http://downloads.sourceforge.net/project/freedict/English%20-%20German/0.3.6/freedict-eng-deu-0.3.6.src.tar.bz2
tar -xvf freedict-eng-deu-0.3.6.src.tar.bz2

Then run the converter:

tei2slob eng-deu/eng-deu.tei

eng-deu-0.3.6.slob will be created in the same directory.

The converter attempts to populate dictionary tags from the information in the .tei header section. This may fail, because the way some elements (such as the license name) are represented is not standardized and varies across dictionaries, so be sure to check the tags:

slob info eng-deu-0.3.6.slob

Set tag values as necessary, for example:

slob tag -n license.name -v "GNU General Public License" eng-deu-0.3.6.slob
slob tag -n license.url -v "http://www.gnu.org/licenses/gpl.html" eng-deu-0.3.6.slob
slob tag -n created.by -v me@example.com eng-deu-0.3.6.slob

uri is an important tag. When different dictionaries have the same uri, it means they contain keys belonging to the same logical dictionary. So when compiling a new version of an existing dictionary, make sure uri remains the same.
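As a minimal sketch of keeping uri stable between releases, the snippet below sets it with the same slob tag syntax shown above and then prints the tags for review. The URI value is a made-up placeholder, and the sketch assumes that uri can be set like any other tag; check the slob info output to confirm.

```shell
# Hypothetical URI: reuse the same value for every release of this dictionary.
DICT_URI="http://example.com/dictionaries/freedict-eng-deu"
SLOB_FILE="eng-deu-0.3.6.slob"

# Only touch the file when the slob tool and the compiled dictionary are present.
if command -v slob >/dev/null 2>&1 && [ -f "$SLOB_FILE" ]; then
    # Assumption: uri is writable with the same -n/-v syntax as other tags.
    slob tag -n uri -v "$DICT_URI" "$SLOB_FILE"
    # Print the tags so the result can be checked by eye.
    slob info "$SLOB_FILE"
else
    echo "skipping: slob or $SLOB_FILE not available"
fi
```

Keeping the value in one variable makes it easy to reuse unchanged when a newer version of the same dictionary is compiled.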

usage: tei2slob [-h] [-o OUTPUT_FILE] [-c {lzma2,zlib}] [-b BIN_SIZE]
                [-a CREATED_BY] [-w WORK_DIR]
                input_file

positional arguments:
  input_file            TEI file name

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Name of output slob file
  -c {lzma2,zlib}, --compression {lzma2,zlib}
                        Name of compression to use. Default: zlib
  -b BIN_SIZE, --bin-size BIN_SIZE
                        Minimum storage bin size in kilobytes. Default: 256
  -a CREATED_BY, --created-by CREATED_BY
                        Value for created.by tag. Identifier (e.g. name or
                        email) for slob file creator
  -w WORK_DIR, --work-dir WORK_DIR
                        Directory for temporary files created during
                        compilation. Default: .
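
For illustration, here is one way the options above might be combined in a single run. The output name, bin size, and work directory are arbitrary example values, not recommendations; only the flags themselves come from the help text.

```shell
# Example values; adjust to your own dictionary and environment.
INPUT_FILE="eng-deu/eng-deu.tei"
OUTPUT_FILE="eng-deu.slob"
WORK_DIR="${TMPDIR:-/tmp}"   # keep temporary files out of the current directory

# Run only when the converter is installed and the input file exists.
if command -v tei2slob >/dev/null 2>&1 && [ -f "$INPUT_FILE" ]; then
    # lzma2 instead of the default zlib, a 512 KB minimum bin size,
    # and created.by set at compile time rather than with slob tag afterwards.
    tei2slob -o "$OUTPUT_FILE" -c lzma2 -b 512 -a me@example.com \
             -w "$WORK_DIR" "$INPUT_FILE"
else
    echo "skipping: tei2slob or $INPUT_FILE not available"
fi
```

Passing -a here sets created.by during conversion, which saves a separate slob tag call later.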