This project is a conversion tool to Extract, Transform, and Load (ETL) MIAPA-compliant data from NeXML files into valid ISAtab format.

Details on the logical mapping between the two formats are in the Links section.

Usage

From the project directory, run

./miapa-etl.sh 

The script has inline instructions. It requires a SAXON jar; I use SAXON 9.4 HE. You'll find 'saxon9he.jar' in SaxonHE9-4-0-4J.zip.

Optionally, you can pass it the NeXML input file, the XSLT stylesheet, and the SAXON jar as arguments, in that order:

./miapa-etl.sh path/to/NeXML.file ./nexml-isatab.xsl path/to/your.SAXON.jar
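Since all three arguments are file paths, a quick existence check before calling the script can save a confusing failure. The following is only a sketch: check_inputs is a hypothetical helper, not part of this project, and it assumes nothing about what miapa-etl.sh does internally.

```shell
# Hypothetical helper (not part of miapa-etl.sh): verify that the NeXML file,
# the stylesheet, and the SAXON jar all exist as regular files.
check_inputs() {
  for f in "$1" "$2" "$3"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done
  return 0
}
```

It can be chained in front of the real call, for example: check_inputs path/to/NeXML.file ./nexml-isatab.xsl path/to/your.SAXON.jar && ./miapa-etl.sh path/to/NeXML.file ./nexml-isatab.xsl path/to/your.SAXON.jar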

The tool will create an output directory named after the input file and place the ISAtab files in this directory.
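To predict where the output will land, the directory name can be derived from the input path. This sketch assumes the name is the input file's base name with its extension removed, which matches the S10410.xml example in this README but is not confirmed against the script itself; expected_outdir is a hypothetical helper.

```shell
# Hypothetical helper: guess the output directory for a given input file.
# Assumption: the tool uses the file name without its directory or extension.
expected_outdir() {
  name="$(basename "$1")"   # drop the leading directories
  echo "${name%.*}"         # drop the last extension, if any
}
```

For the example input nexmlex/S10410.xml this gives S10410.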

Examples

The examples directory contains NeXML files and corresponding ISAtab files, taken from Rutger Vos' supertreebase. You should be able to reproduce them using the tool. For instance, to transform the TreeBASE file S10410.xml from the examples, type

./miapa-etl.sh nexmlex/S10410.xml ./nexml-isatab.xsl ~/git/working/SAXON9he.jar

Assuming ~/git/working/SAXON9he.jar actually points to your SAXON jar, this should produce the following files in your project directory:

S10410/a_S10410_1.txt
S10410/a_S10410_2.txt
S10410/i_S10410_1.txt
S10410/s_S10410_1.txt
S10410/s_S10410_2.txt
S10410/s_S10410_3.txt
S10410/s_S10410_4.txt

These should be identical to their counterparts in examples/S10410.
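One way to confirm this is a recursive diff of the two directories. The sketch below wraps the standard diff -r in a hypothetical helper named same_tree; it prints any differences and returns 0 only when the directories match.

```shell
# Hypothetical helper: compare a generated output directory with a reference one.
# diff -r exits with 0 when both trees have the same files with the same content.
same_tree() {
  diff -r "$1" "$2"
}
```

After running the example, same_tree S10410 examples/S10410 && echo "identical" should print "identical" if the conversion reproduced the reference files.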

Latest Update

As of July 18, 2012, the XSLT correctly converts TreeBASE NeXML files into ISAtab files. Work continues:

  • On a finalized ISA config for Phylogenetics

Links

To contribute, please fork and use pull requests.