Cross Lingual Tool is a Python framework for Cross-Lingual Transfer Learning experiments on Named Entity Recognition and Relation Extraction tasks.
Each experiment consists of three steps:
- Train a model on the source data (Model I)
- Fine-tune Model I on the target data
- Train a separate model on the target data (Model II)
After each step, the trained model and its classification report are saved to the predefined folder.
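The three steps can be pictured with a toy sketch. None of this is the tool's actual code: `train` and `run_experiment` are hypothetical names, and the "model" is only a list of the datasets it has seen. That is enough to show that step 2 starts from Model I, while step 3 starts from scratch.

```python
from typing import Dict, List, Optional


def train(model: Optional[List[str]], data_name: str) -> List[str]:
    """Hypothetical stand-in for training: return a new 'model' that
    records which datasets it was trained on, in order."""
    # Copy the history so the model passed in is left unchanged.
    history = [] if model is None else list(model)
    history.append(data_name)
    return history


def run_experiment(source: str, target: str) -> Dict[str, List[str]]:
    model_1 = train(None, source)           # step 1: train on source data (Model I)
    model_1_ft = train(model_1, target)     # step 2: fine-tune Model I on target data
    model_2 = train(None, target)           # step 3: train on target data (Model II)
    return {
        "model_1": model_1,
        "model_1_finetuned": model_1_ft,
        "model_2": model_2,
    }


results = run_experiment("source", "target")
for name, history in results.items():
    print(name, history)
```

Comparing the fine-tuned Model I with Model II on the target test set is what would show whether the source data helped.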
git clone https://github.com/apugachev/CrossLingualTool.git
python3 -m venv cross-env
source cross-env/bin/activate
cd crosslingualtool
pip3 install -r requirements.txt
The source and target folders must each contain train.json, dev.json and test.json. An entity_mapping.json file is also required.
Examples of files for both tasks are provided here.
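Before launching a run, it can help to confirm that a data folder has the three required files. The check below is a minimal sketch and not part of the tool: `missing_files` is a hypothetical helper, and the demo writes placeholder contents rather than the tool's real data format.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("train.json", "dev.json", "test.json")


def missing_files(folder):
    """Return the required file names that are absent from `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Demonstration on a temporary folder that deliberately lacks test.json.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train.json", "dev.json"):
        # Placeholder contents only; the real files follow the tool's own schema.
        (Path(tmp) / name).write_text("[]")
    demo_missing = missing_files(tmp)
    print(demo_missing)
```

Calling `missing_files` on both the source and the target folder and stopping when either list is non-empty would catch a misplaced file before any training starts.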
python3 run_ner.py \
--source_data_path ../source_data/ \
--target_data_path ../target_data/ \
--entity_mapping_path ../entity_mapping.json \
--save_folder ../save_folder/ \
--pretrained_path bert-base-multilingual-uncased \
--batch_size 64 \
--lr 0.00001 \
--max_length 128 \
--max_epoch 20 \
--f1_avg macro \
--early_stopping loss \
--patience 1 \
--device cpu
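The `--early_stopping loss` and `--patience 1` options suggest patience-based early stopping on the validation loss. The sketch below illustrates that general technique under the assumption that training stops once the loss has failed to improve for `patience` consecutive epochs. The tool's exact rule may differ, and `stopping_epoch` is a hypothetical name.

```python
def stopping_epoch(val_losses, patience=1):
    """Return the 1-based epoch at which training would stop, or None
    if it runs through every epoch in `val_losses`."""
    best = float("inf")
    bad_epochs = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None


# Made-up validation losses, for illustration only.
losses = [0.90, 0.70, 0.65, 0.68, 0.60]
print(stopping_epoch(losses, patience=1))
print(stopping_epoch(losses, patience=2))
```

With a patience of 1, a single epoch without improvement ends training, so a noisy validation loss can stop a run early. A larger patience tolerates such bumps at the cost of extra epochs.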