GitHub - yaswanth-iitkgp/DepNeCTI: EMNLP 2023 Findings: DepNeCTI: Dependency-based Nested Compound Type Identification for Sanskrit

Official code for the paper DepNeCTI: Dependency-based Nested Compound Type Identification for Sanskrit

Requirements for running DepNeCTI-LSTM model

Python 3.7
cuda 11.7
torch 1.13.0
torchaudio 0.13.0
torchvision 0.14.0
The remaining dependencies can be installed by creating a new environment from the matching environment file (DepNeCTI-LSTM_environment.yml or DepNeCTI-XLMR_environment.yml).

We assume that you have installed conda beforehand.

conda env create -f DepNeCTI-LSTM_environment.yml
conda env create -f DepNeCTI-XLMR_environment.yml

Then activate the environment you created, and you are ready to go.
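Once an environment is active, a quick sanity check can confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and it only compares version strings.

```python
import sys

# Versions pinned in this README. CUDA is not a Python package, so it is not checked here.
EXPECTED = {"torch": "1.13.0", "torchaudio": "0.13.0", "torchvision": "0.14.0"}


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (found, wanted)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        # A local build tag such as "1.13.0+cu117" still counts as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Import each package if it is available and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            module = __import__(name)
            versions[name] = getattr(module, "__version__", None)
        except ImportError:
            versions[name] = None  # package is not installed in this environment
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for pkg, (found, wanted) in find_mismatches(installed_versions(EXPECTED)).items():
        print(f"{pkg}: found {found}, README lists {wanted}")
```

Running the file inside the activated environment prints one line per mismatching package and nothing beyond the Python version if everything matches.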

Where is the dataset

Datasets are given in the Datasets folder.
The datasets come in two settings, with context and without context, each with fine-grained and coarse labels.

How to generate the data format used in this model (DepNeCTI-LSTM)

Transfer the .csv files from the respective Datasets folder to Datasets/data_format.
Use the .ipynb file in the Datasets/data_format folder and follow the instructions mentioned there to generate the required data format.
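The transfer step can also be scripted. The sketch below copies (rather than moves) every .csv file from a source folder into Datasets/data_format; the function name `copy_csvs` is hypothetical, and the source folder is whichever dataset variant you choose.

```python
import shutil
from pathlib import Path


def copy_csvs(src_dir, dest_dir="Datasets/data_format"):
    """Copy every .csv file in src_dir into dest_dir and return the copied file names."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if it is missing
    copied = []
    for csv_path in sorted(src.glob("*.csv")):
        shutil.copy2(str(csv_path), str(dest / csv_path.name))
        copied.append(csv_path.name)
    return copied
```

Copying leaves the original dataset folder untouched, so the notebook can be rerun on a different variant without restoring files.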

Pretrained embeddings for DepNeCTI datasets

Pretrained FastText embeddings for DepNeCTI can be obtained from here.
Make sure that the cc.NeCTIS.300.txt file is placed in data/, and place the rest of the files in the word_vectors folder.
The main results are reported on the systems trained by combining train and dev splits.
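To inspect the embeddings after placing them, a small reader like the one below may help. It is a sketch under an assumption that this README does not confirm: that cc.NeCTIS.300.txt uses the usual fastText text layout, with one token followed by its vector values per line and an optional `count dim` header. The function name `load_text_vectors` is hypothetical.

```python
def load_text_vectors(path, limit=None):
    """Read word vectors from a whitespace-separated text file.

    ASSUMPTION: each line is `token v1 v2 ... vN`, optionally preceded by a
    `count dim` header line, as in the standard fastText text layout.
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle):
            parts = line.split()
            if line_no == 0 and len(parts) == 2:
                continue  # header line: vocabulary size and dimension
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
            if limit is not None and len(vectors) >= limit:
                break  # stop early when only a sample is needed
    return vectors
```

For example, `load_text_vectors("data/cc.NeCTIS.300.txt", limit=5)` would read only the first five entries, which is enough to confirm the file location and the vector dimension.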

If you want to get your own Pretrained embeddings for any dataset

First place all the files in the word_vectors folder as mentioned above.
Use the .ipynb file in the word_vectors folder to generate your own FastText embeddings.

How to train the proposed model for DepNeCTI-LSTM and DepNeCTI-XLMR

To run the proposed system, place the respective dataset in the data folder, following the layout of the files already there, and run the bash script run_DepNeCTI_LSTM.sh or run_DepNeCTI_XLMR.sh. With these scripts you will be able to reproduce the results for the proposed model reported in Table 2.
For example, to run the LSTM system:

bash run_DepNeCTI_LSTM.sh
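If you prefer to launch training from Python, for instance to run both variants in sequence, a thin wrapper around the two scripts could look like this. It is only a sketch: `training_command` and `run_training` are hypothetical helpers, and the wrapper passes no extra arguments to the scripts.

```python
import subprocess
from pathlib import Path

# Script names as listed in this README.
SCRIPTS = {"LSTM": "run_DepNeCTI_LSTM.sh", "XLMR": "run_DepNeCTI_XLMR.sh"}


def training_command(variant):
    """Return the command line for a model variant ("LSTM" or "XLMR")."""
    if variant not in SCRIPTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(SCRIPTS)}")
    return ["bash", SCRIPTS[variant]]


def run_training(variant, repo_root="."):
    """Run the training script from the repository root and raise if it fails."""
    command = training_command(variant)
    script = Path(repo_root) / command[1]
    if not script.exists():
        raise FileNotFoundError(f"{script} not found; is repo_root the repository root?")
    return subprocess.run(command, cwd=str(repo_root), check=True)
```

Calling `run_training("LSTM")` from the repository root is then equivalent to the bash command above.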

How to get F1, Precision, and Recall scores

Use the script (eval_f1.py) provided in the Evaluation folder to get the scores.
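For reference, the three scores follow the standard definitions based on true positives, false positives, and false negatives. The sketch below illustrates those definitions only; it is not the code in eval_f1.py, and the function name is hypothetical.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from raw counts (0.0 when undefined)."""
    predicted = true_positives + false_positives  # everything the model predicted
    actual = true_positives + false_negatives     # everything in the gold data
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

With 8 correct predictions, 2 spurious ones, and 8 missed gold items, precision is 0.8, recall is 0.5, and F1 is 8/13, roughly 0.615.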

How to get USS and LSS scores

Use the script (eval_USS_LSS.py) provided in the Evaluation folder to get the scores.
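The sketch below shows one way the difference between the two scores can be understood. It assumes, without confirmation from this README, that USS and LSS stand for unlabeled and labeled span scores and that each is an F1 over predicted spans; eval_USS_LSS.py may compute them differently. The span tuples and label names are illustrative only.

```python
def span_f1(gold, pred, labeled=True):
    """F1 over (start, end, label) spans; the label is ignored when labeled=False."""
    def key(span):
        return tuple(span) if labeled else tuple(span[:2])

    gold_set = {key(span) for span in gold}
    pred_set = {key(span) for span in pred}
    matched = len(gold_set & pred_set)
    if matched == 0:
        return 0.0
    precision = matched / len(pred_set)
    recall = matched / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative spans: both boundaries are right, but one label is wrong.
gold = [(0, 2, "Tatpurusha"), (0, 3, "Bahuvrihi")]
pred = [(0, 2, "Tatpurusha"), (0, 3, "Dvandva")]
unlabeled_score = span_f1(gold, pred, labeled=False)  # boundaries only
labeled_score = span_f1(gold, pred, labeled=True)     # boundaries and label
```

Under these assumptions the unlabeled score is 1.0 and the labeled score is 0.5, since the second span has the right boundaries but the wrong type.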

How to reproduce the results shown for other baselines

Download the datasets from this link; they are already in the format each baseline requires.
Go to the respective folders in the Baselines folder and follow the readme files given there.
If you have trouble using the datasets from this link, you can generate the data format yourself using data_format.ipynb in Datasets/data_format.
Note: for the baselines, the data files have names like "genia" and "GENIA", but they contain DepNeCTI data only; the names are left unchanged to avoid breaking the baseline code.

Citing DepNeCTI

If you use DepNeCTI in your research, please consider citing our work:

@misc{sandhan2023depnecti,
      title={DepNeCTI: Dependency-based Nested Compound Type Identification for Sanskrit}, 
      author={Jivnesh Sandhan and Yaswanth Narsupalli and Sreevatsa Muppirala and Sriram Krishnan and Pavankumar Satuluri and Amba Kulkarni and Pawan Goyal},
      year={2023},
      eprint={2310.09501},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
