The code and data for "Understanding Jargon: Combining Extraction and Generation for Definition Modeling" (EMNLP '22)

License

Notifications You must be signed in to change notification settings

jeffhj/CDM


Introduction

We propose combining extraction and generation for jargon definition modeling: first, extract self- and correlative definitional information about the target jargon from the Web; then, generate the final definition by incorporating the extracted information. The framework is simple but effective: experiments show that it generates high-quality definitions for jargon and significantly outperforms state-of-the-art models, e.g., improving the BLEU score from 8.76 to 22.66 and the human-annotated score from 2.34 to 4.04.
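The extract-then-generate flow described above can be illustrated with a minimal sketch. This is not the code in this repo: the names `ExtractedInfo`, `extract`, `build_generator_input`, and `define` are hypothetical, keyword matching stands in for the real extraction step, and the generator is a stub callable where a trained text-generation model would normally go. The input format passed to the generator is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExtractedInfo:
    self_definitional: List[str]  # sentences that mention the target term itself
    correlative: List[str]        # sentences that mention related terms only


def extract(term: str, related_terms: List[str], corpus: List[str],
            top_k: int = 2) -> ExtractedInfo:
    """Toy extractor (assumption): plain keyword matching over a sentence list."""
    term_l = term.lower()
    self_def = [s for s in corpus if term_l in s.lower()][:top_k]
    corr = [
        s for s in corpus
        if term_l not in s.lower()
        and any(r.lower() in s.lower() for r in related_terms)
    ][:top_k]
    return ExtractedInfo(self_def, corr)


def build_generator_input(term: str, info: ExtractedInfo) -> str:
    """Concatenate the term and extracted sentences (hypothetical format)."""
    parts = [f"term: {term}"]
    parts += [f"self: {s}" for s in info.self_definitional]
    parts += [f"related: {s}" for s in info.correlative]
    return " | ".join(parts)


def define(term: str, related_terms: List[str], corpus: List[str],
           generator: Callable[[str], str]) -> str:
    """Step 1: extract definitional information. Step 2: generate from it."""
    info = extract(term, related_terms, corpus)
    return generator(build_generator_input(term, info))


# Tiny stand-in for Web text; a real system would retrieve this.
corpus = [
    "A hash table is a data structure that maps keys to values.",
    "A hash function maps data of arbitrary size to fixed-size values.",
    "Bananas are yellow.",
]

# Stub generator that just echoes its input; swap in a real model here.
print(define("hash table", ["hash function"], corpus, generator=lambda x: x))
```

Keeping the generator as a plain callable keeps the two stages decoupled, so the extraction logic can be inspected or replaced without touching the generation model.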


Usage

For detailed instructions, please refer to the README.md files in ./extraction/ and ./generation/.

Data

Data can be downloaded from Google Drive

Generated definitions

Sample generated definitions are stored in ./sample/generated_definition_for_cs_term.txt

Citation

The details of this repo are described in the following paper. If you find this repo useful, please kindly cite it:

@inproceedings{huang2022understanding,
  title={Understanding Jargon: Combining Extraction and Generation for Definition Modeling},
  author={Huang, Jie and Shao, Hanyin and Chang, Kevin Chen-Chuan and Xiong, Jinjun and Hwu, Wen-mei},
  booktitle={Proceedings of EMNLP},
  year={2022}
}
