Source code for the ACL 2020 paper "Rationalizing Medical Relation Prediction from Corpus-level Statistics".
Generating explanations for the decisions made by advanced machine learning models, such as neural networks, has drawn extensive attention recently, and is especially important in high-stakes domains such as medicine, finance, and the judiciary. Reasonable explanations can help debug the model, detect model bias, and, more importantly, earn user trust for practical applications.
In this paper, we propose to explain medical relation prediction based on existing cognitive theories about human memory recall and recognition. Our intuition is that, to explain the relationship between two entities, we humans tend to resort to the connections between their contexts. A toy example illustrating this intuition is shown above. To predict whether “Aspirin” may treat “Headache”, a model could first recall a relevant entity “Pain Relief” for “Headache”, as they co-occur frequently, and then recognize that there is a chance that “Aspirin” can lead to “Pain Relief”, based on which it could finally make a correct prediction (Aspirin may treat Headache).
Inspired by such cognitive processes, we build a graph-based framework to rationalize medical relation prediction based on corpus-level statistics. The task is to predict the relations between two medical terms. Its workflow is shown above. The framework consists of three cognitive stages: association recall, assumption recognition, and decision making. These stages are easy for end users to understand and produce reasonable rationales to justify the model's prediction. Essentially, our framework leverages corpus-level statistics to recall associative contexts of target entities and recognizes their relational connections as model rationales. We show its competitive predictive performance compared with a comprehensive list of black-box neural models and demonstrate the quality of model rationales via expert evaluations.
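To make the three stages more concrete, here is a minimal toy sketch in Python. It is not the model in this repository. The co-occurrence counts, the function names (`recall_associations`, `recognize_assumptions`, `decide`), the scoring rule, and the threshold are all hypothetical, chosen only to mirror the Aspirin/Headache example above.

```python
from collections import Counter

# Hypothetical toy co-occurrence counts (NOT taken from the released data).
COOCCUR = {
    "Aspirin": Counter({"Pain Relief": 8, "Fever": 3, "Bleeding": 1}),
    "Headache": Counter({"Pain Relief": 9, "Migraine": 5, "Nausea": 2}),
}


def recall_associations(entity, k=2):
    """Association recall: the k contexts that co-occur most with the entity."""
    return [term for term, _ in COOCCUR[entity].most_common(k)]


def recognize_assumptions(head, tail, k=2):
    """Assumption recognition: pair up recalled associations and score each pair.

    The score is a made-up stand-in (1.0 if both entities recall the same
    context, 0.1 otherwise); a real model would learn these scores.
    """
    pairs = []
    for a in recall_associations(head, k):
        for b in recall_associations(tail, k):
            score = 1.0 if a == b else 0.1
            pairs.append(((a, b), score))
    # Highest-scoring pairs first; these serve as the rationales.
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def decide(head, tail, threshold=1.0):
    """Decision making: aggregate pair scores and compare to a threshold."""
    rationales = recognize_assumptions(head, tail)
    total = sum(score for _, score in rationales)
    return total >= threshold, rationales


if __name__ == "__main__":
    predicted, rationales = decide("Aspirin", "Headache")
    print("may_treat predicted:", predicted)
    print("top rationale:", rationales[0])
```

In this sketch, the ranked pairs returned alongside the prediction play the role of rationales: they show which recalled contexts contributed most to the decision.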
You can download the data from the following links: Corpus-level Statistics, Labeled Relation Data, Relation List, Relation Triples.
To train the model, simply run the following script:
> bash ./src/bash.sh
After training, you can infer the rationales that are important for the model's prediction:
> bash ./src/bash_infer.sh
If you have any questions, please feel free to contact us! Also, feel free to check other tools in our group (https://github.com/sunlab-osu) 😊
@inproceedings{wang2020rationalizing,
title={Rationalizing Medical Relation Prediction from Corpus-level Statistics},
author={Wang, Zhen and Lee, Jennifer and Lin, Simon and Sun, Huan},
booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
year={2020}
}