This repository contains the code used to train Yale's submission to the n2c2 2022 Challenge, Progress Note Understanding: Assessment and Plan Reasoning, which placed 2nd among 14 teams. The model was trained with Hugging Face Transformers and PyTorch, using Yale High Performance Computing Cluster (YCRC) GPU resources.
Configuration files are all located under the conf/ directory.
Model predictions were further investigated using Shapley values (SHAP). The output HTML files are under explainability/, and the code that performs the analysis is in Shap_Viz.ipynb. Because the HTML files are large and interactive, they need to be downloaded and opened locally.
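For readers unfamiliar with Shapley values, the following is a minimal sketch of the underlying idea, not the code in Shap_Viz.ipynb. It computes exact Shapley values for a toy scoring function by averaging each token's marginal contribution over all subsets of the other tokens. The function `toy_score`, its keyword weights, and the example tokens are hypothetical stand-ins for a real model's prediction; practical tools approximate this computation because the exact version is exponential in the number of tokens.

```python
from itertools import combinations
from math import factorial


def shapley_values(tokens, value_fn):
    """Exact Shapley value of each token with respect to value_fn.

    value_fn takes a list of tokens (a subset, in original order)
    and returns a float score.
    """
    n = len(tokens)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain token i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = value_fn([tokens[j] for j in sorted(subset + (i,))])
                without_i = value_fn([tokens[j] for j in subset])
                phi[i] += weight * (with_i - without_i)
    return phi


# Hypothetical scorer: per-token weights plus an interaction bonus.
TOY_WEIGHTS = {"chest": 0.2, "pain": 0.3}


def toy_score(tokens):
    present = set(tokens)
    score = sum(TOY_WEIGHTS.get(t, 0.0) for t in present)
    if "chest" in present and "pain" in present:
        score += 0.4  # bonus only when both tokens appear together
    return score


tokens = ["patient", "reports", "chest", "pain"]
phi = shapley_values(tokens, toy_score)
for token, value in zip(tokens, phi):
    print(f"{token}: {value:.3f}")
```

In this toy case the interaction bonus is split evenly between "chest" and "pain", and the attributions sum to the difference between the score of the full input and the score of the empty input.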
All relevant final training code can be found in trainer_with_inference.py; gpt_inference.py contains the code for the GPT-2 model comparison.
The publication is currently under review at the Journal of Biomedical Informatics (JBI); the BibTeX entry will be posted as soon as the paper is accepted.