IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials
Official code implementation
View Paper
·
Report Bug
·
Request Feature
Large Language Models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research investigates LLMs' robustness, consistency, and faithful reasoning when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs) in the context of SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. We examine the reasoning capabilities of LLMs and their adeptness at logical problem-solving. A comparative analysis is conducted on pre-trained language models (PLMs), GPT-3.5, and Gemini Pro under zero-shot settings using a Retrieval-Augmented Generation (RAG) framework, integrating various reasoning chains.
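For context, each NLI4CT instance pairs a natural-language statement with one or two CTR sections and is labeled Entailment or Contradiction. An illustrative example of the dataset's JSON layout is sketched below - the field names follow the public NLI4CT release, but the identifier and statement values here are invented for illustration:

```json
{
  "example-uuid-0001": {
    "Type": "Single",
    "Section_id": "Intervention",
    "Primary_id": "NCT00000000",
    "Statement": "All participants in the primary trial received letrozole.",
    "Label": "Entailment"
  }
}
```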
📂 NLI4CT
|_📁 Gemini
  |_📄 run-gemini-chain.py # Multi-turn conversation using Gemini Pro
  |_📄 prep_results.py # Converting the labels to Entailment/Contradiction
  |_📄 Gemini_results.json # Output of Gemini Pro - explanations and labels
  |_📄 results.json # Final labels
|_📁 GPT-3.5 # Experimentation with GPT-3.5
  |_📄 GPT3.5.py
  |_📄 ChatGPT_results.json
|_📁 training-data # Training data - Clinical Trial Reports (CTRs)
|_📁 Experiments # Experimentation with other models - Flan T5 and Pre-trained Language Models (PLMs)
  |_📄 flant5-label.ipynb
  |_📄 PLMs.ipynb
|_📄 Makefile # Creating conda environment and installing dependencies
|_📄 LICENSE
|_📄 requirements.txt
|_📄 .gitignore
Run the following command -
make
This will create a new conda environment and install the required dependencies. If you do not use Anaconda, run the following command instead to install the dependencies.
pip install -r requirements.txt
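For reference, a minimal sketch of what such a Makefile can look like - the environment name and target below are assumptions for illustration, not the repository's actual Makefile:

```makefile
# Illustrative sketch only; the repository's actual Makefile may differ.
# Note: recipe lines must be indented with a tab character.
ENV_NAME = nli4ct   # hypothetical environment name

.PHONY: all
all:
	conda create -y -n $(ENV_NAME) python=3.10
	conda run -n $(ENV_NAME) pip install -r requirements.txt
```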
Create a .env file in the main directory. Fetch the API keys for GPT-3.5 and Gemini Pro and put them in the .env file as follows -
GOOGLE_API_KEY = "..."
OPENAI_API_KEY = "..."
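The scripts can then pick up these keys at runtime. A minimal sketch of loading them with python-dotenv (assuming the repository's scripts read the keys from the environment; the variable names match the .env entries above):

```python
import os

from dotenv import load_dotenv

# Read the .env file in the current directory into the process environment.
load_dotenv()

google_api_key = os.getenv("GOOGLE_API_KEY")  # used for Gemini Pro
openai_api_key = os.getenv("OPENAI_API_KEY")  # used for GPT-3.5
```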
Run the multi-turn conversation chain using the following command -
python run-gemini-chain.py
Gemini Pro will generate an explanation and a label (Yes/No) for each statement in the dataset; prep_results.py then converts these labels to Entailment/Contradiction to produce the final results.json. A sketch of how the pipeline fits together is shown below.
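For illustration, a minimal sketch of such a multi-turn chain using the google-generativeai SDK - the input file name (test.json), field names (evidence, Statement), prompt wording, and two-turn structure are assumptions, not the repository's exact implementation:

```python
import json
import os

import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
model = genai.GenerativeModel("gemini-pro")

# Hypothetical input: a dict of instances keyed by UUID.
with open("test.json") as f:
    instances = json.load(f)

predictions = {}
for uuid, inst in instances.items():
    chat = model.start_chat(history=[])

    # Turn 1: give the retrieved CTR evidence and ask for step-by-step reasoning.
    chat.send_message(
        "Clinical trial report section:\n"
        f"{inst['evidence']}\n\n"
        "Reason step by step about whether this statement is supported:\n"
        f"{inst['Statement']}"
    )

    # Turn 2: ask for a final Yes/No label conditioned on the reasoning above.
    answer = chat.send_message(
        "Based on your reasoning, answer with one word: Yes if the statement "
        "is entailed by the report, No if it contradicts the report."
    ).text.strip()

    # Map Yes/No to the task's label set, as prep_results.py does.
    predictions[uuid] = {
        "Prediction": "Entailment" if answer.lower().startswith("yes")
        else "Contradiction"
    }

with open("results.json", "w") as f:
    json.dump(predictions, f, indent=2)
```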
The zero-shot evaluation of Gemini Pro yielded an F1 score of 0.69, a consistency score of 0.71, and a faithfulness score of 0.90 on the official test dataset. Our system ranked fifth on faithfulness, sixteenth on consistency, and twenty-first on F1 score. Gemini Pro outperforms GPT-3.5 with a +1.9% improvement in F1 score while maintaining a nearly identical consistency score, and its faithfulness score is +3.5% higher than that of GPT-3.5.