This repository contains the code and data for the paper "Piecing It All Together: Verifying Multi-Hop Multimodal Claims", published at COLING 2025.
Nov 13, 2024
Initial release.
In the Dataset folder you will find the files for MMCV:
Each line of the MMCV files (e.g. 1hop.json) contains one multi-hop claim, alongside its multimodal evidence.
{
"claim": "Each Cadet is equipped with a tool in their right hand, much like a coffeehouse serves a variety of beverages to its patrons.",
"wiki_context": "A coffeehouse, coffee shop, or café is an establishment that serves various types of coffee, espresso, latte, americano and cappuccino. Some coffeehouses may serve cold beverages, such as iced coffee and iced tea, as well as other non-caffeinated beverages. A coffeehouse may also serve food, such as light snacks, sandwiches, muffins, cakes, breads, donuts or pastries. In continental Europe, some cafés also serve alcoholic beverages. Coffeehouses range from owner-operated small businesses to large multinational corporations.",
"text_evidence": [],
"image_evidence": [
"50ccc7ab2db82b7feeaa3bbf6f533773"
],
"table_evidence": [],
"label": "SUPPORT"
}
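The record layout above can be read with a few lines of Python. This is a minimal sketch, not part of the repository: the function names `iter_claims` and `evidence_ids` are hypothetical, and it assumes each file holds one JSON object per line, as described above. It falls back to parsing a single JSON array in case a file is stored that way.

```python
import json
from pathlib import Path
from typing import Dict, Iterator, List

# Evidence fields shown in the example record above.
EVIDENCE_FIELDS = ("text_evidence", "image_evidence", "table_evidence")


def iter_claims(path: str) -> Iterator[dict]:
    """Yield claim records from an MMCV file (hypothetical helper).

    Assumption: one JSON object per line. If the file instead starts
    with '[', it is parsed as a single JSON array.
    """
    text = Path(path).read_text(encoding="utf-8")
    if text.lstrip().startswith("["):
        yield from json.loads(text)
        return
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def evidence_ids(record: dict) -> Dict[str, List[str]]:
    """Return the evidence ids of one record, grouped by modality."""
    return {field: list(record.get(field, [])) for field in EVIDENCE_FIELDS}
```

For example, `list(iter_claims("Dataset/1hop.json"))` would load every 1-hop claim into memory, and `evidence_ids(record)["image_evidence"]` would give the image ids to look up in MMQA.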
The ids in text_evidence, image_evidence, and table_evidence correspond to ids in MMQA. To download the raw files, run python download_raw.py and sh download_images.sh as described in Setup.
Step 1: Create a .env file and set your API keys:
OPENAI_API_KEY="YOUR KEY"
GEMINI_API_KEY="YOUR KEY"
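Before running the pipelines, it can help to confirm that both keys are actually set. The following is a small pre-flight sketch using only the standard library. The helper names `read_env_file` and `missing_keys` are hypothetical, and the repository's own scripts may load the .env file differently.

```python
import os
from pathlib import Path
from typing import Dict, List

REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")
PLACEHOLDER = "YOUR KEY"  # the template value shown above


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY="value" lines; a missing file yields an empty dict."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path: str = ".env") -> List[str]:
    """Return required keys that are unset or still hold the placeholder."""
    values = read_env_file(path)
    missing = []
    for key in REQUIRED_KEYS:
        # Prefer the .env value, then fall back to the process environment.
        value = values.get(key) or os.environ.get(key, "")
        if not value or value == PLACEHOLDER:
            missing.append(key)
    return missing
```

Calling `missing_keys()` from the repository root returns an empty list once both keys are filled in.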
Step 2: Run the following script to create a directory called MMQA_Raw and download all raw data into it.
python download_raw.py
Step 3: Run the following script to download the raw image files from MMQA. Then unzip the archive and place it under MMQA_Raw, so that the images are located at MMQA_Raw/final_dataset_images.
sh download_images.sh
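Once the images are unpacked, the ids in image_evidence can be mapped to files on disk. The sketch below is illustrative only: `find_image` and `missing_images` are hypothetical helpers, and it assumes each image is stored as <id>.<extension> directly inside MMQA_Raw/final_dataset_images, which should be checked against the downloaded data.

```python
from pathlib import Path
from typing import Iterable, List, Optional

# Location given in Step 3 of the setup instructions.
IMAGE_DIR = Path("MMQA_Raw/final_dataset_images")


def find_image(image_id: str, image_dir: Path = IMAGE_DIR) -> Optional[Path]:
    """Return the first file named <image_id>.<ext>, or None if absent.

    Assumption: file stems equal the MMQA image ids.
    """
    matches = sorted(Path(image_dir).glob(f"{image_id}.*"))
    return matches[0] if matches else None


def missing_images(records: Iterable[dict], image_dir: Path = IMAGE_DIR) -> List[str]:
    """List image ids referenced by the records that have no file on disk."""
    wanted = {i for r in records for i in r.get("image_evidence", [])}
    return sorted(i for i in wanted if find_image(i, image_dir) is None)
```

Running `missing_images` over the loaded claims is a quick way to confirm that the download and unzip steps completed before starting the experiments.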
Requires Python 3.9 to run.
Install the conda environment from the environment.yml file:
conda env create -n mmcv --file environment.yml
conda activate mmcv
To run claim generation and refinement:
python data_collection_pipeline.py
python assemble.py
To run the negation pipeline:
python negation_pipeline.py
To run MLLM experiments:
python mllm_exp.py
python evaluation.py
All experiment results can be found in the MLLM_Results folder.
@inproceedings{wang2024piecing,
title={Piecing It All Together: Verifying Multi-Hop Multimodal Claims},
author={Haoran Wang and Aman Rangapur and Xiongxiao Xu and Yueqing Liang and Haroon Gharwi and Carl Yang and Kai Shu},
booktitle={Proceedings of the 31st International Conference on Computational Linguistics},
year={2025}
}
The MMCV dataset is distributed under the CC BY-SA 4.0 license.