This is the official implementation for the paper, "A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation".
The dependency packages are listed in the `requirements.txt` file; run `pip install -r requirements.txt` to configure the environment. We use Python 3.10 to run the experiments.
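For example, a fresh setup might look like the following (the virtual-environment name is arbitrary):

```bash
# Create and activate a Python 3.10 virtual environment, then install deps.
python3.10 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```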
The overall pipeline is: build the belief tree by prompting the LLM, prompt the LLM for its confidence score on each node, label the edge types with an NLI model, compute the posterior probabilities, and evaluate the detection performance. Since the backbone LLM is queried through the OpenAI API, set the `OPENAI_API_KEY` environment variable before running the scripts.
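For example (the key below is a placeholder for your own key):

```bash
export OPENAI_API_KEY="sk-..."  # placeholder; substitute your actual key
```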
- Belief tree generation:

  ```bash
  python generate_belief_tree.py --dataset=wikibio --backbone=chatgpt
  ```

  Use `python generate_belief_tree.py --helpfull` to see the choices for dataset and backbone. By default, the generated belief trees will be stored at `logs/belief_trees/{dataset}_{backbone}.json`.
- Prompt the LLM for its confidence score on each node. Similarly, you can specify the dataset name and the backbone LLM used for the experiment in the command line:

  ```bash
  python confidence_estimation.py --dataset=wikibio --backbone=chatgpt
  ```

  By default, the estimated confidence scores will be stored at `logs/conf_estimation/{dataset}_{backbone}.json`.
- Use the NLI model to label the edge type, i.e., the relationship between a parent node and a child node (see the NLI sketch after this list):

  ```bash
  python tools/label_edges.py --dataset=wikibio --backbone=chatgpt
  ```
- Compute the posterior probabilities (a schematic message-passing sketch appears after this list):

  ```bash
  python hmm_forward.py --dataset=wikibio --backbone=chatgpt
  ```
- Performance evaluation (see the evaluation sketch below):

  ```bash
  python tools/compute_metrics.py --dataset=wikibio --backbone=chatgpt
  ```
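For reference, here is a minimal sketch of how NLI-based edge labeling can work. It is not the repository's implementation; the model choice (`roberta-large-mnli`) is an illustrative assumption:

```python
# Illustrative sketch of NLI-based edge labeling, not the repository's code.
# The model choice (roberta-large-mnli) is an assumption for illustration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def label_edge(parent: str, child: str) -> str:
    """Label the parent->child relation as CONTRADICTION/NEUTRAL/ENTAILMENT."""
    # Encode the (premise, hypothesis) pair and take the argmax NLI class.
    inputs = tokenizer(parent, child, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax(dim=-1))]

print(label_edge("Paris is the capital of France.",
                 "The capital of France is Paris."))  # -> ENTAILMENT
```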
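`hmm_forward.py` presumably runs a forward (message-passing) pass over the belief tree. As a schematic illustration only, the sketch below performs a sum-product upward pass on a toy binary-state tree; the transition matrices, emission values, and prior are hypothetical placeholders, not the paper's parameters:

```python
# Schematic upward (leaf-to-root) sum-product pass on a belief tree with
# binary hidden truth values {True, False} per node. All numbers here are
# hypothetical placeholders, not values used by the repository.
import numpy as np

# Edge-type-dependent transition matrices P(child_state | parent_state).
# Rows: parent state (True, False); columns: child state (True, False).
TRANSITIONS = {
    "entailment":    np.array([[0.95, 0.05], [0.50, 0.50]]),
    "contradiction": np.array([[0.05, 0.95], [0.50, 0.50]]),
    "neutral":       np.array([[0.50, 0.50], [0.50, 0.50]]),
}

def upward_message(node, tree, emissions):
    """Return a likelihood vector over this node's states, folding in its
    own emission (LLM confidence) and messages from all of its children."""
    msg = emissions[node].copy()  # P(observed confidence | state)
    for child, edge_type in tree.get(node, []):
        child_msg = upward_message(child, tree, emissions)
        # Marginalize out the child's state through the edge transition.
        msg *= TRANSITIONS[edge_type] @ child_msg
    return msg

# Toy tree: a root claim with two children linked by typed edges.
tree = {"root": [("c1", "entailment"), ("c2", "contradiction")]}
emissions = {  # pseudo-likelihoods derived from LLM confidence scores
    "root": np.array([0.7, 0.3]),
    "c1":   np.array([0.8, 0.2]),
    "c2":   np.array([0.4, 0.6]),
}
prior = np.array([0.5, 0.5])  # prior over the root claim's truth value
unnorm = prior * upward_message("root", tree, emissions)
posterior = unnorm / unnorm.sum()
print(f"P(root claim is true | evidence) = {posterior[0]:.3f}")
```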
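Hallucination detection is commonly scored as binary classification, e.g. with AUROC. A minimal evaluation sketch, assuming a hypothetical JSON schema with per-claim posterior scores and gold labels (the path and field names below are made up for illustration):

```python
# Minimal AUROC evaluation sketch; the file path and field names are
# hypothetical and may differ from the repository's actual JSON schema.
import json
from sklearn.metrics import roc_auc_score

with open("logs/posteriors/wikibio_chatgpt.json") as f:  # hypothetical path
    records = json.load(f)

labels = [r["is_hallucination"] for r in records]      # 1 = hallucinated
scores = [1.0 - r["posterior_true"] for r in records]  # higher = more suspect
print(f"AUROC: {roc_auc_score(labels, scores):.3f}")
```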
If you find our work useful, please consider citing the paper:

```
@article{hou2024probabilistic,
  title={A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation},
  author={Hou, Bairu and Zhang, Yang and Andreas, Jacob and Chang, Shiyu},
  journal={arXiv preprint arXiv:2406.06950},
  year={2024}
}
```