
Lower F1 scores obtained using the fine-tuned FLAN-T5-XL and FLAN-T5-large models #1

GuptaSonam opened this issue Jul 31, 2023 · 1 comment

@GuptaSonam
Hi,
Thank you for providing the fine-tuned models in the repository. I used the inference_alpaca.py code to evaluate the FLAN-T5-XL and FLAN-T5-large models on the simulation dataset. However, the F1 scores that I am getting are lower than what is reported in the repository. Can you tell me if there is some setting that needs to be changed?

Following are the numbers that I am getting on running the inference:

FLAN-T5-large (reported) | 57.3 | 50.1 | 70.5
FLAN-T5-large (obtained) | 53 | 49 | 57
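
For reference, this is roughly how I am loading and querying the checkpoints (a minimal sketch using the Hugging Face transformers seq2seq API, not the repository's inference_alpaca.py; the checkpoint path and generation settings are placeholders):

```python
# Minimal sketch: load a fine-tuned FLAN-T5 checkpoint and classify one example.
# The checkpoint path below is a placeholder, not the actual released model name.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "path/to/fine-tuned-flan-t5-large"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

def classify(prompt: str) -> str:
    # Encode the prompt and generate the predicted label
    # (Attributable / Contradictory / Extrapolatory).
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```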

Thanks,
Sonam Gupta

@xiangyue9607 (Collaborator)

Hi Sonam,

Sorry for the late reply; I just saw the issue. It might be due to a mismatch between the prompt used for training and the one used for evaluation. Could you try the prompt in the following example?

"prompt": "As an Attribution Validator, your task is to verify whether a given context can support the given claim. A claim can be either a plain sentence or a question followed by its answer.Specifically, your response should clearly indicate the relationship: Attributable, Contradictory or Extrapolatory. A contradictory error occurs when you can infer that the answer contradicts the fact presented in the context, while an extrapolatory error means that you cannot infer the correctness of the answer based on the information provided in the context.\n\nClaim: "[Question] [Answer].\n\nContext: [Context]"

Let me know if this replicates the result :)

Thanks,
Xiang
