Hi,
Thank you for providing the fine-tuned models in the repository. I used the inference_alpaca.py code to evaluate the FLAN-T5-XL and FLAN-T5-large models on the simulation dataset. However, the F1 scores I am getting are lower than those reported in the repository. Could you tell me if there is a setting that needs to be changed?
Following are the numbers I am getting when running the inference:
FLAN-T5-large (reported) | 57.3 | 50.1 | 70.5
FLAN-T5-large (obtained) | 53 | 49 | 57
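For reference, here is a minimal sketch of how per-class F1 numbers like these are typically computed with scikit-learn. The label names (taken from the prompt quoted below) and the example predictions/references are placeholders, not the repository's actual evaluation code:

```python
from sklearn.metrics import f1_score

# Placeholder gold labels and model predictions. The three class names are
# an assumption based on the prompt below, not the repo's actual data.
references = ["Attributable", "Contradictory", "Extrapolatory", "Attributable"]
predictions = ["Attributable", "Extrapolatory", "Extrapolatory", "Attributable"]

labels = ["Attributable", "Contradictory", "Extrapolatory"]
# average=None returns one F1 score per class, in the order of `labels`.
per_class_f1 = f1_score(references, predictions, labels=labels, average=None)
for label, score in zip(labels, per_class_f1):
    print(f"{label}: {score * 100:.1f}")
```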
Thanks,
Sonam Gupta
Sorry for the late reply; I just saw the issue. It might be due to a mismatch between the prompt used for training and the one used for evaluation. Could you try the prompt in the following example?
"prompt": "As an Attribution Validator, your task is to verify whether a given context can support the given claim. A claim can be either a plain sentence or a question followed by its answer.Specifically, your response should clearly indicate the relationship: Attributable, Contradictory or Extrapolatory. A contradictory error occurs when you can infer that the answer contradicts the fact presented in the context, while an extrapolatory error means that you cannot infer the correctness of the answer based on the information provided in the context.\n\nClaim: "[Question] [Answer].\n\nContext: [Context]"