[Question] GPT zero/few-shot multiple-choice task evaluation implementation #6269

KTH1234 · 2023-03-22T05:21:18Z

Hello,

I noticed that in the NeMo GPT repository, the zero/few-shot multiple-choice evaluation script for RACE scores each candidate using both the "context + {choice candidate}" likelihood and the "answer: {choice candidate}" likelihood, as described in the GPT-3 paper. However, similar tasks such as PIQA, BoolQ, and HellaSwag are scored using only the "context + {choice candidate}" likelihood.

I'm curious if there is a specific reason for this difference in evaluation method.
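
For concreteness, here is a minimal sketch of the two scoring rules I am comparing. It reflects only my reading of the GPT-3 paper, not the actual NeMo code: `logprob` is a hypothetical callable returning log P(completion | prefix), the `"Answer: "` prefix is an assumption about the normalization string, and the toy numbers are made up for illustration.

```python
from typing import Callable, Sequence

# Hypothetical interface: log P(completion | prefix) under some language model.
LogProbFn = Callable[[str, str], float]


def pick_conditional(logprob: LogProbFn, context: str, choices: Sequence[str]) -> int:
    """Return the index of the choice with the highest log P(choice | context)."""
    scores = [logprob(context, choice) for choice in choices]
    return max(range(len(choices)), key=scores.__getitem__)


def pick_normalized(
    logprob: LogProbFn,
    context: str,
    choices: Sequence[str],
    answer_context: str = "Answer: ",  # assumed normalization prefix
) -> int:
    """Return the index maximizing log P(choice | context) - log P(choice | answer_context)."""
    scores = [logprob(context, choice) - logprob(answer_context, choice) for choice in choices]
    return max(range(len(choices)), key=scores.__getitem__)


# Toy log-probabilities standing in for a real model (made-up values).
_TOY_TABLE = {
    ("ctx", "the"): -1.0,
    ("ctx", "photosynthesis"): -1.5,
    ("Answer: ", "the"): -0.5,
    ("Answer: ", "photosynthesis"): -4.0,
}


def toy_logprob(prefix: str, completion: str) -> float:
    """Look up a made-up log-probability for (prefix, completion)."""
    return _TOY_TABLE[(prefix, completion)]


choices = ["the", "photosynthesis"]
conditional_pick = pick_conditional(toy_logprob, "ctx", choices)
normalized_pick = pick_normalized(toy_logprob, "ctx", choices)
```

With these toy values the two rules disagree: the plain conditional score prefers the generically likely candidate, while the normalized score prefers the candidate whose likelihood rises most when the context is present. That difference is why I am asking whether the choice of rule per task is deliberate.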


Replies: 0 comments
