Add eval yaml for Theory of Mind eval #1453

Merged
merged 1 commit into from
Jan 9, 2024
30 changes: 30 additions & 0 deletions evals/registry/evals/theory_of_mind.yaml
@@ -0,0 +1,30 @@
theory_of_mind:
  id: theory_of_mind.tomi
  metrics: [accuracy]
  description: Runs a series of theory of mind (ToM) benchmarks (ToMI, SocialIQA).

theory_of_mind.tomi:
  class: evals.elsuite.basic.match_with_solvers:MatchWithSolvers
  args:
    samples_jsonl: theory_of_mind/tomi/test.jsonl
    task_description: "You will read a number of sentences describing a situation involving several people, as well as a question regarding the real or perceived location of an object. Your task is to answer the question based on the information in the sentences. Respond with the single word corresponding to the location."

theory_of_mind.tomi_light:
  class: evals.elsuite.basic.match_with_solvers:MatchWithSolvers
  args:
    samples_jsonl: theory_of_mind/tomi/test.jsonl
    task_description: "You will read a number of sentences describing a situation involving several people, as well as a question regarding the real or perceived location of an object. Your task is to answer the question based on the information in the sentences. Respond with the single word corresponding to the location."
    n_samples: 599 # Exactly 1/10th of the total

theory_of_mind.socialiqa:
  class: evals.elsuite.basic.match_with_solvers:MatchWithSolvers
  args:
    samples_jsonl: theory_of_mind/socialiqa/test.jsonl
    task_description: "You will read a number of sentences describing a situation, followed by a question regarding the situation. Your task is to answer the question based on the information in the sentences by choosing from one of three answers A, B or C. Respond with just the letter corresponding to your answer, e.g. A."

theory_of_mind.socialiqa_light:
  class: evals.elsuite.basic.match_with_solvers:MatchWithSolvers
  args:
    samples_jsonl: theory_of_mind/socialiqa/test.jsonl
    task_description: "You will read a number of sentences describing a situation, followed by a question regarding the situation. Your task is to answer the question based on the information in the sentences by choosing from one of three answers A, B or C. Respond with just the letter corresponding to your answer, e.g. A."
    n_samples: 222 # Exactly 1/10th of the total
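
The registry file pairs a base entry (`theory_of_mind`, which only carries an `id` pointing at a concrete variant) with entries that name a `class` and its `args`. The Python sketch below illustrates how such a layout can be read: it follows the `id` alias to the concrete entry and splits the `module:ClassName` string. It is a simplified illustration only, not the resolver used by the evals framework; the `REGISTRY` dict is a hand-copied, abbreviated mirror of the YAML, and `resolve` and `split_class_path` are hypothetical helper names.

```python
# Hypothetical, abbreviated mirror of the YAML entries in the diff above.
# Task descriptions and the SocialIQA entries are omitted for brevity.
MATCH_CLASS = "evals.elsuite.basic.match_with_solvers:MatchWithSolvers"

REGISTRY = {
    "theory_of_mind": {
        "id": "theory_of_mind.tomi",
        "metrics": ["accuracy"],
    },
    "theory_of_mind.tomi": {
        "class": MATCH_CLASS,
        "args": {"samples_jsonl": "theory_of_mind/tomi/test.jsonl"},
    },
    "theory_of_mind.tomi_light": {
        "class": MATCH_CLASS,
        "args": {
            "samples_jsonl": "theory_of_mind/tomi/test.jsonl",
            "n_samples": 599,
        },
    },
}


def resolve(name, registry):
    """Follow `id` aliases until an entry that declares a `class` is found."""
    seen = set()
    while "class" not in registry[name]:
        if name in seen:
            raise ValueError(f"alias cycle at {name!r}")
        seen.add(name)
        name = registry[name]["id"]
    return name, registry[name]


def split_class_path(path):
    """Split 'package.module:ClassName' into (module, class name)."""
    module, _, cls = path.partition(":")
    return module, cls


resolved_name, spec = resolve("theory_of_mind", REGISTRY)
module_name, class_name = split_class_path(spec["class"])
```

Under these assumptions, the base name `theory_of_mind` resolves to the full ToMi variant, while the `_light` variants differ from their full counterparts only by the extra `n_samples` argument.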