About the evaluation method of CMATH. #17

Eternity666 · 2024-09-20T08:41:29Z

Regarding CMATH's evaluation: Table 2 of your technical report lists the prompting method as 6-shot. However, your open-source evaluation code appears to use 0-shot. Additionally, the official CMATH paper uses 0-shot evaluation.
So I would like to ask which method is correct. If 6-shot is correct, could you provide the demonstrations you used? Thanks a lot!

ToheartZhang · 2024-10-12T02:13:04Z

We evaluate the base models in the 6-shot setting and the instruct models in the zero-shot setting. The repo provides evaluation for the instruct models.
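For readers comparing the two settings, here is a minimal sketch of how a zero-shot prompt and a 6-shot prompt differ in construction. The function name `build_prompt`, the `Question:`/`Answer:` template, and the placeholder demonstrations are all assumptions made for illustration; they are not the prompts or demonstrations used in the technical report or in the repo.

```python
from typing import Sequence, Tuple


def build_prompt(question: str, demos: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a k-shot prompt, where k = len(demos).

    With no demonstrations this is a zero-shot prompt. The template
    below is hypothetical, not the one used in the report.
    """
    # Each demonstration is rendered as a solved question/answer pair.
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in demos]
    # The test question comes last, with the answer left open for the model.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)


# Placeholder demonstrations (hypothetical, NOT the ones from the report).
demos = [(f"<demo question {i}>", f"<demo answer {i}>") for i in range(1, 7)]

zero_shot = build_prompt("<test question>")        # instruct-model style
six_shot = build_prompt("<test question>", demos)  # base-model style
```

Base models usually need the demonstrations to pick up the expected answer format, whereas instruct models can generally follow a bare question, which would be consistent with the split described above.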

Eternity666 · 2024-10-15T08:33:27Z

Thanks for your reply! Could you please provide the 6 demonstrations used in your evaluation process? This would allow me to align my evaluation method with the one used for your models. Thank you very much!
