add evaluation of PRM trained by TRL #12

CJReinforce · 2025-01-22T10:32:13Z

Add the evaluation of PRM trained by the PRMTrainer of TRL.

I reproduced Qwen2.5-Math-7B-PRM800K using the PRMTrainer of TRL. The performance of the reproduced PRM, evaluated with run_eval_prm_trl.py, is:

gsm8k error acc: 46.9, correct acc: 96.4, f1: 63.1
math error acc: 55.9, correct acc: 82.0, f1: 66.5
olympiadbench error acc: 39.0, correct acc: 67.8, f1: 49.6
omnimath error acc: 34.4, correct acc: 66.8, f1: 45.4
ProcessBench average f1: 56.1
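
For anyone checking the numbers above: the per-subset f1 appears to be the harmonic mean of error acc and correct acc, which is how ProcessBench defines its F1, and the average is the plain mean over the four subsets. The sketch below recomputes both from the rounded values in this PR. It is not taken from run_eval_prm_trl.py; the helper name `harmonic_f1` and the dictionary layout are made up for illustration. Because the inputs are already rounded to one decimal, the recomputed values can differ from the reported ones in the last digit.

```python
# Sanity check of the reported metrics (illustrative only, not the PR's script).
# Assumption: f1 = harmonic mean of error acc and correct acc.


def harmonic_f1(error_acc: float, correct_acc: float) -> float:
    """Harmonic mean of the two accuracies (both given in percent)."""
    if error_acc + correct_acc == 0:
        return 0.0
    return 2 * error_acc * correct_acc / (error_acc + correct_acc)


# (error acc, correct acc, reported f1), copied from the results above.
reported = {
    "gsm8k": (46.9, 96.4, 63.1),
    "math": (55.9, 82.0, 66.5),
    "olympiadbench": (39.0, 67.8, 49.6),
    "omnimath": (34.4, 66.8, 45.4),
}

# Recompute f1 per subset from the rounded accuracies.
recomputed = {
    name: harmonic_f1(error_acc, correct_acc)
    for name, (error_acc, correct_acc, _) in reported.items()
}

# Plain (unweighted) mean over the four subsets.
average_f1 = sum(recomputed.values()) / len(recomputed)

for name, (_, _, reported_f1) in reported.items():
    print(f"{name}: recomputed f1 = {recomputed[name]:.2f}, reported = {reported_f1}")
print(f"ProcessBench average f1 (recomputed) = {average_f1:.2f}")
```

The recomputed values should agree with the reported ones to within about 0.1, which is the error expected from rounding the inputs.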

add evaluation of PRM trained by TRL

04d347e
