Developing an LLM response-ranking reward model using HFRL, with GPT-3.5 providing the feedback instead of a human.
Updated Dec 28, 2023 - Jupyter Notebook