I am trying to integrate Chain of Verification (CoVe) into supervised fine-tuning. Basically, the model needs to output several draft answers based on some prompts, and then use those drafts to come up with a final answer. So I don't actually have the data for the conversation history, since the model needs to produce it by itself. I only have the input, optionally an instruction for each history round, and the final output. Is there a way to create a dataset like this, where the model decides by itself what the assistant outputs in the history would be?
"history": [
["human instruction in the first round (optional)", "model response in the first round (optional)"],
["human instruction in the second round (optional)", "model response in the second round (optional)"]
]
I don't want to provide the "model response in the x round (optional)" fields, since those should be up to the model.
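One workaround (not something the framework does for you out of the box, as far as I know) is to fill in the history offline before training: run inference with the current model on each round's instruction, record its responses, and write those into the `history` field of the dataset. A minimal sketch, where `generate_fn` is a hypothetical placeholder for whatever inference backend you actually use:

```python
import json

def build_cove_example(user_input, round_instructions, final_output, generate_fn):
    """Build one SFT example in the `history` format, letting the model
    produce the intermediate assistant responses itself.

    `generate_fn(history, instruction)` is a placeholder: it receives the
    conversation so far as a list of [instruction, response] pairs plus the
    current round's instruction, and returns the model's response string.
    """
    history = []
    for instruction in round_instructions:
        # The model "decides by itself" what the assistant output is.
        response = generate_fn(history, instruction)
        history.append([instruction, response])
    return {
        "instruction": user_input,
        "history": history,
        "output": final_output,
    }

# Stub generator for illustration only; swap in real model inference.
def stub_generate(history, instruction):
    return f"draft answer {len(history) + 1}"

example = build_cove_example(
    "What year did the Eiffel Tower open?",
    ["List candidate answers.", "Verify each candidate answer."],
    "The Eiffel Tower opened in 1889.",
    stub_generate,
)
print(json.dumps(example, indent=2))
```

Since only the final `output` is the supervised target, the self-generated history then just acts as context during training.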