I am trying to integrate Chain of Verification (CoVe) into supervised fine-tuning. Basically, the model needs to output several draft answers based on some prompts, and then use those drafts to come up with a final answer. So I don't actually have the data for the conversation history, since the model needs to produce it by itself. I only have the input, optionally an instruction for each history round, and the final output. Is there a way to create a dataset like this, where the model decides by itself what the assistant outputs in the history would be?
"history": [
["human instruction in the first round (optional)", "model response in the first round (optional)"],
["human instruction in the second round (optional)", "model response in the second round (optional)"]
]
I don't want to provide the "model response in the x round (optional)" fields, since those should be up to the model.
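One workaround (not something the framework does for you out of the box, as far as I know) is to fill in the history offline before training: run inference with the current model on each round's instruction, record its responses, and write those into the `history` field of the dataset. A minimal sketch, where `generate_fn` is a hypothetical placeholder for whatever inference backend you actually use:

```python
import json

def build_cove_example(user_input, round_instructions, final_output, generate_fn):
    """Build one SFT example in the `history` format, letting the model
    produce the intermediate assistant responses itself.

    `generate_fn(history, instruction)` is a placeholder: it receives the
    conversation so far as a list of [instruction, response] pairs plus the
    current round's instruction, and returns the model's response string.
    """
    history = []
    for instruction in round_instructions:
        # The model "decides by itself" what the assistant output is.
        response = generate_fn(history, instruction)
        history.append([instruction, response])
    return {
        "instruction": user_input,
        "history": history,
        "output": final_output,
    }

# Stub generator for illustration only; swap in real model inference.
def stub_generate(history, instruction):
    return f"draft answer {len(history) + 1}"

example = build_cove_example(
    "What year did the Eiffel Tower open?",
    ["List candidate answers.", "Verify each candidate answer."],
    "The Eiffel Tower opened in 1889.",
    stub_generate,
)
print(json.dumps(example, indent=2))
```

Since only the final `output` is the supervised target, the self-generated history then just acts as context during training.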