Replies: 9 comments 1 reply
-
sorry, my copy-paste went to hell there. i have fixed all the images. here is another comparison. however, it's worth noting that i'm doing a major overhaul of the model, and that the incoherence isn't the fault of training the early timesteps; it is how the model already was. just pointing out that training on …
-
Looks like a very useful technique! I was guessing this must somehow be related to the refiner. So you were saying the refiner was basically trained with the later 800 steps frozen, or in other words, infinitely biased towards the earlier 200 steps? What about batch size? Does a small batch size still work?
-
the refiner seems to be "capable" of doing the early noise schedule, but it doesn't do it very well. i don't know if that's because it "never" saw it, or because it was fine-tuned on just the final inference steps. small batch sizes have always been terrible, but if you have a small dataset with a lot of visually-similar images, a large batch size might lead to overfitting.
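to make that "infinitely biased towards the earlier 200 steps" idea concrete, here's a rough sketch (the numbers and names are just illustrative, not taken from any particular script) of the difference between a soft bias and a hard restriction over a 1000-step schedule:

```python
import torch

NUM_TRAIN_TIMESTEPS = 1000  # typical length of the DDPM noise schedule

# soft bias: timesteps 0-199 (the low-noise end) are 5x more likely to be sampled
soft = torch.ones(NUM_TRAIN_TIMESTEPS)
soft[0:200] *= 5.0
soft /= soft.sum()

# "infinite" bias: timesteps outside 0-199 are never sampled at all,
# which is roughly what training only on the final inference steps amounts to
hard = torch.zeros(NUM_TRAIN_TIMESTEPS)
hard[0:200] = 1.0
hard /= hard.sum()

# either weight vector can then drive the per-batch timestep draw
timesteps = torch.multinomial(hard, 4, replacement=True)
```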
-
can I change the bias when continuing from the last checkpoint?
-
yes, changing it has no impact on resuming from a checkpoint, unlike batch size or learning rate.
-
the arguments "timestep_bias_begin" and "timestep_bias_end" require …
-
it's there, it's just missing in the help output |
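for anyone grepping for them, this is roughly what the declarations would look like in an argparse-based training script; the defaults and help strings here are placeholders i'm assuming, not the script's actual text:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--timestep_bias_begin", type=int, default=0,
                    help="first timestep (inclusive) of the range to up-weight")
parser.add_argument("--timestep_bias_end", type=int, default=1000,
                    help="last timestep (exclusive) of the range to up-weight")

# once declared, the flags parse normally even if the surrounding docs never mention them
args = parser.parse_args(["--timestep_bias_begin", "0", "--timestep_bias_end", "200"])
print(args.timestep_bias_begin, args.timestep_bias_end)
```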
-
ok, finally I found this old discussion. I am in a training run and the model shows signs of converging. The validation images' composition is almost fixed, most things look correct at first glance, and saturation is increasing. Some small details need to be fixed, and I am wondering if this is the right moment to use timestep bias for earlier steps. If yes, should the lr be decreased, or should I just keep it as it is? I would try and find out myself, but would like to know more about the thing I am about to do. Thanks.
-
I have read the explanation in the source, but couldn't track down where these arguments are used. Could you please provide more information on where and how they are used, and some recommendations/examples of what values achieve what results?
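For context, my current understanding of where they typically plug in is something like the sketch below: the arguments shape a per-timestep weight vector, and the training loop draws each batch's timesteps from that vector instead of uniformly. Only timestep_bias_begin/timestep_bias_end come from this thread; the other names (the multiplier, the scheduler call) are my guesses, so please correct me if the actual script differs.

```python
import torch

def generate_timestep_weights(num_timesteps, bias_begin, bias_end, bias_multiplier):
    # up-weight the [bias_begin, bias_end) range; every other timestep keeps weight 1
    weights = torch.ones(num_timesteps)
    weights[bias_begin:bias_end] *= bias_multiplier
    return weights / weights.sum()

# inside the training loop, the weights replace uniform timestep sampling:
weights = generate_timestep_weights(
    num_timesteps=1000,     # noise_scheduler.config.num_train_timesteps
    bias_begin=0,           # --timestep_bias_begin
    bias_end=200,           # --timestep_bias_end
    bias_multiplier=3.0,    # assumed companion flag controlling how strong the bias is
)
bsz = 4
timesteps = torch.multinomial(weights, bsz, replacement=True).long()
# the biased timesteps are then used exactly like uniformly sampled ones, e.g.
# noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
```

My guess (not something stated in this thread) is that multipliers in the low single digits give a gentle nudge toward the range, while zeroing the weights outside the range reproduces the "only ever train on those steps" behaviour discussed above.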