Why are 4 different dataset options generated by default? #1175
Unanswered
slimy-pufflefish
asked this question in
Q&A
Replies: 1 comment
-
This is because models often generalise better across aspect ratios and base resolutions when you include them in training. DiT models like Flux or SD3 actually enter representation collapse rather easily when you go too far from their training sequence lengths (resolutions).

`crop=false` just buckets images by aspect ratio (image shape), and `crop=true` makes them all squares. Training the same images in both bucketed and square formats helps the model avoid tying any particular style or quality to a given resolution bucket, a bias that all U-Net and DiT models are prone to.
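To make the bucketed-versus-square distinction concrete, here is a minimal sketch in Python. It is not SimpleTuner's actual bucketing code: the function names, the `step` rounding granularity, and the sample image sizes are all assumptions made up for illustration. It only shows the idea that with cropping off, images are grouped by (rounded) aspect ratio, while with cropping on, every image lands in the single square bucket.

```python
from collections import defaultdict


def assign_bucket(width, height, crop, step=0.25):
    """Return an illustrative bucket key (an aspect ratio) for one image.

    Hypothetical helper, not a SimpleTuner function. `step` is an assumed
    rounding granularity for aspect ratios.
    """
    if crop:
        # With cropping enabled, every image is cut to a square (aspect 1.0).
        return 1.0
    aspect = width / height
    # With cropping disabled, snap the aspect ratio to the nearest step.
    return round(round(aspect / step) * step, 2)


def group_images(sizes, crop):
    """Group (width, height) pairs by their bucket key."""
    buckets = defaultdict(list)
    for width, height in sizes:
        buckets[assign_bucket(width, height, crop)].append((width, height))
    return dict(buckets)


# Made-up image sizes: one square, two landscape, one portrait.
sizes = [(1024, 1024), (1536, 1024), (1024, 1536), (1200, 800)]

bucketed = group_images(sizes, crop=False)  # several aspect buckets
squared = group_images(sizes, crop=True)    # a single square bucket

print("crop=false buckets:", sorted(bucketed))
print("crop=true buckets:", sorted(squared))
```

Training on both groupings means the same image is seen once in its native shape and once as a square, which is the mixing the explanation above is describing.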
-
I used `configure.py` to generate a `multidatabackend.json`. However, it generated 4 datasets: a 512 variant, a 1024 variant, and 2 versions that have random cropping enabled. I don't understand the differences between the variants with `"crop": false` and `"crop": true`. I also don't understand what happens if an image is smaller than, e.g., 512: is it upscaled to 512x512? Will this cause the model to output blurry results (since upscaling will make the image blurry)?