Why are 4 different dataset options generated by default? #1175

slimy-pufflefish · 2024-11-22T09:56:30Z

slimy-pufflefish
Nov 22, 2024

I used configure.py to generate a multidatabackend.json. However, it generated 4 datasets: a 512 variant, a 1024 variant, and 2 versions that have random cropping enabled.

I don't understand the differences between the variants with "crop": false and "crop": true. I also don't understand what happens if an image is smaller than, e.g., 512. Is it upscaled to 512x512? Will this cause the model to output blurry results, since upscaling makes the image blurry?

bghira · 2024-11-22T14:46:00Z

bghira
Nov 22, 2024
Maintainer

this is because models often generalise better across aspect ratios and base resolutions when you include all of them in the training data.

DiT models like Flux or SD3 actually enter representation collapse rather easily when you go too far from their training sequence lengths (resolutions).

crop=false just buckets images by aspect ratio (image shape), and crop=true makes them all squares.

training the same images in both bucketed and square format helps the model avoid tying any particular style or quality to a given resolution bucket, which is a bias that all U-Net and DiT models tend to pick up.
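To make the crop=false versus crop=true distinction concrete, here is a minimal Python sketch. It is not SimpleTuner's implementation: the helper names (`aspect_bucket`, `center_square_box`, `group_by_aspect`) are hypothetical, the two-decimal rounding of the aspect ratio is an assumed bucketing rule, and a centred crop is used purely for illustration (the generated config may use a different crop style, such as random).

```python
from collections import defaultdict


def aspect_bucket(width: int, height: int, precision: int = 2) -> float:
    """Hypothetical helper: round width/height so similar shapes share a bucket."""
    return round(width / height, precision)


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Hypothetical helper: (left, top, right, bottom) of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def group_by_aspect(sizes):
    """Group (width, height) pairs by their rounded aspect ratio."""
    buckets = defaultdict(list)
    for width, height in sizes:
        buckets[aspect_bucket(width, height)].append((width, height))
    return dict(buckets)


# Assumed example image sizes, not taken from the thread.
sizes = [(1024, 768), (2048, 1536), (768, 1024), (1000, 1000)]

# Analogue of crop=false: images keep their shape and are grouped by aspect.
buckets = group_by_aspect(sizes)

# Analogue of crop=true: every image is reduced to a square region.
squares = [center_square_box(w, h) for w, h in sizes]

for aspect, members in sorted(buckets.items()):
    print(f"aspect {aspect}: {members}")
for size, box in zip(sizes, squares):
    print(f"{size} -> square crop box {box}")
```

In this sketch the two landscape images share one bucket while the portrait and square images each get their own, whereas the square crop gives every image the same shape regardless of its original aspect.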
