🔥🔥🔥 FreeScale is a tuning-free method for higher-resolution visual generation, unlocking 8k image generation!
Haonan Qiu, Shiwei Zhang*, Yujie Wei, Ruihang Chu, Hangjie Yuan,
Xiang Wang, Yingya Zhang, and Ziwei Liu†
(* Project Leader, † Corresponding Author)
From Alibaba Group and Nanyang Technological University.
conda create -n freescale python=3.8
conda activate freescale
pip install -r requirements.txt
🤗 Quick start with Gradio
gradio gradio_app.py
- Modify run_freescale.py as needed.
- Run the following command in the terminal:
python run_freescale.py
# resolutions_list: resolutions for each stage of self-cascade upscaling.
# cosine_scale: detail scale, usually 1.0 ~ 2.0. For 8k image generation, cosine_scale <= 1.0 is recommended.
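The two settings above can be pictured with a small sketch. The helpers `build_resolutions_list` and `suggest_cosine_scale` are hypothetical and not part of the repository. They assume each self-cascade stage doubles the side length from a 1024 x 1024 base, and that 2.0 is a reasonable default detail scale. Adjust both to match what run_freescale.py actually expects.

```python
def build_resolutions_list(target, base=1024):
    """Return [[height, width], ...] stages, doubling from `base` up to `target`.

    Hypothetical helper: the doubling schedule is an assumption for illustration.
    """
    if target < base:
        raise ValueError("target must be at least the base resolution")
    stages = []
    side = base
    while side < target:
        stages.append([side, side])
        side *= 2
    stages.append([target, target])
    return stages


def suggest_cosine_scale(target, default=2.0):
    """Cap the detail scale at 1.0 for 8k targets, following the comment above."""
    return min(default, 1.0) if target >= 8192 else default


resolutions_list = build_resolutions_list(4096)
cosine_scale = suggest_cosine_scale(4096)
print(resolutions_list)
print(cosine_scale)
```

The resulting values would then be copied into run_freescale.py by hand, since the script is configured by editing it directly.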
- Modify run_sdxl.py and generate the base image at the original resolution.
- Run the following command in the terminal:
python run_sdxl.py
- Put the generated image into the folder imgen_intermediates.
- (Optional) Generate a mask using another segmentation model (e.g., Segment Anything) and put the mask into the folder imgen_intermediates.
- Modify run_freescale_imgen.py and generate the final image at the higher resolution.
- Run the following command in the terminal:
python run_freescale_imgen.py
# resolutions_list: resolutions for each stage of self-cascade upscaling.
# cosine_scale: detail scale for foreground, usually 2.0 ~ 3.0.
# cosine_scale_bg: detail scale for background, usually 0.5 ~ 1.0.
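As a rough illustration of how a mask can select between the foreground and background detail scales, the sketch below builds a per-pixel scale map. The function `detail_scale_map` is hypothetical, and the linear blend is an assumption made for clarity. It is not taken from run_freescale_imgen.py, which may combine the two values differently.

```python
def detail_scale_map(mask, cosine_scale=2.0, cosine_scale_bg=0.5):
    """Blend two detail scales with a mask whose values lie in [0, 1].

    mask: 2D list of floats, 1.0 for foreground and 0.0 for background.
    Hypothetical helper: a linear blend is assumed for illustration only.
    """
    blended = []
    for row in mask:
        out_row = []
        for m in row:
            m = min(max(float(m), 0.0), 1.0)  # clamp to the valid mask range
            out_row.append(m * cosine_scale + (1.0 - m) * cosine_scale_bg)
        blended.append(out_row)
    return blended


# A 4 x 4 mask with a 2 x 2 foreground square in the middle.
mask = [[1.0 if 1 <= r <= 2 and 1 <= c <= 2 else 0.0 for c in range(4)]
        for r in range(4)]
scale = detail_scale_map(mask)
for row in scale:
    print(row)
```

Foreground pixels receive the larger scale (more added detail) and background pixels the smaller one, which matches the ranges suggested in the comments above.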
- Modify run_freescale_turbo.py as needed.
- Run the following command in the terminal:
python run_freescale_turbo.py
# num_inference_steps: 2 ~ 8.
# Currently, resolutions exceeding 2048 x 2048 will introduce quality loss in the Turbo mode.
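Before launching a Turbo run, the two constraints above can be checked with a small sanity check. The function `check_turbo_settings` is hypothetical and not part of the repository. It assumes "exceeds 2048 x 2048" means either side is larger than 2048.

```python
def check_turbo_settings(num_inference_steps, resolutions_list, max_side=2048):
    """Return a list of warning strings for settings outside the suggested ranges.

    Hypothetical helper: the thresholds mirror the comments above.
    """
    warnings = []
    if not 2 <= num_inference_steps <= 8:
        warnings.append(
            f"num_inference_steps={num_inference_steps} is outside the suggested 2 ~ 8 range"
        )
    for height, width in resolutions_list:
        if height > max_side or width > max_side:
            warnings.append(
                f"{height} x {width} exceeds {max_side} x {max_side}; expect quality loss in Turbo mode"
            )
    return warnings


ok = check_turbo_settings(4, [[1024, 1024], [2048, 2048]])
risky = check_turbo_settings(20, [[1024, 1024], [4096, 4096]])
print(ok)
for message in risky:
    print(message)
```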
- Generating 8k (8192 x 8192) images takes around 55 GB of memory and 1 hour on an NVIDIA A800.
- Setting fast_mode = True can significantly shorten the time but leads to some loss of quality, especially for 8k image generation.
- For 8k image generation, cosine_scale <= 1.0 is recommended. Alternatively, use the Flexible Control for Detail Level function and set a small cosine_scale_bg (e.g., 0.5) for areas with artifacts.
- Potentially, real images or images generated by other models (e.g., FLUX) can be used as the intermediates for Flexible Control for Detail Level. In this way, FreeScale becomes an img-to-img approach. However, since SDXL may not reconstruct the given content well, unexpected changes can easily occur. Finding a prompt that lets SDXL reconstruct the given content as closely as possible is particularly important for generation quality.
If you have any questions about FreeScale, feel free to contact Haonan Qiu.
- [2024.12.22]: 🔥🔥 Release FreeScale for SDXL-Turbo, trading slight quality loss for a significant speedup.
- [2024.12.13]: 🔥🔥 Release FreeScale (based on SDXL), higher-resolution image generation!
@article{qiu2024freescale,
title={FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion},
author={Qiu, Haonan and Zhang, Shiwei and Wei, Yujie and Chu, Ruihang and Yuan, Hangjie and Wang, Xiang and Zhang, Yingya and Liu, Ziwei},
journal={arXiv preprint arXiv:2412.09626},
year={2024}
}