Feature/EulerScheduler #138
Conversation
Thank you so much for submitting your work! I can see that most of the job has been completed.

And don't forget to lint your code with black and ruff (and check types with pyright):

```bash
poetry run black .
poetry run ruff . --fix
```
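If pyright is set up in the project's dev dependencies, type checking can presumably be run the same way (the exact invocation may differ per setup):

```bash
poetry run pyright
```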
*2 resolved review comments on `src/refiners/foundationals/latent_diffusion/schedulers/euler.py` (outdated)*
I made an end-to-end test with the following script:

```python
from pathlib import Path
from refiners.fluxion.utils import manual_seed
from refiners.foundationals.latent_diffusion.schedulers.euler import EulerScheduler
from refiners.foundationals.latent_diffusion.stable_diffusion_1.model import StableDiffusion_1
import torch

torch.set_grad_enabled(False)

test_device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
hub = Path("/mnt/ssd2/hub/finegrain/stable-diffusion-1-5")

scheduler = EulerScheduler(30, device=test_device)
sd = StableDiffusion_1(device=test_device, scheduler=scheduler)
sd.unet.load_from_safetensors(hub / "unet.safetensors")
sd.lda.load_from_safetensors(hub / "lda.safetensors")
sd.clip_text_encoder.load_from_safetensors(hub / "CLIPTextEncoderL.safetensors")

n_steps = 30
prompt = "a cute cat, detailed high-quality professional image"
negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"
clip_text_embedding = sd.compute_clip_text_embedding(text=prompt, negative_text=negative_prompt)
sd.set_num_inference_steps(n_steps)

manual_seed(2)
x = torch.randn(1, 4, 64, 64, device=test_device)

for step in sd.steps:
    x = sd(
        x,
        step=step,
        clip_text_embedding=clip_text_embedding,
        condition_scale=7.5,
    )

predicted_image = sd.lda.decode_latents(x)
predicted_image.save("cute_cat_euler.png")
```
Some parts are missing. I checked that adding them should solve the issue.
*3 resolved review comments on `src/refiners/foundationals/latent_diffusion/schedulers/euler.py` (outdated)*
I came back to the 'epsilon' prediction and added the scaling functions. With this, I was able to generate a couple of images, but the algorithm is still quite unstable. The image below, for instance, was generated with `s_churn=1.0` and `s_noise=1.1` (I set these values as the defaults for now, so you can recreate it). You can test it using the following code:

```python
from pathlib import Path
from refiners.fluxion.utils import manual_seed
from refiners.foundationals.latent_diffusion.schedulers.euler import EulerScheduler
from refiners.foundationals.latent_diffusion.stable_diffusion_1.model import StableDiffusion_1
from tqdm import tqdm
import torch

torch.set_grad_enabled(False)

test_device = torch.device("mps")
hub = Path("./tests/weights")

n_steps = 30
scheduler = EulerScheduler(n_steps, device=test_device)
sd = StableDiffusion_1(device=test_device, scheduler=scheduler)
sd.unet.load_from_safetensors(hub / "unet.safetensors")
sd.lda.load_from_safetensors(hub / "lda.safetensors")
sd.clip_text_encoder.load_from_safetensors(hub / "CLIPTextEncoderL.safetensors")

prompt = "a cute cat, detailed high-quality professional image"
negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"
clip_text_embedding = sd.compute_clip_text_embedding(text=prompt, negative_text=negative_prompt)
sd.set_num_inference_steps(n_steps)

manual_seed(2)
x = torch.randn(1, 4, 64, 64, device=test_device)
x = x * scheduler.init_noise_sigma

for step in tqdm(sd.steps):
    x = sd(
        x,
        step=step,
        clip_text_embedding=clip_text_embedding,
        condition_scale=7.5,
    )

predicted_image = sd.lda.decode_latents(x)
predicted_image.save("cute_cat_euler.png")
```

Notice that I'm scaling the latents before the loop, and I'm also adding the scaling before the UNet call. This is the solution used in diffusers; for the other schedulers, the scaling function just returns the same input (here).
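To make the scaling above concrete, here is a minimal sketch of the two hooks being discussed, following the epsilon-prediction Euler formulation used by diffusers; the class and method names are illustrative only and are not the actual refiners API:

```python
import torch


class EulerScalingSketch:
    """Illustrative only: the two scaling hooks mentioned above."""

    def __init__(self, sigmas: torch.Tensor) -> None:
        self.sigmas = sigmas  # per-step noise levels, in decreasing order

    @property
    def init_noise_sigma(self) -> torch.Tensor:
        # The initial latents are multiplied by this once, before the loop.
        # Depending on the exact formulation this is either sigma_max or
        # sqrt(sigma_max**2 + 1); the latter is shown here.
        return (self.sigmas.max() ** 2 + 1) ** 0.5

    def scale_model_input(self, x: torch.Tensor, step: int) -> torch.Tensor:
        # Applied to the latents right before each UNet call. For the other
        # schedulers this hook can simply return x unchanged, which matches
        # the "returns the same input" behaviour mentioned above.
        sigma = self.sigmas[step]
        return x / ((sigma**2 + 1) ** 0.5)
```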
Thanks! I will have a close look, stay tuned. In the meantime, could you please sync your main branch with https://github.com/finegrain-ai/refiners and then rebase?
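In case it helps, one possible way to do that sync and rebase (assuming the upstream remote is not configured yet and the PR branch is named feature/euler-scheduler):

```bash
git remote add upstream https://github.com/finegrain-ai/refiners.git
git fetch upstream
git checkout main
git merge --ff-only upstream/main     # sync the fork's main with upstream
git checkout feature/euler-scheduler
git rebase main                       # replay the PR commits on top of the synced main
git push --force-with-lease           # update the PR branch
```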
*2 resolved review comments on `src/refiners/foundationals/latent_diffusion/schedulers/euler.py` (outdated)*
Hey @israfelsr, are you done with it? i.e. is it ready for (final) review? Thanks!
Ready for the final review! 🙌🏽
Please see the additional comments. Also, could you please rebase on main so as to incorporate the latest changes?
*5 resolved review comments on `src/refiners/foundationals/latent_diffusion/schedulers/euler.py` (outdated)*
Please take a look at the two final comments. We'd be good to go right after. Thanks!
*1 resolved review comment on `src/refiners/foundationals/latent_diffusion/schedulers/euler.py` (outdated)*
Hi!

This PR implements the EulerScheduler for Stable Diffusion. I based the implementation on *Elucidating the Design Space of Diffusion-Based Generative Models* and tested the behaviour against the implementation in Diffusers.

Euler needs some extra inputs in the `step` function: `s_t_min`, `s_t_max`, `s_churn` and `s_noise`. I hardcoded the necessary ones for now. Should I add them with the suggested defaults? The rest is working as expected!

The code to test the implementation is in the scheduler test file. You can directly call the test from `test_schedulers`.
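For reference, a minimal sketch of how these inputs enter the Karras et al. sampler, with the deterministic defaults (no churn); the function name and exact integration into the scheduler's `step` are illustrative assumptions:

```python
import math


def compute_gamma(
    sigma: float,
    num_inference_steps: int,
    s_churn: float = 0.0,   # 0.0 keeps the sampler fully deterministic
    s_t_min: float = 0.0,   # churn is only applied when s_t_min <= sigma <= s_t_max
    s_t_max: float = math.inf,
) -> float:
    """Amount of extra noise ("churn") to inject at noise level sigma."""
    if s_churn > 0 and s_t_min <= sigma <= s_t_max:
        return min(s_churn / (num_inference_steps - 1), math.sqrt(2) - 1)
    return 0.0
```

When gamma > 0, the step first inflates sigma to sigma_hat = sigma * (gamma + 1) and adds `s_noise`-scaled Gaussian noise before the usual Euler update; with `s_churn = 0` this reduces to the plain deterministic sampler.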