SDXL improvements (and support for Draft+) [DRAFT PR] #9543

rohitrango · 2024-06-26T00:21:58Z

What does this PR do ?

This PR extends the SDXL class of models to support Draft+. It also includes:

- Bug fix for adapter control in LoRA (LinearParallelAdapter)
- Conversion script for SDXL from Hugging Face weights to NeMo (the existing converter script doesn't work)
- Additional inference config file to load the converted HF weights
- FSDP strategy for multimodal models
- Function that returns the adapter config, similar to the function that returns the adapter itself

Code definitely needs cleanup.
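
As an illustration of what a checkpoint conversion of this kind typically involves, here is a minimal sketch of renaming state-dict keys by prefix. It is not taken from `convert_stablediffusion_hf_to_nemo.py`: the prefixes `src.block` and `dst.block` and the helpers `remap_key` and `convert_state_dict` are hypothetical, and plain Python values stand in for tensors.

```python
# Hypothetical prefix mapping; the real converter's mapping is not shown here.
PREFIX_MAP = {
    "src.block": "dst.block",
    "src.head": "dst.out",
}


def remap_key(key, prefix_map=PREFIX_MAP):
    """Swap the first matching source prefix for its target prefix."""
    for src, dst in prefix_map.items():
        # Match whole dotted components only, so "src.blocks" is not caught by "src.block".
        if key == src or key.startswith(src + "."):
            return dst + key[len(src):]
    return key  # keys without a mapping pass through unchanged


def convert_state_dict(source_state, prefix_map=PREFIX_MAP):
    """Return a new dict with remapped keys, failing loudly on collisions."""
    converted = {}
    for key, value in source_state.items():
        new_key = remap_key(key, prefix_map)
        if new_key in converted:
            raise KeyError(f"two source keys map to {new_key!r}")
        converted[new_key] = value
    return converted
```

Raising on a key collision is a deliberate choice in this sketch: a silently overwritten weight is much harder to diagnose than a failed conversion.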

Collection: [Note which collection this PR will affect]

Changelog

Add specific line by line info of high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

Related to # (issue)


* add slurm files to .gitignore
* add differentiable decode to SDXL VAE
* Optionally return predicted noise during the single step sampling process
* also change `get_gamma` as a new function to use inside other functions which may interact with sampling (e.g. draft+)
* debugging sdunet converter script
* Added SD/SDXL conversion script from HF to NeMo
* added 'from_nemo' config for VAE
* tmp commit, please make changes (oci is super slow, cannot even run vim)
* new inference yaml works
* add logging to autoencoder
* !(dont squash) Added enabling support for LinearWrapper for SDLoRA
* added samples_per_batch and fsdp arguments to SDXL inference
* added extra optionally wrapper to FSDP
* remove unncessary comments
* remove unnecessary comments
* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

---------

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Rohit Jena <rohitkumarj@nvidia.com>
Co-authored-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>


rohitrango and others added 11 commits June 25, 2024 15:27

add slurm files to .gitignore

7bf8ef7

add differentiable decode to SDXL VAE

08248eb

Optionally return predicted noise during the single step sampling pro…

f88961b

…cess * also change `get_gamma` as a new function to use inside other functions which may interact with sampling (e.g. draft+)

debugging sdunet converter script

be203c3

Added SD/SDXL conversion script from HF to NeMo

45a399c

* added 'from_nemo' config for VAE

tmp commit, please make changes (oci is super slow, cannot even run vim)

62bd74a

new inference yaml works

f2db166

add logging to autoencoder

7c66b14

!(dont squash) Added enabling support for LinearWrapper for SDLoRA

6507702

added samples_per_batch and fsdp arguments to SDXL inference

398a18e

added extra optionally wrapper to FSDP

0b69f4b
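
The commits above mention factoring `get_gamma` into its own function and optionally returning the predicted noise from a single sampling step. The sketch below shows what those two pieces could look like under the stochastic sampler convention of the EDM paper (Karras et al., 2022, Algorithm 2). It is an assumption that the NeMo sampler follows this convention. The signatures, the `euler_step` helper, and the `return_noise` flag are hypothetical, and scalars stand in for tensors.

```python
import math


def get_gamma(sigma, num_sigmas, s_churn=0.0, s_tmin=0.0, s_tmax=math.inf):
    """Churn factor for one step: nonzero only while sigma lies in [s_tmin, s_tmax]."""
    if num_sigmas < 2:
        raise ValueError("need at least two sigma values to take a step")
    if s_tmin <= sigma <= s_tmax:
        # num_sigmas - 1 is the number of sampling steps.
        return min(s_churn / (num_sigmas - 1), math.sqrt(2.0) - 1.0)
    return 0.0


def euler_step(x, sigma, next_sigma, denoise, gamma=0.0, noise=0.0, return_noise=False):
    """One Euler step; optionally also return the noise estimate used for the update."""
    sigma_hat = sigma * (gamma + 1.0)
    if gamma > 0.0:
        # Raise the noise level from sigma to sigma_hat before denoising.
        x = x + noise * math.sqrt(sigma_hat**2 - sigma**2)
    denoised = denoise(x, sigma_hat)
    predicted_noise = (x - denoised) / sigma_hat
    x_next = x + predicted_noise * (next_sigma - sigma_hat)
    return (x_next, predicted_noise) if return_noise else x_next
```

Keeping `get_gamma` separate lets code outside the sampling loop compute the same churn value without duplicating the range check, which is presumably the motivation stated in the commit message.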

github-actions bot added core Changes to NeMo Core NLP Multi Modal labels Jun 26, 2024

ericharper requested a review from yaoyu-33 July 1, 2024 17:28

yaoyu-33 reviewed Jul 2, 2024

nemo/collections/multimodal/models/text_to_image/stable_diffusion/ldm/autoencoder.py (resolved)

yaoyu-33 reviewed Jul 2, 2024

nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py (outdated, resolved)

yaoyu-33 reviewed Jul 2, 2024

nemo/collections/multimodal/parts/utils.py (outdated, resolved)

remove unncessary comments

f73abe3

ericharper added the Run CICD label Jul 6, 2024

github-advanced-security bot found potential problems Jul 6, 2024


remove unnecessary comments

f411298

rohitrango requested a review from yaoyu-33 July 8, 2024 18:55

terrykong mentioned this pull request Jul 8, 2024

Draft+ for SDXL [draft] NVIDIA/NeMo-Aligner#222

Merged

8 tasks

yaoyu-33 and others added 2 commits July 8, 2024 14:58

Merge branch 'main' into sdxl_draft

188686c

Apply isort and black reformatting

d3e728e

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

github-advanced-security bot found potential problems Jul 8, 2024


scripts/checkpoint_converters/convert_stablediffusion_hf_to_nemo.py

from nemo.utils import logging

intkey = lambda x: int(x)

Check notice: Code scanning / CodeQL

Unnecessary lambda (Note)

This 'lambda' is just a simple wrapper around a callable object. Use that object directly.
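
The notice can be addressed by binding the name to `int` itself. The sketch below shows that the two forms behave the same. Using `intkey` as a sort key over numeric strings is an assumption for illustration, since the call sites in the converter script are not shown here.

```python
# Illustrative data: numeric layer indices stored as strings.
layer_ids = ["10", "2", "1"]

# Flagged pattern: the lambda only forwards its argument to int.
intkey_lambda = lambda x: int(x)

# Equivalent and simpler: refer to the callable directly.
intkey = int

sorted_ids = sorted(layer_ids, key=intkey)

# Both keys order the strings numerically rather than lexicographically.
assert sorted_ids == sorted(layer_ids, key=intkey_lambda)
print(sorted_ids)  # ['1', '2', '10']
```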

yaoyu-33 added Run CICD and removed Run CICD labels Jul 8, 2024

yaoyu-33 approved these changes Jul 9, 2024


yaoyu-33 merged commit 1c73e1b into NVIDIA:main Jul 9, 2024
113 checks passed
