Naifu Diffusion is a project for finetuning Stable Diffusion on images and captions.
Still under testing; see config/test.yaml for configuration options.
Colab example: https://colab.research.google.com/drive/1Xf1tnsP4fu8y5MoYbK1pz08jmyMiTrvv
Currently implemented features:
- Aspect ratio bucketing and custom batching
- Using the hidden states of CLIP's penultimate layer (see the sketch after this list)
- NAI-style tag processing
- Extending the Stable Diffusion token limit by 3x (beta)
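For context, the penultimate-layer trick conditions the U-Net on the hidden states one layer before CLIP's final output rather than on the last layer itself. A minimal sketch with Hugging Face `transformers` follows; the model name, prompt, and final LayerNorm step are illustrative assumptions, not this trainer's exact code:

```python
# Minimal sketch: take CLIP's penultimate hidden states as text conditioning.
# Model name and prompt are placeholders; the trainer's actual wiring may differ.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

tokens = tokenizer(["1girl, solo, looking at viewer"], padding="max_length",
                   max_length=tokenizer.model_max_length, return_tensors="pt")
with torch.no_grad():
    out = text_encoder(**tokens, output_hidden_states=True)

# hidden_states[-1] is the final layer; [-2] is the penultimate layer used here.
penultimate = out.hidden_states[-2]
# NAI-derived models typically re-apply the encoder's final LayerNorm to these states.
penultimate = text_encoder.text_model.final_layer_norm(penultimate)
print(penultimate.shape)  # (1, 77, 768)
```

The penultimate states have the same shape as the final-layer output, so the U-Net's cross-attention does not need to change.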
Deployment notes
By default there is no need to prepare datasets or models; the script downloads them automatically.
Clone the repo:
```bash
git clone https://github.com/Mikubill/naifu-diffusion
cd naifu-diffusion
```
Install dependencies:
```bash
# by conda
conda env create -f environment.yaml
conda activate nai

# OR by pip
pip install -r requirements.txt
```
Start training:

```bash
# test
torchrun trainer.py --model_path=/tmp/model --config config/test.yaml

# multi-GPU
torchrun trainer.py --model_path=/tmp/model --config config/multi-gpu.yaml

# distributed
torchrun trainer.py --model_path=/tmp/model --config config/distributed.yaml
```
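Note that torchrun usually needs to be told how many local processes to start; the process count below is an illustrative assumption for a single 4-GPU node:

```bash
# Illustrative: one process per GPU; adjust --nproc_per_node to your hardware.
torchrun --nproc_per_node=4 trainer.py --model_path=/tmp/model --config config/multi-gpu.yaml
```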
Convert checkpoint files for use in an SD-based webui:
```bash
python convert.py --src /path/last.ckpt --dst /path/last.ckpt
```
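As a usage example, assuming AUTOMATIC1111's stable-diffusion-webui (the destination path depends on your webui install):

```bash
# Illustrative: place the converted checkpoint where the webui looks for models.
cp /path/last.ckpt /path/to/stable-diffusion-webui/models/Stable-diffusion/
```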