MaP-DiT: Magnitude-Preserving Diffusion Transformer

A magnitude-preserving implementation of Diffusion Transformers, resulting in faster convergence and improved performance.

This project builds upon key concepts from the following research papers:

  • Peebles & Xie (2023) explore the application of transformer architectures to diffusion models, achieving state-of-the-art performance on class-conditional image generation;
  • Karras et al. (2024) introduce magnitude-preserving network layers that keep activation magnitudes under control throughout training, improving the stability and quality of generated outputs; a minimal sketch of the idea follows this list.
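
To make the magnitude-preservation idea concrete, below is a minimal PyTorch sketch of a magnitude-preserving linear layer and a magnitude-preserving residual sum in the spirit of Karras et al. (2024). It is an illustration of the technique only, not the implementation used in this repository; the names MPLinear and mp_sum are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MPLinear(nn.Module):
    """Linear layer whose weight rows are re-normalized to unit L2 norm, so that
    inputs with roughly unit-variance features yield outputs with roughly unit variance."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            with torch.no_grad():
                # Forced weight normalization: keep the stored parameter at unit norm
                # so its magnitude cannot drift over the course of training.
                self.weight.copy_(F.normalize(self.weight, dim=1))
        # Normalize again in the forward pass so the normalization is part of the graph.
        w = F.normalize(self.weight, dim=1)
        return F.linear(x, w)

def mp_sum(a: torch.Tensor, b: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    # Magnitude-preserving blend of two roughly unit-variance signals
    # (e.g. a residual branch and its skip connection): the lerp is divided by the
    # norm of its coefficients so the output variance stays close to one.
    return ((1.0 - t) * a + t * b) / ((1.0 - t) ** 2 + t ** 2) ** 0.5

In a DiT block, layers of this kind would stand in for the standard linear projections, and mp_sum would replace the plain residual addition.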

Preliminary Results

Below, we present preliminary results for DiT-S/2 trained on ImageNet-128 with magnitude preservation (right) and without it (left). Note that DiT-S/2 is a very small model, so the samples are not of high quality; nevertheless, MaP-DiT produces noticeably more consistent and higher-quality samples than vanilla DiT.

Fig 1. DiT-S/2 samples of Jay without (left) and with (right) magnitude-preserving layers.

Fig 2. DiT-S/2 samples of Macaw without (left) and with (right) magnitude-preserving layers.

Fig 3. DiT-S/2 samples of St. Bernard without (left) and with (right) magnitude-preserving layers.

Fig 4. DiT-S/2 samples of Mushroom without (left) and with (right) magnitude-preserving layers.

Training

python train.py --data-path /path/to/data --results-dir /path/to/results --model DiT-S/2 --num-steps 400_000 <map feature flags>
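
For example, a run with illustrative paths might look as follows; the magnitude-preservation flags themselves are repository-specific, so they are omitted here and the available options should be taken from train.py:

python train.py --data-path ./data/imagenet128 --results-dir ./results --model DiT-S/2 --num-steps 400_000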

Sampling

python sample.py --result-dir /path/to/results/<dir> --class-label <class label>
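
For example (both the results directory and the class index below are illustrative; the class label is presumably the ImageNet class index used during training):

python sample.py --result-dir ./results/dit-s-2-map --class-label 88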
