This repository has been archived by the owner on Sep 7, 2022. It is now read-only.
forked from CompVis/stable-diffusion
Benchmarking
hlky edited this page Aug 28, 2022 · 1 revision
Some people may be interested in how different settings (sampler, resolution, batch size) affect generation speed and VRAM usage.
This page describes the start of a standardized benchmarking method and is subject to change.
Example run
Hardware
- CPU: Xeon E5-2630v2
- RAM: 96GB DDR3
- GPU: NVIDIA RTX 3060 12GB
Software
- OS: Windows 10 LTSC 21H2
prompt:
anime girl holding a giant NVIDIA Tesla A100 GPU graphics card, Anime Blu-Ray boxart, super high detail
seed:
hlky
settings:
height, width = 512
cfg scale = 7.5
batch count, batch size = 1
sampling steps = 50
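A minimal sketch of how a run with these settings could be timed, so that the same settings are reused for every sampler. `BenchmarkSettings`, `run_benchmark`, `format_result`, and the `generate` callable are hypothetical names for illustration, not part of the web UI; only the "Took ... total (... per image)" wording is taken from the results on this page. Peak memory is not measured here.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSettings:
    # Defaults mirror the settings listed on this page.
    height: int = 512
    width: int = 512
    cfg_scale: float = 7.5
    batch_count: int = 1
    batch_size: int = 1
    sampling_steps: int = 50


def format_result(total_seconds, images):
    """Format timings like the result lines on this page."""
    per_image = total_seconds / images
    return f"Took {round(total_seconds, 2)}s total ({round(per_image, 2)}s per image)"


def run_benchmark(generate, settings, sampler):
    """Time one full job. `generate` is a hypothetical callable that
    produces batch_count * batch_size images with the given sampler."""
    images = settings.batch_count * settings.batch_size
    start = time.perf_counter()
    generate(settings, sampler)
    total = time.perf_counter() - start
    return format_result(total, images)
```

Calling `run_benchmark(my_generate, BenchmarkSettings(), "k_lms")` once per sampler, with `my_generate` wrapping whatever generation function is being measured, keeps every run on identical settings.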
k_lms
Took 13.58s total (13.58s per image) Peak memory usage: 7615 MiB / 12288 MiB / 61.964%
k_heun
Took 24.09s total (24.09s per image) Peak memory usage: 7667 MiB / 12288 MiB / 62.388%
k_euler
Took 12.35s total (12.35s per image) Peak memory usage: 7663 MiB / 12288 MiB / 62.357%
k_euler_a
Took 12.36s total (12.36s per image) Peak memory usage: 7665 MiB / 12288 MiB / 62.373%
k_dpm_2
Took 24.18s total (24.18s per image) Peak memory usage: 7655 MiB / 12288 MiB / 62.291%
k_dpm_2_a
Took 24.16s total (24.16s per image) Peak memory usage: 7670 MiB / 12288 MiB / 62.417%
PLMS
Took 13.01s total (13.01s per image) Peak memory usage: 7665 MiB / 12288 MiB / 62.371%
DDIM
Took 12.4s total (12.4s per image) Peak memory usage: 7668 MiB / 12288 MiB / 62.395%
A template will be added and a Discussion opened for sharing results.
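Until a template exists, shared results could be compared by parsing the result lines directly. The sketch below assumes the line format shown in the example run on this page; `parse_result` and `slowdown_vs_fastest` are hypothetical helper names, not part of the web UI.

```python
import re

# Matches lines such as:
# Took 13.58s total (13.58s per image) Peak memory usage: 7615 MiB / 12288 MiB / 61.964%
RESULT_RE = re.compile(
    r"Took (?P<total>[\d.]+)s total \((?P<per_image>[\d.]+)s per image\) "
    r"Peak memory usage: (?P<peak>\d+) MiB / (?P<vram>\d+) MiB / (?P<pct>[\d.]+)%"
)


def parse_result(line):
    """Turn one result line into a dict of numbers."""
    match = RESULT_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    return {
        "total_s": float(match["total"]),
        "per_image_s": float(match["per_image"]),
        "peak_mib": int(match["peak"]),
        "vram_mib": int(match["vram"]),
        "peak_pct": float(match["pct"]),
    }


def slowdown_vs_fastest(results):
    """Map each sampler name to its per-image time relative to the fastest."""
    fastest = min(r["per_image_s"] for r in results.values())
    return {name: round(r["per_image_s"] / fastest, 2) for name, r in results.items()}
```

Applied to the example run, this makes it easy to see that k_heun, k_dpm_2, and k_dpm_2_a take roughly twice as long per image as the other samplers while peak memory stays nearly the same.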