Tricks to speed up SDXL during development

I like how Jeremy always tries things on a small dataset etc. to quickly check that things work, rather than waiting hours.

Any tricks/ideas for getting stabilityai/stable-diffusion-xl-base-1.0 to run faster? I have a P5000 on Paperspace, and even inference (generating a single image) takes a couple of minutes — and I need to generate a couple hundred of them.

I can lower the number of steps (I get reasonable images in 30 steps instead of 50), but I'm looking for more ideas. Once I'm confident everything is working, I can scale things back up and wait.
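For a rough sense of what these knobs buy, here's a back-of-envelope cost model I've been using (an assumption, not a benchmark: it treats generation time as linear in step count and pixel count, which undercounts attention layers, so treat the result as optimistic):

```python
# Hypothetical cost model: generation time ~ steps * pixels.
# Real scaling is worse for attention layers, so this is a lower bound
# on what you actually pay at higher resolutions.
def relative_cost(steps, height, width, base_steps=50, base_size=1024):
    """Cost relative to the SDXL default of 50 steps at 1024x1024."""
    return (steps / base_steps) * (height * width) / (base_size ** 2)

# 30 steps at 768x768 vs the default 50 steps at 1024x1024:
print(relative_cost(30, 768, 768))  # -> 0.3375
```

So dropping steps and resolution together should cut the per-image time to roughly a third, even before touching precision.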

Can I generate smaller images? Use lower-precision floats (I'm already on fp16)? Any other ideas?
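In case it helps anyone, the combination I'm describing looks roughly like this in diffusers (a sketch I haven't timed; the loader arguments and call signature assume the standard `StableDiffusionXLPipeline` API, and the imports are deferred into the functions so the snippet reads standalone):

```python
# Sketch: load SDXL once in fp16, then generate smaller, fewer-step previews.
def make_preview_pipe():
    import torch
    from diffusers import StableDiffusionXLPipeline

    return StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,  # fp16 weights halve memory and bandwidth
        variant="fp16",             # fetch the fp16 checkpoint shards
        use_safetensors=True,
    ).to("cuda")

def preview(pipe, prompt):
    # 768x768 at 30 steps: SDXL quality drops off below ~768px, but this is
    # usually enough to see whether the pipeline is wired up correctly.
    result = pipe(prompt, num_inference_steps=30, height=768, width=768)
    return result.images[0]
```

Load the pipeline once and reuse it across all few hundred prompts; reloading it per image would dominate the runtime.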

More context: I'm trying to reproduce "The Chosen One: Consistent Characters in Text-to-Image Diffusion Models".


Hi, my feedback will be harsh: the P5000 has NO tensor cores. Look at the comparison table for the RTX 5000, P5000, and A4000 — the P5000's mixed-precision FP16 throughput is measured in mere GFLOPS! Change GPUs: try a T4 on Colab or Kaggle (Kaggle gives you 1-2 T4s for 30 hours per week at no charge). The RTX 5000 is also a good option and works well in half precision. Judge for yourself:
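You can also check this on whatever box you're on (assuming PyTorch is installed): tensor cores arrived with Volta at compute capability 7.0, and Pascal cards like the P5000 sit at 6.1, so their fp16 takes a slow fallback path.

```python
import torch

# Tensor cores require compute capability >= 7.0 (Volta and newer).
# P5000 (Pascal) is 6.1; T4 (Turing) is 7.5.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    name = torch.cuda.get_device_name()
    print(f"{name}: compute capability {major}.{minor}, "
          f"tensor cores: {'yes' if major >= 7 else 'no'}")
else:
    print("no CUDA device visible")
```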

*table generated with AI