Lesson 9B: Math of Diffusion

Hey all! This is to discuss the supplementary Math of Stable Diffusion (“Lesson 9B”) part of Lesson 9.

This is a wiki topic - please feel free to add your favourite relevant links and chat about anything related to the math of stable diffusion.


Lesson resources

Links from the lesson

More info


@seem I don’t think this has been made a wiki topic yet.


It was mentioned that when getting the VAE latent embeddings, the constant 0.18215 is used to scale the latents, as in the original paper. Was there a reason this specific number was picked (i.e. does it have some special property), or was it more “we tried many values and this one seemed to work the best”?


Here is an explanation directly from the lead author/developer of latent diffusion and Stable Diffusion:

We introduced the scale factor in the latent diffusion paper. The goal was to handle different latent spaces (from different autoencoders, which can be scaled quite differently than images) with similar noise schedules. The scale_factor ensures that the initial latent space on which the diffusion model is operating has approximately unit variance. Hope this helps :slight_smile:
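To make the “approximately unit variance” idea concrete, here is a minimal sketch of how such a factor could be estimated: take the reciprocal of the standard deviation of a batch of latents. This is an illustration, not the authors’ actual procedure. The function name `estimate_scale_factor` is hypothetical, and the latents are synthetic Gaussian samples whose spread (5.5) was picked only so that the result lands near 0.18; it is not a measured value.

```python
import random
import statistics


def estimate_scale_factor(latents):
    """Return 1 / std, so that latents * factor has roughly unit variance."""
    return 1.0 / statistics.pstdev(latents)


random.seed(0)

# Stand-in for flattened VAE encoder outputs (assumption: std around 5.5).
fake_latents = [random.gauss(0.0, 5.5) for _ in range(20_000)]

scale_factor = estimate_scale_factor(fake_latents)
scaled = [z * scale_factor for z in fake_latents]

print(f"estimated scale factor: {scale_factor:.4f}")
print(f"std after scaling:      {statistics.pstdev(scaled):.4f}")
```

With real encoder outputs you would compute the standard deviation over latents from a sample of training images instead of synthetic noise.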


I have been grokking the math for the DDPM paper too. I finally understand how to derive the error formulation!

The paper that has helped me the most is Understanding Diffusion Models: A Unified Perspective (arXiv:2208.11970). To map the math to the code, Denoising Diffusion Probabilistic Models (DDPM) was also super helpful.
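For readers following along, the simplified noise-prediction objective from the DDPM paper is written out below, on the assumption that this is what “error formulation” refers to. Here $x_0$ is a clean sample, $\epsilon$ is standard Gaussian noise, $\bar{\alpha}_t$ is the cumulative product of the per-step $\alpha$ values, and $\epsilon_\theta$ is the network that predicts the noise:

$$
L_{\text{simple}}(\theta) = \mathbb{E}_{t,\, x_0,\, \epsilon}\left[ \left\| \epsilon - \epsilon_\theta\!\left( \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\; t \right) \right\|^2 \right]
$$

The argument of $\epsilon_\theta$ is just the noised sample $x_t$, so the loss is the squared error between the noise that was added and the noise the model predicts.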


Got it. So it’s determined from sampling the two encoders and determining the appropriate scaling factor to normalize between both. Super helpful!


The ELBO intuition in this paper is very solid. https://arxiv.org/pdf/1906.02691.pdf


I like Understanding Diffusion Models: A Unified Perspective, although it took some time (and pain) to get through. The author annotates or explains every single line of math without skipping any steps, which takes away a lot of the guesswork.


FYI I have a whole thread of curated resources delving into the math and intuition of diffusion models.

Since that thread was published there have been even more amazing resources, so I may publish a new thread with some of the newer ones as well.


100%. This level of detail is not needed to train or run inference with Stable Diffusion. However, this is the perfect resource for people who want to go deep and understand the math fully.


The Lesson 9B video by @ilovescience and @seem is now available in the top post, and also here:


I just added this talk on the 2015 paper by Jascha Sohl-Dickstein (lead author) to the wiki, but wanted to highlight it here since I think it’s great and I haven’t seen it mentioned before:


Here are some helpful links about SDEs:
