I feel like this is the last part of the SD mystery for me (hopefully). I was going through the Stable Diffusion deep dive notebook happily until I inspected the internals of scheduler.step. The following is the bit I'm confused about:
```python
# 1. Compute predicted original sample (x_0) from sigma-scaled predicted noise
pred_original_sample = sample - sigma * model_output

# 2. Convert to an ODE derivative
derivative = (sample - pred_original_sample) / sigma
self.derivatives.append(derivative)
if len(self.derivatives) > order:
    self.derivatives.pop(0)

# 3. Compute linear multistep coefficients
order = min(step_index + 1, order)
lms_coeffs = [self.get_lms_coefficient(order, step_index, curr_order) for curr_order in range(order)]

# 4. Compute previous sample based on the derivatives path
prev_sample = sample + sum(
    coeff * derivative for coeff, derivative in zip(lms_coeffs, reversed(self.derivatives))
)
```
I originally thought that the first statement (computing pred_original_sample) was all that was needed. Could someone explain what's happening in the subsequent steps? Do they relate to the alphas mentioned in the papers somehow?
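To make my confusion concrete, here's roughly what I expected the update to look like, as a minimal runnable sketch of my own (the tensor shapes and sigma values are made up, and sigma_next is just my name for the next sigma in the schedule; none of this is the diffusers source):

```python
import torch

# Toy stand-ins so the sketch runs; in the notebook these come from the
# UNet and the scheduler's sigma schedule.
sample = torch.randn(1, 4, 64, 64)        # current noisy latent
model_output = torch.randn(1, 4, 64, 64)  # UNet's noise prediction (epsilon)
sigma, sigma_next = 14.6, 11.0            # two adjacent sigmas (made-up values)

# Step 1 from the scheduler: predicted clean latent x_0
pred_original_sample = sample - sigma * model_output

# Step 2: the "ODE derivative", which simplifies back to model_output
derivative = (sample - pred_original_sample) / sigma
assert torch.allclose(derivative, model_output)

# What I expected step 4 to be: one Euler step from sigma to sigma_next,
# instead of a weighted sum over a history of stored derivatives.
prev_sample = sample + derivative * (sigma_next - sigma)
```

Is the derivatives/coefficients machinery just a higher-order version of this Euler step, or is something else going on?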
Would appreciate any guidance or blog posts on this, unless of course JH already explained this bit in his videos and I missed it?