This is a wiki post - feel free to edit to add links from the lesson or other useful info.
Jeremy, FYI: the DDIM notebook still has a betamax of 0.02, and transformi still scales to -1 to +1.
Great lesson, thanks!
Super cool to see how easy it was to use a callback to send stuff to WandB. I had written this quick-and-dirty callback to do something similar for TensorBoard, and so far it just works:
#|export
from torch.utils.tensorboard import SummaryWriter

class TensorboardCB(Callback):
    order = MetricsCB.order + 1
    def __init__(self, name=None): self.writer = SummaryWriter(comment=f'_{name}')

    def after_batch(self, learn: Learner):
        # Log the loss after every batch, tagged by phase
        train = 'train' if learn.model.training else 'validation'
        idx = learn.dl_len*learn.epoch + learn.iter
        self.writer.add_scalar(f'loss/{train}', learn.loss.item(), idx)
        self.writer.flush()

    def after_epoch(self, learn: Learner):
        if hasattr(learn, 'recorder'):
            # Log all other metrics after each epoch
            d = learn.recorder[-1]
            for k, v in d.items():
                if k == 'loss': continue
                self.writer.add_scalar(f'{k}/{d.train}', v, d.epoch)
            self.writer.flush()

    def after_fit(self, learn: Learner): self.writer.close()
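In case it helps, this is roughly how I attach it to a learner. The Learner and the other callbacks are just the miniai pieces from the course notebooks, so treat this as a sketch and swap in whatever names your own setup uses:

    # Rough usage sketch -- only TensorboardCB is the class defined above;
    # Learner, DeviceCB, MetricsCB and ProgressCB come from the course notebooks.
    cbs = [DeviceCB(), MetricsCB(), ProgressCB(), TensorboardCB(name='fashion_mnist')]
    learn = Learner(model, dls, F.cross_entropy, lr=1e-2, cbs=cbs)
    learn.fit(3)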
And this is the result on the TB side:
It’s cool to be able to see the different runs and all, but after playing with TensorBoard a bit, it’s become clear that the solutions offered by WandB (as well as others) seem much more complete. One big omission is being able to associate a set of hyperparameters with each run, and to easily separate different projects. TensorBoard has had this issue open since ~2017. Nonetheless, getting the callback to work was a fun exercise.
Hey Jeremy, I have used the cosine scheduler but it is not showing any images. I think I'm doing something wrong; please help. I have seen most of the videos and they are so well explained. Thanks in advance.
In ddim_step(), I noticed the eta float value passed in from the sample() function is factored into sigma, but I didn’t notice any explanation for it. I was wondering if this is meant to be a scheduler at some point, but for now it seems useless. If anyone has any insight, the code is below from notebook 20!
def ddim_step(x_t, t, noise, abar_t, abar_t1, bbar_t, bbar_t1, eta):
    # DDIM variance; eta scales how much random noise is re-injected at each step
    vari = ((bbar_t1/bbar_t) * (1-abar_t/abar_t1))
    sig = vari.sqrt()*eta
    # Predicted fully denoised image, then step to the previous (less noisy) timestep
    x_0_hat = ((x_t-bbar_t.sqrt()*noise) / abar_t.sqrt())
    x_t = abar_t1.sqrt()*x_0_hat + (bbar_t1-sig**2).sqrt()*noise
    if t>0: x_t += sig * torch.randn(x_t.shape).to(x_t)
    return x_t
@torch.no_grad()
def sample(f, model, sz, n_steps, skips=1, eta=1.):
    # Reversed timesteps; e.g. skips=10 turns 1000 training steps into 100 sampling steps
    tsteps = list(reversed(range(0, n_steps, skips)))
    x_t = torch.randn(sz).to(model.device)
    preds = []
    for i,t in enumerate(progress_bar(tsteps)):
        abar_t1 = abar[tsteps[i+1]] if t > 0 else torch.tensor(1)
        noise = model(x_t, t).sample
        x_t = f(x_t, t, noise, abar[t], abar_t1, 1-abar[t], 1-abar_t1, eta)
        preds.append(x_t.float().cpu())
    return preds
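For context, this is roughly how I call it. The image size and step counts are just what I happen to use locally, and abar is the global alpha-bar schedule defined earlier in the notebook, so consider it a sketch rather than the notebook's exact code:

    # Rough usage sketch -- `model` is the trained noise-prediction UNet (its output
    # has a `.sample` attribute) and `abar` is the notebook's global alpha-bar tensor.
    sz = (16, 1, 32, 32)                                   # 16 single-channel 32x32 images
    preds = sample(ddim_step, model, sz, n_steps=1000, skips=10, eta=0.)
    final_images = preds[-1]                               # last element is the fully denoised batch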
Edit: I see now that eta comes from Eq. 16 of the DDIM paper and it’s meant to control the stochasticity of the process. Eta=1 gives DDPM and eta=0 is 100% deterministic, if I’m not mistaken.
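A quick way to convince yourself of that is to plug some made-up alpha-bar values into the same formula and watch what happens to sigma as eta changes (the numbers below are arbitrary, purely for illustration):

    import torch

    # Arbitrary alpha-bar values for two adjacent timesteps, just to show the effect of eta
    abar_t, abar_t1 = torch.tensor(0.5), torch.tensor(0.6)
    bbar_t, bbar_t1 = 1 - abar_t, 1 - abar_t1

    vari = (bbar_t1/bbar_t) * (1 - abar_t/abar_t1)
    for eta in (0., 0.5, 1.):
        sig = vari.sqrt() * eta
        print(eta, sig.item())   # eta=0 -> sigma=0 (deterministic DDIM); eta=1 -> full DDPM-style variance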
I found a bug in Johno’s WandB code. The ‘self.train’ line won’t work because that attribute doesn’t exist, and the WandB callback doesn’t have access to the ‘learner’ object. Should we pass ‘learner’ into the WandB callback?
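I don’t have Johno’s callback in front of me, but based on the TensorBoard callback above, I’d guess the fix is to take ‘learn’ as the method argument and read the phase off the model instead of a ‘self.train’ attribute. Something like this hypothetical sketch (the class name, metric key and wandb.log layout are my own guesses):

    import wandb

    class WandBCB(Callback):
        order = MetricsCB.order + 1
        def after_batch(self, learn):
            # Read train/valid from the model instead of a non-existent self.train
            split = 'train' if learn.model.training else 'valid'
            wandb.log({f'loss/{split}': learn.loss.item()})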
Can someone share the .pkl files?