Lesson 21 official topic

This is a wiki post - feel free to edit to add links from the lesson or other useful info.

<<< Lesson 20 | Lesson 22 >>>

Lesson resources


Jeremy, FYI: the DDIM notebook still has a betamax of 0.02, and transformi still maps to -1 to +1.

Great lesson, thanks!

Super cool to see how easy it was to use a callback to send stuff to WandB. I had written this quick and dirty CB to do something similar for TensorBoard, and so far it just works:

#|export
from torch.utils.tensorboard import SummaryWriter

class TensorboardCB(Callback):
    # Callback, MetricsCB and Learner come from the course's miniai library
    order = MetricsCB.order + 1
    def __init__(self, name=None): self.writer = SummaryWriter(comment=f'_{name}')

    def after_batch(self, learn: Learner):
        # Log the loss for every batch under a train/validation tag
        train = 'train' if learn.model.training else 'validation'
        idx = len(learn.dl)*learn.epoch + learn.iter
        self.writer.add_scalar(f'loss/{train}', learn.loss.item(), idx)
        self.writer.flush()

    def after_epoch(self, learn: Learner):
        if hasattr(learn, 'recorder'):
            # Log all other metrics after each epoch
            d = learn.recorder[-1]
            for k, v in d.items():
                if k in ('loss', 'epoch', 'train'): continue  # skip non-metric keys
                self.writer.add_scalar(f'{k}/{d["train"]}', v, d['epoch'])
            self.writer.flush()

    def after_fit(self, learn: Learner): self.writer.close()

And this is the result on the TB side:

It’s cool to be able to see the different runs and all, but after playing with TensorBoard a bit it’s become clear that the solutions offered by WandB and others are much more complete. One big omission is being able to associate a set of hyperparameters with each run, and to easily separate different projects; TensorBoard has had this issue open since ~2017. Nonetheless, getting the callback to work was a fun exercise.
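
For comparison, this is roughly how WandB ties hyperparameters to a run — just a minimal sketch, with a made-up project name and config values:

import wandb

# Each run gets its own config dict, and runs are grouped under a named project,
# so hyperparameters can be compared across runs in the UI.
cfg = dict(lr=1e-3, bs=256, epochs=5)            # hypothetical values
wandb.init(project='fashion-diffusion', config=cfg)
for step in range(10):
    wandb.log({'loss': 1.0/(step+1)})            # dummy metric, just for illustration
wandb.finish()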


Hey Jeremy, I have used the cosine scheduler but it is not showing any images. I think I’m doing something wrong — please help. I have seen most of the videos; they are so well explained. Thanks in advance.

In ddim_step(), I noticed that the eta float value passed in from the sample() function is factored into sigma, but I didn’t see any explanation for it. I was wondering if this is meant to become a scheduler at some point, but for now its purpose isn’t clear to me. If anyone has any insight, the code below is from notebook 20!

def ddim_step(x_t, t, noise, abar_t, abar_t1, bbar_t, bbar_t1, eta):
    vari = ((bbar_t1/bbar_t) * (1-abar_t/abar_t1))
    sig = vari.sqrt()*eta
    x_0_hat = ((x_t-bbar_t.sqrt()*noise) / abar_t.sqrt())
    x_t = abar_t1.sqrt()*x_0_hat + (bbar_t1-sig**2).sqrt()*noise
    if t>0: x_t += sig * torch.randn(x_t.shape).to(x_t)
    return x_t

@torch.no_grad()
def sample(f, model, sz, n_steps, skips=1, eta=1.):
    tsteps = list(reversed(range(0, n_steps, skips)))
    x_t = torch.randn(sz).to(model.device)
    preds = []
    for i,t in enumerate(progress_bar(tsteps)):
        abar_t1 = abar[tsteps[i+1]] if t > 0 else torch.tensor(1)
        noise = model(x_t,t).sample
        x_t = f(x_t, t, noise, abar[t], abar_t1, 1-abar[t], 1-abar_t1, eta)
        preds.append(x_t.float().cpu())
    return preds

Edit: I see now that eta comes from Eq. 16 of the DDIM paper and is meant to control the stochasticity of the process. eta=1 gives DDPM and eta=0 is 100% deterministic, if I’m not mistaken.
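
To make that concrete, here’s how the two extremes would be called with the sample() function above — the shape, n_steps and skips values are just placeholders:

# eta=0.: sig is zero, so no fresh noise is added each step -> deterministic DDIM sampling
preds_det  = sample(ddim_step, model, sz=(16,1,32,32), n_steps=1000, skips=10, eta=0.)

# eta=1.: the full variance (bbar_t1/bbar_t)*(1-abar_t/abar_t1) is used -> DDPM-style stochastic sampling
preds_ddpm = sample(ddim_step, model, sz=(16,1,32,32), n_steps=1000, skips=1, eta=1.)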

I found a bug in Johno’s wandb callback code. The ‘self.train’ line won’t work because that attribute doesn’t exist, and the callback doesn’t have access to the ‘learner’ object. Should we pass ‘learner’ into the wandb callback’s methods?
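
If it helps, here’s a minimal sketch of what passing the Learner through could look like. It follows the same convention as the TensorBoard callback above (each method receives learn); the project name and logged keys are my own assumptions, not Johno’s actual code:

import wandb

class WandBCB(Callback):
    order = MetricsCB.order + 1
    def __init__(self, project='miniai-experiments', config=None):
        self.project, self.config = project, config

    def before_fit(self, learn): wandb.init(project=self.project, config=self.config)

    def after_batch(self, learn):
        # Only log the training loss; wandb keeps its own step counter
        if learn.model.training: wandb.log({'train/loss': learn.loss.item()})

    def after_fit(self, learn): wandb.finish()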

Can someone share the .pkl files?