Help understanding databunch in custom callback

I am attempting to create a custom callback that modifies the data in my train_df and valid_df and constructs a new databunch at the end of every epoch. Currently it looks something like this:

class ModifyDB(LearnerCallback):
    """Rebuild the DataBunch from modified DataFrames at the end of each epoch."""
    def __init__(self, learn:Learner, df0t, df0v, tok, vocab):
        super().__init__(learn)
        #---training data df
        self.df0t = df0t
        #---validation data df
        self.df0v = df0v
        self.tok = tok
        self.vocab = vocab

    def modify_df(self, df):
        #---modify data in df
        return df

    def on_epoch_end(self, epoch:int, **kwargs):
        dfCt = self.modify_df(self.df0t)
        dfCv = self.modify_df(self.df0v)
        dataNew = TextLMDataBunch.from_df('.', dfCt, dfCv, bs=64, tokenizer=self.tok,
                                          text_cols='smiles', min_freq=1,
                                          include_bos=False, include_eos=False,
                                          vocab=self.vocab)
        self.learn.data = dataNew
        del dataNew
        print('we are now exiting!')

Now, from my runs this appears to be working. However, the FastAI docs section on writing custom callbacks says:

Note that this allows the callback user to just pass your callback name to callback_fns when constructing their Learner, since that always passes self when constructing callbacks from callback_fns. In addition, by passing the learner, this callback will have access to everything: e.g. all the inputs/outputs as they are calculated, the losses, and also the data loaders, the optimizer, etc. At any time:

  • Changing self.learn.data.train_dl or self.learn.data.valid_dl will change them inside the fit function (we just need to pass the DataBunch object to the fit function and not data.train_dl/data.valid_dl)
  • Changing self.learn.opt.opt (we have an OptimWrapper on top of the actual optimizer) will change it inside the fit function.
  • Changing self.learn.data or self.learn.opt directly WILL NOT change the data or the optimizer inside the fit function.

I cannot square my code, which appears to work, with the docs, which state that modifying the databunch this way should not work. Could someone clarify what the docs are trying to communicate here, and whether what I am doing is indeed valid?
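The distinction the docs are drawing looks like ordinary Python name binding: rebinding an attribute (e.g. assigning a new object to self.learn.data) does not affect any reference the fit function already captured, while mutating the object that reference points to does. A minimal plain-Python sketch, with no fastai involved (Holder is a stand-in for the learner, not anything from the library):

```python
class Holder:
    """Stand-in for a Learner holding a data attribute."""
    def __init__(self, items):
        self.items = items

h = Holder([1, 2, 3])
loop_input = h.items        # the "fit function" grabs its own reference up front
h.items = [10, 20, 30]      # rebinding the attribute: the captured reference is unaffected
seen = [x for x in loop_input]
print(seen)                 # [1, 2, 3] -- still the old list

h2 = Holder([1, 2, 3])
loop_input2 = h2.items
h2.items.append(4)          # mutating the same object in place: the reference sees it
print([x for x in loop_input2])   # [1, 2, 3, 4]
```

If the fit loop re-reads learn.data at the start of each epoch rather than caching it once, a reassignment in on_epoch_end would still take effect on the next epoch, which could explain why the code above appears to work; that depends on where exactly fastai reads the attribute.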

Did you get this to work? I am attempting to change the databunch in the on_batch_end callback function, but I am struggling to get it to work. I could try changing train_dl and valid_dl instead.

I am still fiddling as well. Mine appears to work, but I do not like that I do not understand how it is working, given the comments in the docs.

It is funny you ask that, because I was thinking about that exact problem earlier. I do not think that will work, because on_batch_end is called inside a for loop that looks like this:

for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):

So if you are changing the dataloader inside the for loop that is iterating over it, that may cause issues. I am not positive, though.
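That said, plain Python suggests rebinding the name is at least safe, if ineffective: a for loop captures an iterator over the object once, at the start, so reassigning the name mid-loop does not change what the loop is iterating over. (Mutating the same object in place while iterating is what tends to cause trouble.) A quick demonstration:

```python
data = [1, 2, 3]
out = []
for x in data:
    out.append(x)
    data = [99, 99]   # rebind mid-loop; the iterator still holds the old list
print(out)            # [1, 2, 3] -- the rebinding never affected the loop
```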

I see. Ultimately, I am just trying to apply different transforms to different batches. So I made a switch-transform callback and have tried a few approaches: setting learner.data, setting learner.train_dl, and setting learner.train_dl.tfms. But I am running into what looks like a GPU memory leak, since I eventually hit a GPU out-of-memory error about one epoch in. I may have to dig into the library to see what is going on.
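One pattern that avoids rebuilding dataloaders entirely (and thus avoids keeping references to old ones alive, one possible source of the memory growth) is to keep a single loader and swap only the transform it applies, since the transform can be looked up fresh on every batch. A hypothetical plain-Python sketch, not fastai API (SwitchingLoader and TRANSFORMS are made-up names for illustration):

```python
# Stand-ins for real augmentations; each batch here is just a number.
TRANSFORMS = [lambda x: x + 1, lambda x: x * 2]

class SwitchingLoader:
    """Wraps an underlying batch source and applies self.tfm per batch."""
    def __init__(self, batches):
        self.batches = batches
        self.tfm = TRANSFORMS[0]
    def __iter__(self):
        for batch in self.batches:
            yield self.tfm(batch)   # tfm is looked up fresh each batch

loader = SwitchingLoader([1, 2, 3])
out = []
for i, b in enumerate(loader):
    out.append(b)
    loader.tfm = TRANSFORMS[(i + 1) % 2]   # switch transform for the next batch
print(out)   # [2, 4, 4]: 1+1, then 2*2, then 3+1
```

Swapping a callable on a long-lived object sidesteps constructing new DataBunch/DataLoader objects every batch; whether fastai's tfms attribute behaves this way in its own loop would need checking against the library source.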