Best way to schedule a custom additional parameter during training?

MicPie · July 9, 2019, 5:11am

Hello everybody,

I would like to schedule a custom additional parameter during training. What would be the best approach in fastai v1 (current latest release)?

TrainingPhase and GeneralScheduler seem to be designed for scheduling the hyperparameters of the optimizer (“You can also add any hyper-parameter that is in your optimizer…”).
Can those classes also be used to schedule parameters that are not part of the optimizer?

One option would be to set up a wrapper around the optimizer that holds my parameter, so that I can access it from the “outside”.

I would be grateful for any tips and tricks on this topic!

muellerzr · July 9, 2019, 5:25am

Correct me if I'm wrong, but those are just callbacks. I believe you could implement a custom callback that adjusts your parameter at one of the 10 phases that the callback system makes available, no? Looking at the source code, it seems that way to me. Perhaps this gives some inspiration:

https://github.com/fastai/fastai/blob/master/fastai/callbacks/general_sched.py#L8


from ..core import *
from ..callback import *
from ..basic_train import Learner, LearnerCallback


__all__ = ['GeneralScheduler', 'TrainingPhase']


@dataclass
class TrainingPhase():
    "Schedule hyper-parameters for a phase of `length` iterations."
    length:int
    
    def __post_init__(self): self.scheds = dict()
    def schedule_hp(self, name, vals, anneal=None):
        "Adds a schedule for `name` between `vals` using `anneal`."
        self.scheds[name] = Scheduler(vals, self.length, anneal)
        return self


class GeneralScheduler(LearnerCallback):
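
To make the custom-callback suggestion concrete, here is a minimal framework-free sketch. It does not use fastai: `ParamHolder`, `ScheduleParamCallback` and `toy_training_loop` are hypothetical names invented for illustration, and the linear ramp is just one possible schedule. The only idea borrowed from the thread is that a callback hook such as `on_batch_end` can update an arbitrary attribute once per training batch.

```python
class ParamHolder:
    """Hypothetical object that owns the extra parameter (e.g. a loss weight)."""
    def __init__(self, x=0.0):
        self.x = x


class ScheduleParamCallback:
    """Hypothetical callback: moves holder.x linearly from start to end over n_iter training batches."""
    def __init__(self, holder, start, end, n_iter):
        self.holder, self.start, self.end, self.n_iter = holder, start, end, n_iter
        self.n = 0

    def on_train_begin(self):
        # Reset the step counter and set the initial value.
        self.n = 0
        self.holder.x = self.start

    def on_batch_end(self, train=True):
        # Only advance the schedule on training batches, and stop once it is finished.
        if not train or self.n >= self.n_iter:
            return
        self.n += 1
        pct = self.n / self.n_iter
        self.holder.x = self.start + pct * (self.end - self.start)


def toy_training_loop(callback, n_batches):
    """Stand-in for a real training loop; it only fires the callback hooks."""
    callback.on_train_begin()
    history = []
    for _ in range(n_batches):
        # The forward pass, backward pass and optimizer step would go here.
        callback.on_batch_end(train=True)
        history.append(callback.holder.x)
    return history


holder = ParamHolder()
history = toy_training_loop(ScheduleParamCallback(holder, 0.0, 1.0, n_iter=4), n_batches=4)
```

In fastai v1 the same pattern would presumably live in a `LearnerCallback` subclass, with the parameter stored wherever the model or loss function reads it.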

MicPie · July 14, 2019, 4:08pm

Based on the OneCycleScheduler this seems to do the trick:

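# Imports mirror those at the top of fastai/callbacks/general_sched.py
from fastai.core import *
from fastai.callback import *
from fastai.basic_train import Learner, LearnerCallback

class OneCycleXScheduler(LearnerCallback):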
    def __init__(self, learn:Learner, X_max:float=1.0, div_factor:float=25., pct_start:float=0.75,
                 final_div:float=None, tot_epochs:int=None, start_epoch:int=None):
        super().__init__(learn)
        self.X_max,self.div_factor,self.pct_start,self.final_div = X_max,div_factor,pct_start,final_div
        if self.final_div is None: self.final_div = div_factor*1e4
        if is_listy(self.X_max): self.X_max = np.array(self.X_max)
        self.start_epoch, self.tot_epochs = start_epoch, tot_epochs

    def steps(self, *steps_cfg:StartOptEnd):
        "Build anneal schedule for all of the parameters."
        return [Scheduler(step, n_iter, func=func)
                for (step,(n_iter,func)) in zip(steps_cfg, self.phases)]

    def on_train_begin(self, n_epochs:int, epoch:int, **kwargs:Any)->None:
        "Initialize our optimization params based on our annealing schedule."
        self.start_epoch = ifnone(self.start_epoch, epoch)
        self.tot_epochs = ifnone(self.tot_epochs, n_epochs)
        n = len(self.learn.data.train_dl) * self.tot_epochs
        a1 = int(n * self.pct_start)
        a2 = n-a1
        # Change the annealing functions here, e.g. annealing_cos or annealing_linear
        self.phases = ((a1, annealing_cos), (a2, annealing_no))
        low_X = self.X_max/self.div_factor
        self.X_scheds = self.steps((low_X, self.X_max), (self.X_max, self.X_max/self.final_div))
        self.opt = self.learn.opt
        self.opt.X = self.X_scheds[0].start
        self.idx_s = 0
        self.opt.Xs = []
    
    def jump_to_epoch(self, epoch:int)->None:
        for _ in range(len(self.learn.data.train_dl) * epoch):
            self.on_batch_end(True)

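    def on_batch_end(self, train, **kwargs:Any)->None:
        "Take one step forward on the annealing schedule for X."
        if train:
            # Note: the scheduled value is negated here, whereas on_train_begin sets the initial X without a sign flip.
            self.opt.X = -self.X_scheds[self.idx_s].step()
            self.opt.Xs.append(self.opt.X)  # keep a history of X for later inspection
            if self.X_scheds[self.idx_s].is_done:
                self.idx_s += 1  # move on to the next phase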

However, I am not sure if this is the best and most elegant way.
If somebody has suggestions, tips, or tricks, I am happy to hear them!
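
For anyone who wants to check the shape of the schedule without running a full training, below is a rough standalone sketch of the values the two phases above should produce. It is not fastai code: `cos_anneal`, `no_anneal` and `one_cycle_values` are hypothetical helpers, written on the assumption that `annealing_cos` interpolates from start to end along a half cosine and that `annealing_no` simply returns the start value. The sign flip applied in `on_batch_end` is left out.

```python
import math


def cos_anneal(start, end, pct):
    """Half-cosine interpolation from start (pct=0) to end (pct=1); assumed to match annealing_cos."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def no_anneal(start, end, pct):
    """No annealing: stay at start; assumed to match annealing_no."""
    return start


def one_cycle_values(x_max, n_total, div_factor=25.0, pct_start=0.75, final_div=None):
    """Per-batch values of X over n_total training batches (hypothetical helper)."""
    if final_div is None:
        final_div = div_factor * 1e4
    a1 = int(n_total * pct_start)  # length of the warm-up phase
    a2 = n_total - a1              # length of the second phase
    low = x_max / div_factor
    # Phase 1: cosine ramp from low up to x_max.
    values = [cos_anneal(low, x_max, (i + 1) / a1) for i in range(a1)]
    # Phase 2: with no annealing the end value x_max / final_div is never reached.
    values += [no_anneal(x_max, x_max / final_div, (i + 1) / a2) for i in range(a2)]
    return values


xs = one_cycle_values(1.0, n_total=8)
```

With `n_total=8` and the default `pct_start=0.75`, the first six values rise towards `X_max` and the last two stay there. Swapping `no_anneal` for `cos_anneal` in the second phase would give a decay towards `x_max / final_div` instead.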