AttributeError: 'Runner' object has no attribute 'in_train'

I’ve been trying to implement LRFinder. The class is as follows:

class LrFinder(Callback):
    def __init__(self, max_iter=100, min_lr=1e-6, max_lr=10):
        self.max_iter, self.min_lr, self.max_lr = max_iter, min_lr, max_lr
        self.best_loss = 1e9

    def begin_fit(self):
        if not self.in_train: return
        pos = self.n_iter/self.max_iter

        for pg in self.opt.param_groups:
            pg['lr'] = self.min_lr*(self.max_lr/self.min_lr)**pos

    def after_step(self):
        if self.n_iter>self.max_iter or self.loss > self.best_loss*10:
            self.learn.stop = True
        if self.loss<self.best_loss:
            self.best_loss = self.loss

But I'm getting the following error:

<ipython-input-223-82a289ee3567> in <module>
----> 1 run.fit(1, learn)

in fit(self, epochs, learn)
    276         try:
    277             for cb in self.cbs: cb.set_runner(self)
--> 278             if self('begin_fit'): return
    279             for epoch in range(epochs):
    280                 self.epoch = epoch

in __call__(self, cb_name)
    292         for cb in sorted(self.cbs, key=lambda x: x._order):
    293             f = getattr(cb, cb_name, None)
--> 294             if f and f(): return True
    295         return False
    296

in begin_fit(self)
      5
      6     def begin_fit(self):
----> 7         if not self.in_train: return
      8         pos = self.n_iter/self.max_iter
      9

in __getattr__(self, k)
    191     _order=0
    192     def set_runner(self, run): self.run=run
--> 193     def __getattr__(self, k): return getattr(self.run, k)
    194     @property
    195     def name(self):

AttributeError: 'Runner' object has no attribute 'in_train'

However, there's also the TrainEvalCallback class that uses self.in_train, and that callback works fine. But when I use mine, suddenly there's no in_train attribute on the runner.

Also, when I inspect run.__dict__ for the runner without the LR finder, the dict is as shown below:

{'recorder': <__main__.Recorder at 0x1b801e66b00>,
'avg_stats': <__main__.AvgStatsCallback at 0x1b801e66518>,
'param_scheduler': <__main__.ParamScheduler at 0x1b8227f8048>,
'stop': False,
'cbs': [<__main__.TrainEvalCallback at 0x1b801e66048>,
<__main__.Recorder at 0x1b801e66b00>,
<__main__.AvgStatsCallback at 0x1b801e66518>,
<__main__.ParamScheduler at 0x1b8227f8048>],
'epochs': 1,
'learn': None,
'n_epochs': 1.0009599999999808,
'n_iter': 782,
'epoch': 0,
'in_train': False,
'iters': 156.25,
'xb': tensor([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]),
'yb': tensor([1, 2, 6, 0, 7, 8, 9, 2, 9, 5, 1, 8, 3, 5, 6, 8]),
'pred': tensor([[-3.5048, 8.0620, 0.9298, 0.2337, -2.3837, -1.7824, -2.6937, 1.4477,
1.2645, -2.0103],
[ 1.9852, -2.2797, 11.7437, 4.5844, -2.2583, -5.1891, -0.2871, -0.4410,
-1.5061, -5.4108],
[ 0.9055, -4.1201, 2.4212, -1.2287, 2.8501, 0.3424, 7.4508, -4.0461,
-1.5346, -2.0296],
[11.2557, -4.1623, 0.2949, 2.0499, -1.7708, 2.3185, 0.6972, -3.1247,
-4.7555, -1.6844],
[-0.5527, -4.1131, 0.0825, -1.1409, -1.3471, 1.1266, -5.6899, 6.2622,
2.1727, 1.8806],
[-2.7720, -1.2627, -1.0998, -1.2684, 0.4737, -0.6432, -0.7357, -1.5681,
7.3359, 1.9288],
[-3.1572, -4.3473, -2.5200, -0.9043, 3.2811, -1.8449, -4.3206, 4.2262,
1.3320, 7.8560],
[-0.3890, -2.0571, 6.5390, 0.2603, 0.3068, -1.8164, 2.4445, -1.7025,
0.4080, -2.2203],
[-2.4558, -3.5878, -2.3895, -0.7695, 1.7798, -1.1087, -3.3625, 3.7129,
1.0940, 6.3096],
[ 0.9948, -1.9186, -1.5197, 0.5942, -0.5894, 6.5472, 2.0266, -3.7192,
-2.1883, -1.4393],
[-3.1779, 5.7058, 0.5464, -0.2014, -1.0948, -0.6751, -2.0792, 1.1878,
0.7494, -1.5161],
[-1.1228, 0.8674, -1.1342, 3.6747, -1.9139, -2.3643, -2.4323, -3.2239,
9.0920, 0.2663],
[-1.2049, -0.1188, -0.7276, 11.2167, -4.6203, 2.3341, -5.1394, -7.0019,
4.7963, 0.5037],
[-0.8517, -4.1391, -7.5747, 2.8764, -3.1194, 11.7051, -2.7131, -3.8682,
0.9395, 3.2248],
[ 1.2470, -1.7122, 1.3160, -0.0583, 1.4987, 1.8983, 5.8249, -3.6058,
-2.8807, -2.4426],
[ 1.3920, -3.2590, 0.8273, 0.4972, -4.5312, -0.2438, -2.6957, -0.4090,
7.2001, 1.0601]]),
'loss': tensor(0.0214)}

AND IN_TRAIN IS RIGHT THERE!!!

PLEASE HELP ME I’M LOSING MY MIND !!!

Looking at the method name of your callback, I assume you are using fastai2.
For fastai2, I think the attribute you are looking for is training, not in_train. (I briefly looked through the related docs and source code, and I can't find an in_train attribute.)

Below is a callback example from the source code. I believe it could be a useful reference.

It changes the training attribute to swap between train mode and eval mode. Note that if you want to change the value of the training attribute, you have to set it via self.learn.training = True (as opposed to self.training = True, which triggers a warning because it won't change the training attribute that is actually in use).

class TrainEvalCallback(Callback):
    "`Callback` that tracks the number of iterations done and properly sets training/eval mode"
    run_valid = False
    def begin_fit(self):
        "Set the iter and epoch counters to 0, put the model and the right device"
        self.learn.train_iter,self.learn.pct_train = 0,0.
        self.model.to(self.dls.device)

    def after_batch(self):
        "Update the iter counter (in training mode)"
        self.learn.pct_train += 1./(self.n_iter*self.n_epoch)
        self.learn.train_iter += 1

    def begin_train(self):
        "Set the model in training mode"
        self.learn.pct_train=self.epoch/self.n_epoch
        self.model.train()
        self.learn.training=True

    def begin_validate(self):
        "Set the model in validation mode"
        self.model.eval()
        self.learn.training=False
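To make that assignment rule concrete, here is a minimal sketch (MyCallback is a hypothetical callback of my own, assuming the fastai2 Callback base class shown above, not something from the library):

```python
class MyCallback(Callback):
    "Hypothetical callback just to illustrate where to assign the flag"
    def begin_train(self):
        self.learn.training = True   # correct: sets the flag on the Learner itself
        # self.training = True       # wrong: would only create an attribute on this
        #                            # callback, shadowing the Learner's flag
        #                            # (fastai2 warns about exactly this)
```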

Hi Alex

This is fastai/course-v3 on GitHub.

This is the command

model_summary(run, learn, data)

Here is the error

AttributeError                            Traceback (most recent call last)
in <module>()
----> 1 model_summary(run, learn, data)

5 frames
/content/gdrive/My Drive/Colab Notebooks/exp/nb_05b.py in __getattr__(self, k)
     10     _order=0
     11     def set_runner(self, run): self.run=run
---> 12     def __getattr__(self, k): return getattr(self.run, k)
     13
     14     @property

AttributeError: 'Runner' object has no attribute 'in_train'

Init signature: TrainEvalCallback(cb_name)
Source:
class TrainEvalCallback(Callback):
    def begin_fit(self):
        self.run.n_epochs=0.
        self.run.n_iter=0

    def after_batch(self):
        if not self.in_train: return
        self.run.n_epochs += 1./self.iters
        self.run.n_iter   += 1

    def begin_epoch(self):
        self.run.n_epochs=self.epoch
        self.model.train()
        self.run.in_train=True

    def begin_validate(self):
        self.model.eval()
        self.run.in_train=False

File: /content/gdrive/My Drive/Colab Notebooks/exp/nb_05b.py
Type: type

Try initializing the in_train attribute in your callback's __init__ method.
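A sketch of that idea (my own untested variant, not from the notebooks). One wrinkle: Callback.__getattr__ only fires for attributes missing on the callback itself, so setting self.in_train in __init__ would permanently shadow the runner's value; defaulting it on the runner instead avoids that:

```python
class LrFinder(Callback):
    def __init__(self, max_iter=100, min_lr=1e-6, max_lr=10):
        self.max_iter, self.min_lr, self.max_lr = max_iter, min_lr, max_lr
        self.best_loss = 1e9

    def begin_fit(self):
        # Give the runner a default so __getattr__ delegation can find it;
        # set_runner() has already been called by the time begin_fit fires.
        if not hasattr(self.run, 'in_train'): self.run.in_train = False
        if not self.in_train: return  # always False at begin_fit, so this just skips
        ...
```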

Hi Alex

This exists in 04_callbacks as the class CallbackHandler, where the value is set at the start of the epoch, but it seems to disappear thereafter.

Regards Conwyn

Hi

The problem appears to be that these are callbacks for the training loop, and for them to work properly all the events in the training loop need to fire in the right order. So, for the in_train variable to be initialized correctly, either the begin_epoch or begin_validate event needs to fire first, so that the corresponding method on TrainEvalCallback is called to initialize the variable (and this callback needs to be earlier in the order-sorted list than any callback that reads the variable). As the sketch below shows, begin_fit fires before either of those events, which is why a begin_fit handler that reads in_train hits the AttributeError.
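For reference, a simplified sketch of the event order in the course-v3 Runner.fit, paraphrased from the 05b notebook (treat the details as approximate):

```python
def fit(self, epochs, learn):
    self.epochs, self.learn = epochs, learn
    try:
        for cb in self.cbs: cb.set_runner(self)
        if self('begin_fit'): return                   # LrFinder.begin_fit fires here,
        for epoch in range(epochs):                    # before in_train exists
            self.epoch = epoch
            if not self('begin_epoch'):                # TrainEvalCallback sets
                self.all_batches(self.data.train_dl)   # in_train=True here
            with torch.no_grad():
                if not self('begin_validate'):         # ...and in_train=False here
                    self.all_batches(self.data.valid_dl)
            if self('after_epoch'): break
    finally:
        self('after_fit')
        self.learn = None
```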

However, in this particular example, these callbacks aren’t being used in the way they were originally intended (i.e., in the training loop), but are being invoked when the begin_batch event is explicitly fired in the get_batch method.

This is one of the problems with this style of callback programming where the different callbacks effectively have access to a shared memory: you need to make sure that things occur in the right order and that all of the assumptions are met (i.e., that certain events/handlers are invoked before other ones).
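Incidentally, the same ordering issue explains the original LrFinder error in this thread: begin_fit fires before TrainEvalCallback has initialized in_train. If I remember the course notebooks correctly, their LR finder hooks begin_batch instead, which only fires once in_train exists; a sketch of that change:

```python
class LrFinder(Callback):
    _order = 1  # sort after TrainEvalCallback (_order=0), which owns in_train
    def __init__(self, max_iter=100, min_lr=1e-6, max_lr=10):
        self.max_iter, self.min_lr, self.max_lr = max_iter, min_lr, max_lr
        self.best_loss = 1e9

    def begin_batch(self):
        # Runs after begin_epoch/begin_validate, so in_train is initialized.
        if not self.in_train: return
        pos = self.n_iter/self.max_iter
        for pg in self.opt.param_groups:
            pg['lr'] = self.min_lr*(self.max_lr/self.min_lr)**pos

    # after_step can stay exactly as in the original post
```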

A simple way around this for the model_summary() function is just to explicitly set the attribute on the runner:

def model_summary(run, learn, data, find_all=False):
    run.in_train = False   # added explicit initialisation
    xb,yb = get_batch(data.valid_dl, run)
    device = next(learn.model.parameters()).device  # model may not be on the GPU yet
    xb,yb = xb.to(device),yb.to(device)
    mods = find_modules(learn.model, is_lin_layer) if find_all else learn.model.children()
    f = lambda hook,mod,inp,out: print(f"{mod}\n{out.shape}\n")
    with Hooks(mods, f) as hooks: learn.model(xb)
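With that one extra line, model_summary(run, learn, data) should get past the __getattr__ delegation and print each module together with its output shape.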