Developer chat

I intend to modify learn.recorder.plot() to write to a file by adding a dest_file parameter. Any opinions/guidance/reasons not to? I’m currently having to train models on a PC where I cannot install Jupyter, and I need to see the output of lr_find().

There is already a return_fig argument you can use to get the figure, which you can then save to any file.
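For example (a minimal sketch, assuming a Learner named learn; the backend line matters on a machine without a display):

import matplotlib
matplotlib.use('Agg')  # headless backend, works without a display

learn.lr_find()
fig = learn.recorder.plot(return_fig=True)  # returns the matplotlib Figure
fig.savefig('lr_find.png')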

2 Likes

Apologies if I’m misunderstanding the code, but it looks like there is an error in CNNLearner.has_pool_type:

def has_pool_type(m):
    if is_pool_type(m): return True
    for l in m.children(): return has_pool_type(l)  # bug: returns the first child's result unconditionally
    return False

Looks like the for loop should only return if the recursive call is true. Currently it only checks the first child at each level.

Yes, there is a bug indeed. Any fix accepted :wink:
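For reference, a minimal sketch of a fix (recurse into every child, and only short-circuit when a pooling layer is actually found):

def has_pool_type(m):
    if is_pool_type(m): return True
    for l in m.children():
        if has_pool_type(l): return True  # otherwise keep checking the remaining children
    return False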

Hey! I’m dealing with radiographic images that have a bit depth higher than 8 bits. Would it make sense to make the divisor in the open_image function an argument, with a default of div_int:int=255?
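For illustration, a sketch of what the proposed change could look like (div_int is the suggested name, not current fastai API; modeled on v1’s open_image, which hard-codes a division by 255):

import PIL, numpy as np
from fastai.vision import Image, pil2tensor

def open_image(fn, div:bool=True, div_int:int=255, convert_mode:str='RGB', cls=Image):
    "Return `Image` created from file `fn`, with pixel values divided by `div_int`."
    x = PIL.Image.open(fn).convert(convert_mode)
    x = pil2tensor(x, np.float32)
    if div: x.div_(div_int)  # e.g. div_int=65535 for 16-bit radiographs
    return cls(x)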

Hi, everyone! Is there any sample code on how to prepare data for a bidirectional AWD_LSTM? I’m encountering this error but don’t know how to change data_lm so that it works properly.

RuntimeError: Expected hidden[0] size (2, 32, 575), got (1, 32, 575)

Related question was asked here: Correct way to use bidirectional awd lstm language model learner?

Hi Sylvain,

I have submitted a PR (my first!) with a fix for this, and added a test for it. The test creates a dummy model with a Resnet34 architecture, and the bug-fixed has_pool_type should return True for it (it returns False with the bug that @TomB reported).

Hopefully the PR has followed the contribution guidelines. Thanks.

Yijin

I ran into this again. Should we add this to the Transformer config, so that alpha=0. is set by default?

It’s not in the config for now.

I have a possible implementation: I added another argument to recorder.plot, return_lr, which returns the lr found by the suggestion when it is True. Would this be valuable in the library?

I wanted to discuss here first before actually making a PR.
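To make the intended usage concrete (hypothetical, since return_lr is not in the library; today the suggested value is exposed as learn.recorder.min_grad_lr after plotting with suggestion=True):

learn.lr_find()
suggested_lr = learn.recorder.plot(suggestion=True, return_lr=True)  # proposed: return the suggested lr
learn.fit_one_cycle(1, suggested_lr)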

I have been thinking about making this change in the library but delayed it, thinking it was a minor thing. I have had to use this tweak in my personal code multiple times now though, so I thought it was finally time to discuss it here :slight_smile:

Problem

TrackerCallback (and all of its subclasses) doesn’t account for those of us doing multi-stage training, and with the fastai philosophy we almost always do multi-stage training via transfer learning.

Whenever we run the following steps, the callbacks in learn.callback_fns are re-instantiated, and hence their best value is reset to inf or -inf, depending on the mode.

learn.fit()
learn.freeze_to(-2); learn.fit()
learn.unfreeze(); learn.fit()

We don’t want this in multi-stage training, because if the best value is reset, a previous best model can be overwritten by an inferior one. Many would argue:

“Oh, but most of the time your later stage training will eventually be better than the best model of a preceding stage.”

Yes, this is probably true in 99% of cases, where you pick a good learning rate and don’t mess up anything in the later stages, or your model actually benefits from training even the earliest layers.

But imagine running automated scripts with auto lr_find across hundreds of experiments. In that case you can’t guarantee that sanity checks happen. Yes, reading the logs is a great way to see what happened, but why lose your best model, even if it was only a bit better? It just doesn’t feel right to let this go, honestly :slight_smile:

Fix

Fix is super simple.

  1. Add best_init to TrackerCallback:
class TrackerCallback(LearnerCallback):
    "A `LearnerCallback` that keeps track of the best value in `monitor`."
    def __init__(self, learn:Learner, monitor:str='valid_loss', mode:str='auto', best_init:float=None):
        super().__init__(learn)
        self.monitor,self.mode = monitor,mode
        if self.mode not in ['auto', 'min', 'max']:
            warn(f'{self.__class__} mode {self.mode} is invalid, falling back to "auto" mode.')
            self.mode = 'auto'
        mode_dict = {'min': np.less, 'max':np.greater}
        mode_dict['auto'] = np.less if 'loss' in self.monitor else np.greater
        self.operator = mode_dict[self.mode]
        self.best_init = best_init

    def on_train_begin(self, **kwargs:Any)->None:
        "Initializes the best value."
        if self.best_init is None:  # explicit None check so a best_init of 0. isn't treated as unset
            self.best = float('inf') if self.operator == np.less else -float('inf')
        else: self.best = self.best_init

  2. New workflow
# Re-initialize the TrackerCallbacks with an intermediate best value
from functools import partial
import fastai.callbacks.tracker

def tracker_init(learn, best_init):
    callback_fns = []
    for cb_fn in learn.callback_fns:
        # only patch partials whose underlying class subclasses TrackerCallback
        if isinstance(cb_fn, partial) and cb_fn.func.__base__ == fastai.callbacks.tracker.TrackerCallback:
            cb_fn.keywords["best_init"] = best_init
            cb_fn = partial(cb_fn.func, **cb_fn.keywords)
        callback_fns.append(cb_fn)
    learn.callback_fns = callback_fns
    return learn  # return the learner so it can be reassigned below

learn.fit()
learn = tracker_init(learn, learn.save_model_callback.best)
learn.freeze_to(-2); learn.fit()
learn = tracker_init(learn, learn.save_model_callback.best)
learn.unfreeze(); learn.fit()

We should also add tracker_kwargs to each callback that inherits from TrackerCallback, to allow initialization with the new best_init argument:

class EarlyStoppingCallback(TrackerCallback):
    "A `TrackerCallback` that terminates training when monitored quantity stops improving."
    def __init__(self, learn:Learner, monitor:str='valid_loss', mode:str='auto', min_delta:int=0, patience:int=0, **tracker_kwargs):
        super().__init__(learn, monitor=monitor, mode=mode, **tracker_kwargs)
        self.min_delta,self.patience = min_delta,patience
        if self.operator == np.less:  self.min_delta *= -1

Let me know what you think about this. Would it be really helpful? Am I missing anything? Let’s discuss!

Thanks

I’d put any PR on callbacks on hold, since the mechanics are going to change a lot soon when we start implementing the changes for v1.1.
Also note that if you instantiate your callback outside of the learner and pass it in callbacks, it’s not going to be reset at each new training.
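For example (a sketch, assuming a Learner named learn; SaveModelCallback is one of the TrackerCallback subclasses):

from fastai.callbacks import SaveModelCallback

save_cb = SaveModelCallback(learn, monitor='valid_loss', name='best')  # created once, outside callback_fns
learn.fit(5, callbacks=[save_cb])
learn.freeze_to(-2); learn.fit(5, callbacks=[save_cb])  # same instance reused, not re-created per stage
learn.unfreeze(); learn.fit(5, callbacks=[save_cb])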

1 Like

Makes a lot of sense, thanks :slight_smile:

Would it be OK to explore interpretations for different types of tasks, in addition to ClassificationInterpretation, prior to the v1.1 release? Maybe one parent class, Interpretation, for the different tasks, with implementations for MultilabelClassification, Segmentation, etc.?

I think it would be really nice to have good post-training tools to analyze and improve models further.

1 Like

That seems worth investigating, yes. Actually, it seems like a very good idea :slight_smile:

If you make them all inherit from an Interpretation object that gathers the predictions, losses and targets (with a from_learner class method like we have right now), then the tweaks you would have to do to keep it working with 1.1 would be minimal (if any).
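Roughly something like this (a sketch; names mirror the existing ClassificationInterpretation rather than any committed API):

from fastai.basic_data import DatasetType
from fastai.core import ifnone

class Interpretation():
    "Base class that gathers the predictions, losses and targets of a `Learner`."
    def __init__(self, learn, preds, y_true, losses, ds_type=DatasetType.Valid):
        self.learn,self.preds,self.y_true,self.losses,self.ds_type = learn,preds,y_true,losses,ds_type

    @classmethod
    def from_learner(cls, learn, ds_type=DatasetType.Valid):
        "Create an interpretation object from `learn` with preds, targets and losses."
        preds, y_true, losses = learn.get_preds(ds_type=ds_type, with_loss=True)
        return cls(learn, preds, y_true, losses, ds_type=ds_type)

    def top_losses(self, k=None, largest=True):
        "`k` largest (or smallest) losses, with their indexes."
        return self.losses.topk(ifnone(k, len(self.losses)), largest=largest)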

1 Like

Hi everyone,

Not sure if this is the right topic to ask in, but I’d like to know whether it is OK to use the code from ImageCleaner to create a widget, and how to give credit to the devs.

The widget will help people label images interactively inside a notebook. It is going to be open source and available on GitHub.

Also, I have never contributed to a project before, but I’d happily integrate the widget into the fastai library if you’d like.

Cheers!
Mario

When will 1.1 be released? Are we talking days, weeks, or months?

A few weeks normally.

So, I followed your guidance.

  • Created a parent Interpretation class with from_learner and top_losses methods so far. I might add more task-agnostic methods, e.g. sorting by a given metric, etc.

  • Made current ClassificationInterpretation inherit from it.

  • Created skeletons with NotImplementedError for MultiLabelClassificationInterpretation, ObjectDetectionInterpretation, SegmentationInterpretation.

  • Put everything into a module, interpret.py.

  • Currently working on SegmentationInterpretation (see the sketch below):
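A rough illustration of the skeleton shape (tentative; it mirrors the Interpretation base class above):

class SegmentationInterpretation(Interpretation):
    "Interpretation methods for segmentation models."
    def top_losses(self, sizes, k=None, largest=True):
        # per-pixel losses still need to be mapped back to image masks
        raise NotImplementedError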

Should I create a PR with what I’ve done so far, so that anyone can contribute and the core devs can review it for feedback? Or should I keep working on it?

Thanks

Yes please, push what you have for now. It’s easier to merge incremental PRs.

1 Like