Callback discussion from lesson 9

Design patterns help me come up with good, maintainable code. I think a general understanding of the available patterns is useful to avoid rebuilding the same thing or solving the problem in a kludgy way. I think the point about automated tools and having good tests is also important, and orthogonal to my original point.

This video is very helpful in understanding Fastai Callbacks:

5 Likes

I have just never found design patterns to be something to worry about until after I get to the point that I want to refactor. If I start with a design pattern, I end up spending too much time on the original design, then end up having to refactor the code anyway because I did not have a great understanding of the problem to begin with. Eventually, as I realize what was wrong with my prototypes, I end up organically arriving at the correct design pattern, instead of applying the wrong one for the problem from the start.

I feel automated tooling forces you to use good design patterns; I find it fairly difficult to write bad code when it is in place. It lets you organically design your code around good patterns, which is why I believe the two are not orthogonal.

I think you can go too far with over-designing your code, especially when new to software development. The first company I worked for out of college basically wrote every single line of code to a design pattern. They did not have a good idea of what the application was supposed to do, so they ended up haphazardly applying design patterns to problems they did not yet have a good grasp of. It was the most amazingly abstracted application that literally did absolutely nothing, except abstract and over-complicate the underlying functionality. I feel design patterns really open the way to over-design, and really we should be teaching people to apply automated tools and refactor their code, instead of thinking about design patterns from the beginning, that is all.

I prefer to use design patterns as a reference for ideas on how to refactor code that needs it, instead of as a way to design it.

4 Likes

Totally agree :slight_smile: I listened to a webinar explaining that in Python some design patterns make no sense because of its dynamic nature and duck typing. It also depends on the programming language's own preferences: there are different approaches to OOP and functional programming. Personally, I wouldn't worry too much about design patterns; it's easier to write the code, run it, and refactor it if needed, than to overkill yourself :slight_smile:

I think while there is certainly a case of analysis paralysis with anything, design patterns have nothing to do with it. Neither the book nor the documentation talks about over-designing things. As with anything, the problem should be properly understood and the solution applied pragmatically. Patterns help someone know the usual technique for solving an issue; callbacks, as Jeremy mentioned in the course, are one example. The Gang of Four mention a few other patterns and the ways they can be applied; forcing something to conform unnaturally isn't what design patterns are for. Anyway, I was just referring to using design patterns, in my own experience, as a reference, and I have built good maintainable code based on that.

I spent a fair bit of time working through the callbacks material today, and decided to write a blog post about it. The post explains how callbacks work and how to implement them in fast.ai.

This is my first deep learning blog post, so if anyone reads it, I'd love to hear your feedback!

Here's a link to the post

15 Likes

It's like asking someone else to do a job for you, but also to keep you updated on the progress as they're working, giving you a chance to respond to what they are doing, if necessary.
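As a minimal sketch of that idea in plain Python (the function and callback names here are made up for illustration):

```python
# A worker function accepts an optional callback and reports progress,
# giving the caller a chance to respond and even stop the job early.
def do_work(n_steps, on_step=None):
    done = []
    for i in range(n_steps):
        done.append(i)                 # the actual "job"
        if on_step and on_step(i):     # report progress; a truthy return stops
            break
    return done

# The caller supplies the callback and decides when to intervene.
stop_after_two = lambda i: i >= 2
print(do_work(5, stop_after_two))  # → [0, 1, 2]
```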

Hi! I wrote a blog post about the __call__ method in the new Runner class and the connection between the general Callback class and Runner class.

I hope I got everything right and that the article is helpful if you don't know how these lines work:

class Runner():
    def __init__(self, cbs=None, cb_funcs=None):
        cbs = listify(cbs)
        for cbf in listify(cb_funcs):
            cb = cbf()
            setattr(self, cb.name, cb)
            cbs.append(cb)
        self.stop,self.cbs = False,[TrainEvalCallback()]+cbs 
    # ...
    def __call__(self, cb_name):
        for cb in sorted(self.cbs, key=lambda x: x._order):
            f = getattr(cb, cb_name, None)
            if f and f(): return True
        return False
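To make the dispatch concrete, here is a stripped-down, self-contained sketch (not the real fastai classes) of what `__call__` does: sort by `_order`, look up the stage method with `getattr` (which returns `None` for callbacks that don't implement it), and stop early on a truthy return:

```python
class Callback():
    _order = 0                           # default; subclasses may override

class TrainEvalCallback(Callback):
    def begin_fit(self): self.fired = True   # assignment returns None → falsy

class LateCallback(Callback):
    _order = 10
    def begin_fit(self): self.fired = True

class MiniRunner():
    def __init__(self, cbs): self.cbs = cbs
    def __call__(self, cb_name):
        for cb in sorted(self.cbs, key=lambda x: x._order):
            f = getattr(cb, cb_name, None)   # None if cb lacks this stage
            if f and f(): return True        # truthy return short-circuits
        return False

late, te = LateCallback(), TrainEvalCallback()
run = MiniRunner([late, te])
run('begin_fit')             # te runs before late despite being added second
print(run('after_fit'))      # → False (no callback defines after_fit)
```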

(I will only share this blog post here because it contains unreleased things specific to lesson 2)

2 Likes

Thanks, that's very thoughtful. BTW, you should skip ahead to 09b, since I realized Runner shouldn't exist at all, so in the next week or two I'll be showing how to merge Runner and Learner, which is what's in that notebook. So you might prefer to write about that one instead. I simply moved the state from Learner to Runner, then renamed Runner -> Learner.

When the MOOC comes out, you might want to publish this more widely, since at that point many more people will be able to appreciate it! :slight_smile:

If you're never going to be working with more than one model at a time (and I imagine most people won't be), then combining them definitely makes sense. (09b is very clear and looks much more concise than having the two separately.) But I had reasoned to myself that they were kept separate so that one could create collections of callbacks and then reuse them on different learners; is that not the case?

And I'm wondering if this means that the CallbackHandler pattern that's in the latest version of the fastai library will probably eventually be replaced with the new merged LearnerRunner thingy? I'm just trying to figure out what the end state will be (if you know; it's certainly possible that you don't). I had a fair bit of trouble in class last week distinguishing in real time between

  • this is how we currently do things (it appears now that CallbackHandler falls into this category)
  • this is how we could do things (I guess the Runner pattern goes here now)
  • this is how we will do things in the future (LearnerRunner?)

Yes - but I've found that I've always been using cb_funcs for this.

I never know what the end state will be until we get there. Sylvain and I both like this new callback approach better than what we had, so it's likely something like it will end up back in fastai.

2 Likes

Thanks for making the effort and writing! I was also spending some time on this and your point about why we need a generic callback class in the appendix filled another missing piece in my puzzle :slight_smile:

Just my ten cents concerning callbacks.

Implementing Jeremy's notebooks in Swift, I have experimented with the thought "could we make everything in the training loop a callback?" and where would this take us?
Here is an outline of the concepts:

  • an enum with all the stages in the training: begin_fit, begin_epoch, begin_batch, after_pred, ...
  • a Publisher that sends Events to Subscriptions of each training stage in prioritized order
  • a Learner that loops through epochs (i.e. the fit call) and minibatches, sending out events for each training stage using a Publisher. Any subscriber can call stop on the learner
  • a Callback protocol to make it easier to write a struct/class that receives events

With this structure, the calls to forward, backward, etc. on a model just become callbacks, like a Recorder callback, but with priorities that ensure they are called as the last or first callback in each stage.

Here is a template for a trainable model and a Recorder callback (please mind my Swift). They both implement an extension "Callback" that simplifies subscribing to many training stages. However, they do not have to; e.g. you could extend an existing module with just one new function to receive information about a training stage.

struct TrainableModel<T>: Callback where T:BinaryFloatingPoint & TensorFlowScalar {
  let model:Model<T>
  init(model:Model<T>){
    self.model=model
  }
  
  var priorities: [Stage : Int]  = [
        Stage.begin_batch:   Int.min,   //do forward when everybody have been notified
        Stage.after_pred:    Int.min,   //calculate loss when everybody have been notified
        Stage.after_loss:    Int.min,   //do backward
        Stage.after_backward:Int.min,   //step the optimizer in preparation for the next batch. why do we do this here ?
        Stage.after_step:    Int.min,   //remove gradients for the minibatch
      ]
  
  public func begin_batch(    event:Event<T> ){ print("TrainableModel.begin_batch") }
  public func after_pred(     event:Event<T> ){ print("TrainableModel.after_pred")  }
  public func after_loss(     event:Event<T> ){ print("TrainableModel.after_loss")  }
  public func after_backward( event:Event<T> ){ print("TrainableModel.after_backward") }
  public func after_step(     event:Event<T> ){ print("TrainableModel.after_step")  }
  public func after_batch(    event:Event<T> ){ print("TrainableModel.after_batch") }
}

struct Recorder<T> :Callback where T:BinaryFloatingPoint & TensorFlowScalar {
  var  priorities: [Stage : Int] = [Stage.after_loss: -1]
  public func after_loss( event:Event<T> ){ print("Recorder.recorder after_loss")  }
}
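The same publish/subscribe idea can be sketched in Python (the Publisher class and its method names below are mine, not from the notebooks or the Swift code above):

```python
# Each training stage is just an event name; subscribers register a handler
# with a priority, and the publisher fires them lowest-priority-first.
class Publisher():
    def __init__(self): self.subs = []            # (priority, stage, fn)
    def subscribe(self, stage, fn, priority=0):
        self.subs.append((priority, stage, fn))
    def publish(self, stage):
        handlers = sorted((s for s in self.subs if s[1] == stage),
                          key=lambda s: s[0])
        return [fn() for _, _, fn in handlers]

pub = Publisher()
# The model's backward pass is just another subscriber, with a priority
# that guarantees it runs last in the stage:
pub.subscribe("after_loss", lambda: "recorder", priority=-1)
pub.subscribe("after_loss", lambda: "model.backward", priority=10_000)
print(pub.publish("after_loss"))  # → ['recorder', 'model.backward']
```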

What did I learn:

  • Removing all notifications from the trainable model creates clarity, and I love that I can replace the training and validation components easily.
  • There are separate publish/subscribe frameworks that could extend the approach to remote subscriptions and asynchronous events. I could see how this could be used to create trainable models that compete or collaborate in a network, remote dashboards, etc.
  • I learn more when I create alternative implementations of the in-class notebooks. Also, it is a big advantage to have Jeremy's notebooks/videos when experimenting with alternative approaches in Swift, especially when I get lost in generics :slight_smile: I like Swift: solid code completion. Generics are the most difficult to get used to, but they are a lot easier to work with than C++ templates.

2 Likes

This makes callbacks seem easy, which is great!

Thanks for writing this great post. Can someone please explain why, for TrainEvalCallback() (the first cb in cbs), "if f and f()", and thus dunder call() and "if self('begin_fit')", returns False? Is it because f() returns None, causing dunder call() to return False? Thanks in advance!

def fit(self, epochs, learn):
    self.epochs,self.learn = epochs,learn

    try:
        for cb in self.cbs: cb.set_runner(self)
        if self('begin_fit'): return
        for epoch in range(epochs):
            self.epoch = epoch
            if not self('begin_epoch'): self.all_batches(self.data.train_dl)

            with torch.no_grad(): 
                if not self('begin_validate'): self.all_batches(self.data.valid_dl)
            if self('after_epoch'): break
        
    finally:
        self('after_fit')
        self.learn = None

def __call__(self, cb_name):
    for cb in sorted(self.cbs, key=lambda x: x._order):
        f = getattr(cb, cb_name, None)
        if f and f(): return True
    return False
1 Like

This is explained well in the blog post above, titled Understanding callbacks in fastai.

2 Likes

For those struggling with the concept at a foundational level (as I was), I have written a blog post breaking down the callback at a very basic level.

I refer to the stages where you can execute a callback (begin_fit, begin_epoch etc) as timestamps. It was the best term I could think of to explain what it is in simple terms. Is there an official name for these though?

5 Likes

I have often heard the term "event", as callbacks are often used in GUIs and correspond to a physical event like "this button was clicked".

1 Like

"The line seems to be added to be explicit here and is not strictly necessary, since a function that does not return anything defaults to returning None, and None is falsy in Python." Got it, thanks.
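A quick check of that behaviour in plain Python:

```python
def begin_fit():
    pass                          # no explicit return, so this returns None

f = begin_fit
assert f() is None                # the call itself yields None
assert bool(f and f()) is False   # so `if f and f():` does not fire
print(bool(f and f()))            # → False
```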

When sorting by _order, what happens when no _order is defined? TrainEvalCallback doesn't have an _order, but it seems to be called first.
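If I remember the notebook right, the base Callback class defines _order = 0, which TrainEvalCallback simply inherits; and since Python's sorted() is stable, callbacks with equal _order keep their list order, and Runner.__init__ puts TrainEvalCallback at the front of self.cbs. A small sketch of that (class names apart from the fastai ones are made up):

```python
class Callback():
    _order = 0                              # default inherited by every subclass

class TrainEvalCallback(Callback): pass     # defines no _order of its own
class Recorder(Callback): pass

cbs = [TrainEvalCallback()] + [Recorder()]  # TrainEval prepended, as in Runner
# sorted() is stable: items with equal keys preserve their original order
ordered = sorted(cbs, key=lambda x: x._order)
print([type(cb).__name__ for cb in ordered])  # → ['TrainEvalCallback', 'Recorder']
```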