Design patterns help me write good, maintainable code. I think a general understanding of the available patterns is useful for avoiding rebuilding the same thing, or solving a problem in a kludgy way. I think the point about automated tools and good tests is also important, and orthogonal to my original point.
This video is very helpful in understanding Fastai Callbacks:
I have just never found design patterns to be something to worry about until I get to the point where I want to refactor. If I start with a design pattern, I end up spending too much time on up-front design, and then have to refactor the code anyway because I didn't have a great understanding of the problem in the first place. Eventually, as I realize what was wrong with my prototypes, I end up organically arriving at the correct design pattern, instead of having applied the wrong one for the problem to begin with.
I feel automated tooling forces you to use good design patterns, and I find it fairly difficult to write bad code when it is in place. It lets you organically design your code around good patterns, which is why I believe the two are not orthogonal.
I think you can go too far with over-designing your code, especially when you are new to software development. The first company I worked for out of college wrote basically every single line of code to a design pattern. They did not have a good idea of what the application was supposed to do, so they ended up haphazardly applying design patterns to problems they did not yet have a good grasp on. It was the most amazingly abstracted application, and it did absolutely nothing except abstract and over-complicate the underlying functionality. I feel design patterns really open the way to over-design, and we should be teaching people to apply automated tools and refactor their code, instead of thinking about design patterns from the beginning, that is all.
I prefer to use design patterns as a reference for ideas on how to refactor code that needs it, instead of as a way to design it.
Totally agree. I listened to a webinar explaining that in Python some design patterns make no sense because of the language's dynamic nature and duck typing. It also depends on the programming language's own idioms: there are different approaches to OOP and functional programming. Personally I wouldn't worry too much about design patterns; it's easier to write it, run it, and refactor if needed, than to overkill yourself up front.
There is certainly a risk of analysis paralysis with anything, but design patterns have nothing to do with it. Neither the book nor the documentation encourages over-designing things. As with anything, the problem should be properly understood and the solution applied pragmatically. Patterns help someone learn the usual technique for solving a given kind of issue; callbacks, as mentioned by Jeremy in the course, are one example. The Gang of Four describe several other patterns and how they can be applied; forcing something to conform unnaturally isn't what design patterns are for. Anyway, I was just saying that in my own experience I have used design patterns as a reference and built good, maintainable code based on that.
I spent a fair bit of time working through the callbacks material today, and decided to write a blog post about it. The post explains how callbacks work and how to implement them in fast.ai.
This is my first deep learning blog post, so if anyone reads it, I'd love to hear your feedback!
It's like asking someone else to do a job for you, but also to keep you updated on the progress as they're working, giving you a chance to respond to what they are doing, if necessary.
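That "keep me updated" idea can be sketched in a few lines of Python (the names here are made up purely for illustration, not fastai's API):

```python
# A minimal callback sketch: the worker reports progress, and the
# caller-supplied function gets a chance to react at each step.
def do_job(n_steps, on_step=None):
    for step in range(n_steps):
        # ... do some work for this step ...
        if on_step and on_step(step):  # the callback may ask us to stop
            return "stopped early"
    return "finished"

def report(step):
    print(f"working on step {step}")
    return step >= 2  # ask to stop after step 2

result = do_job(5, on_step=report)
print(result)  # stopped early
```

The worker never needs to know what the callback does; it just offers the hook.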
Hi! I wrote a blog post about the __call__ method in the new Runner class and the connection between the general Callback class and the Runner class. I hope I got everything right and that the article is helpful if you don't know how these lines work:
class Runner():
    def __init__(self, cbs=None, cb_funcs=None):
        cbs = listify(cbs)
        for cbf in listify(cb_funcs):
            cb = cbf()
            setattr(self, cb.name, cb)
            cbs.append(cb)
        self.stop,self.cbs = False,[TrainEvalCallback()]+cbs
    # ...
    def __call__(self, cb_name):
        for cb in sorted(self.cbs, key=lambda x: x._order):
            f = getattr(cb, cb_name, None)
            if f and f(): return True
        return False
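To make the dispatch in __call__ concrete, here is a stripped-down, stand-alone sketch with dummy callbacks (illustrative only, not fastai's actual classes):

```python
# Stand-alone sketch of the dispatch logic in Runner.__call__.
class PrintCallback():
    _order = 0
    def begin_fit(self):
        print("PrintCallback.begin_fit")  # no return -> None -> falsy

class StopCallback():
    _order = 1
    def begin_fit(self):
        return True  # a truthy return signals "stop"

def call(cbs, cb_name):
    for cb in sorted(cbs, key=lambda x: x._order):
        f = getattr(cb, cb_name, None)  # None if this cb lacks the method
        if f and f(): return True
    return False

print(call([PrintCallback()], 'begin_fit'))                  # False
print(call([PrintCallback(), StopCallback()], 'begin_fit'))  # True
```

So the return value of __call__ only becomes True if some callback explicitly returns something truthy.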
(I will only share this blog post here because it contains unreleased things specific to lesson 2)
Thanks, that's very thoughtful. BTW you should skip ahead to 09b, since I realized Runner shouldn't exist at all, so in the next week or two I'll be showing how to merge Runner and Learner, which is what's in that notebook. So you might prefer to write about that one instead. I simply moved the state from Learner to Runner, then renamed Runner->Learner.
When the MOOC comes out, you might want to publish this more widely - since at that point many more people will be able to appreciate it!
If you're never going to be working with more than one model at a time (and I imagine most people won't be), then combining them definitely makes sense. (09b is very clear and looks much more concise than having the two separately.) But I had reasoned to myself that they were kept separate so that one could create collections of callbacks and then reuse them on different learners; is that not the case?
And I'm wondering if this means that the CallbackHandler pattern in the latest version of the fastai library will probably eventually be replaced with the new merged LearnerRunner thingy? I'm just trying to figure out what the end state will be (if you know; it's certainly possible that you don't). I had a fair bit of trouble in class last week distinguishing in real time between:
- this is how we currently do things (it appears now that CallbackHandler falls into this category)
- this is how we could do things (I guess the Runner pattern goes here now)
- this is how we will do things in the future (LearnerRunner?)
Yes - but I've found that I've always been using cb_funcs for this.
I never know what the end state will be until we get there. Sylvain and I both like this new callback approach better than what we had, so it's likely something like it will end up back in fastai.
Thanks for making the effort and writing! I was also spending some time on this, and your point in the appendix about why we need a generic callback class filled in another missing piece of my puzzle.
Just my two cents concerning callbacks.
Implementing Jeremy's notebooks in Swift, I have experimented with the thought "could we make everything in the training loop a callback?", and where would this take us?
Here is an outline of the concepts:
- an enum with all the stages in the training: begin_fit, begin_epoch, begin_batch, after_pred, ...
- a Publisher that sends Events to Subscriptions for each training stage, in prioritized order
- a Learner that loops through epochs (i.e. the fit call) and minibatches, using a Publisher to send out events for each training stage. Any subscriber can call stop on the learner
- a Callback protocol to make it easier to write a struct/class that receives events
With this structure, the calls to forward, backward, etc. on a model just become callbacks, like a Recorder callback, but with priorities that ensure they are called as the last or first callback in each stage.
Here is a template for a trainable model and a Recorder callback (please mind my Swift). They both implement an extension of Callback that simplifies subscribing to many training stages, although they do not have to. E.g. you could extend an existing module with just one new function to receive information about a single training stage.
struct TrainableModel<T>: Callback where T: BinaryFloatingPoint & TensorFlowScalar {
    let model: Model<T>
    init(model: Model<T>) {
        self.model = model
    }
    var priorities: [Stage: Int] = [
        Stage.begin_batch: Int.min,    // do forward once everybody has been notified
        Stage.after_pred: Int.min,     // calculate loss once everybody has been notified
        Stage.after_loss: Int.min,     // do backward
        Stage.after_backward: Int.min, // step the optimizer in preparation for the next batch. why do we do this here?
        Stage.after_step: Int.min,     // remove gradients for the minibatch
    ]
    public func begin_batch(event: Event<T>) { print("TrainableModel.begin_batch") }
    public func after_pred(event: Event<T>) { print("TrainableModel.after_pred") }
    public func after_loss(event: Event<T>) { print("TrainableModel.after_loss") }
    public func after_backward(event: Event<T>) { print("TrainableModel.after_backward") }
    public func after_step(event: Event<T>) { print("TrainableModel.after_step") }
    public func after_batch(event: Event<T>) { print("TrainableModel.after_batch") }
}
struct Recorder<T>: Callback where T: BinaryFloatingPoint & TensorFlowScalar {
    var priorities: [Stage: Int] = [Stage.after_loss: -1]
    public func after_loss(event: Event<T>) { print("Recorder.after_loss") }
}
What did I learn:
- removing all notifications from the trainable model creates clarity, and I love that I can replace the training and validation components easily
- there are separate publish/subscribe frameworks that could extend the approach to remote subscriptions and asynchronous events. I could see how this could be used to create trainable models that compete or collaborate in a network, remote dashboards, etc.
- I learn more when I create alternative implementations of the in-class notebooks. Also, it is a big advantage to have Jeremy's notebooks/videos when experimenting with alternative approaches in Swift, especially when I get lost in generics
I like Swift: solid code completion. Generics are the hardest part to get used to, but they are a lot easier to work with than C++ templates.
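For readers more comfortable with Python, the publisher/priority idea above can be sketched like this (purely illustrative; the names mirror the Swift outline, not any real framework):

```python
# Sketch: the training step itself is just another subscriber.
# Higher priority runs earlier; the model subscribes with the minimum
# priority so its work runs only after everyone else has been notified.
import sys

subscribers = []  # list of (priority, stage, fn)

def subscribe(stage, fn, priority=0):
    subscribers.append((priority, stage, fn))

def publish(stage):
    # notify all subscribers of this stage, highest priority first
    for priority, s, fn in sorted(subscribers, key=lambda t: -t[0]):
        if s == stage: fn()

order = []
subscribe('after_loss', lambda: order.append('recorder'), priority=-1)
subscribe('after_loss', lambda: order.append('backward'), priority=-sys.maxsize)
publish('after_loss')
print(order)  # ['recorder', 'backward']
```

The Recorder gets its chance first; the backward pass, holding the minimum priority, runs last, matching the Int.min trick in the Swift template.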
This makes callbacks seem easy, which is great!
Thanks for writing this great post. Can someone please explain why, for TrainEvalCallback() (the first cb in cbs), "if f and f():" (and thus __call__() and "if self('begin_fit')") returns False? Is it because f() returns None, causing __call__() to return False? Thanks in advance!
def fit(self, epochs, learn):
    self.epochs,self.learn = epochs,learn
    try:
        for cb in self.cbs: cb.set_runner(self)
        if self('begin_fit'): return
        for epoch in range(epochs):
            self.epoch = epoch
            if not self('begin_epoch'): self.all_batches(self.data.train_dl)
            with torch.no_grad():
                if not self('begin_validate'): self.all_batches(self.data.valid_dl)
            if self('after_epoch'): break
    finally:
        self('after_fit')
        self.learn = None

def __call__(self, cb_name):
    for cb in sorted(self.cbs, key=lambda x: x._order):
        f = getattr(cb, cb_name, None)
        if f and f(): return True
    return False
This is explained well in the above blog titled Understanding callbacks in fastai
For those struggling with the concept at a foundational level (as I was), I have written a blog post breaking down the callback at a very basic level.
I refer to the stages where you can execute a callback (begin_fit, begin_epoch, etc.) as timestamps. It was the best term I could think of to explain what they are in simple terms. Is there an official name for them though?
I have often heard the term event, as callbacks are often used in GUIs and correspond to physical events like "this button was clicked".
"The line seems to be added to be explicit here and is not strictly necessary, since a function that does not return anything defaults to returning None, and None is falsy in Python". Got it, thanks.
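That behaviour is easy to check in plain Python (a toy stand-in, not the actual fastai class):

```python
# Toy stand-in for TrainEvalCallback: begin_fit has no return statement.
class TrainEvalCallback():
    def begin_fit(self):
        pass  # implicitly returns None

cb = TrainEvalCallback()
f = getattr(cb, 'begin_fit', None)  # the method exists, so f is not None
print(f())        # None
print(bool(f()))  # False: None is falsy, so "if f and f():" does not fire
```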
When sorting by _order, what happens when no _order is set? TrainEvalCallback doesn't have an _order, but it seems to be called first.
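If I remember the course notebooks correctly, the base Callback class defines _order = 0 as a class attribute, so subclasses like TrainEvalCallback inherit it even when they don't set their own. And since Python's sorted() is stable, callbacks with equal _order keep their original list position, which is why TrainEvalCallback (prepended first in __init__) runs first:

```python
# Why TrainEvalCallback runs first without defining its own _order:
# subclasses inherit _order from the base class, and sorted() is stable.
class Callback():
    _order = 0  # default order inherited by all subclasses

class TrainEvalCallback(Callback): pass  # no _order of its own
class Recorder(Callback): pass           # also inherits _order = 0

cbs = [TrainEvalCallback(), Recorder()]
ordered = sorted(cbs, key=lambda x: x._order)
print([type(cb).__name__ for cb in ordered])  # ['TrainEvalCallback', 'Recorder']
```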