Flexible Learner Inheritance Idea

I am experimenting with creating the most flexible Learner possible, with everything implemented as callbacks. I am a few weeks behind, so I apologize if this is covered in the next few weeks! I think the difficulty with a fully callback-driven Learner is that subclassing it to create a new Learner can be awkward.

Jeremy showed an approach using setattr to look for callbacks when methods did not exist and to define methods during subclassing. I have been taking a different approach (minimal example shown below). I think it largely solves the issue, but please let me know what I am missing!

In the code below, notice how the __init__ signature is updated on NewLearner. Thoughts?

In the example above, I can add any callbacks by defining the subclass_method method. As long as I use the class decorator that updates the __init__ method with the new arguments, the init signature is updated as well. By doing this I can have a Learner where everything is a flexible callback, and subclass it by defining logic and adding any needed callbacks in the subclass_method function.
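For anyone following along, here is a minimal stdlib-only sketch of the idea. The names (update_init_signature, lr, device, etc.) are my own stand-ins, not the exact code: the decorator merges subclass_method's keyword parameters into the advertised __init__ signature, similar to what fastcore's delegates does.

```python
import inspect

def update_init_signature(cls):
    # Merge subclass_method's parameters into the advertised __init__
    # signature (dropping **kwargs, similar to fc.delegates).
    init_sig = inspect.signature(cls.__init__)
    extra = [p for name, p in inspect.signature(cls.subclass_method).parameters.items()
             if name != 'self' and name not in init_sig.parameters]
    params = [p for p in init_sig.parameters.values()
              if p.kind != inspect.Parameter.VAR_KEYWORD]
    cls.__init__.__signature__ = init_sig.replace(parameters=params + extra)
    return cls

class Learner:
    def __init__(self, model, **kwargs):
        self.model = model
        self.kwargs = kwargs

@update_init_signature
class NewLearner(Learner):
    # Defining our own __init__ here matters: it means the decorator
    # annotates this function, not one inherited from Learner.
    def __init__(self, model, **kwargs):
        super().__init__(model, **kwargs)

    def subclass_method(self, lr=0.1, device='cpu'):
        ...
```

After decoration, inspect.signature(NewLearner.__init__) advertises model, lr, and device, so tab-completion and help() show the subclass's extra arguments.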


It would be great to try creating the same CBs as in the course and see if you can make all the notebooks work – see whether you feel like the experience is better or worse than what’s there now. I find it hard to guess whether it works better in practice until I try using an API for a while.


I am building it all into Isaac-Flath/isaacai on GitHub, with the intent to build the whole library with this approach (assuming it continues to pan out).

I have not built a ton yet and it’s lagging behind the course, but once I get a wider variety of callbacks built I’ll write a blog post or something to share how it turned out.


While I’m going through this I’ll document interesting findings/complications here, in case they're useful to others. This one sure taught me something about how inheritance works in Python that I found very surprising!

During inheritance, when the child class doesn't define its own __init__, looking up __init__ on the child returns a reference to, not a copy of, the parent class's __init__. What this means is that the approach above not only updates the init signature of the new child class, but also updates the parent class at the same time (not ideal)!
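The surprise is easy to reproduce with a stdlib-only snippet (Parent/Child here are hypothetical names just for the demo):

```python
class Parent:
    def __init__(self, x):
        self.x = x

class Child(Parent):
    pass  # no __init__ of its own

# Attribute lookup on Child finds the very same function object as
# Parent's, so decorating it in place (e.g. rewriting __signature__,
# as fc.delegates does) changes Parent too.
assert Child.__init__ is Parent.__init__
```

This is just how Python attribute lookup works: the child stores nothing for __init__, so the lookup walks the MRO and lands on the parent's function object.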

To get around this I need to make a copy of the function first, though I also learned that copy.deepcopy does not work on functions. The code below creates a copy of the function, which makes the whole thing work without inadvertently changing the parent class.

import types
import functools
import fastcore.all as fc

def copy_func(f):
    """Based on http://stackoverflow.com/a/6528148/190597 (Glenn Maynard)"""
    g = types.FunctionType(f.__code__, f.__globals__, name=f.__name__,
                           argdefs=f.__defaults__,
                           closure=f.__closure__)
    g = functools.update_wrapper(g, f)
    g.__kwdefaults__ = f.__kwdefaults__
    return g

def init_delegates(learner, method='subclassing_method'):
    # Copy __init__ first so fc.delegates rewrites the copy's signature,
    # not the parent class's original function.
    learner.__init__ = fc.delegates(getattr(learner, method))(copy_func(learner.__init__))
    return learner

FYI copy_func is part of fastcore.
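To sanity-check that the copy really is independent, here's a small demo (repeating copy_func so the snippet is self-contained; the init function is just a stand-in):

```python
import types, functools, inspect

def copy_func(f):
    """Copy a function so its metadata can be changed independently."""
    g = types.FunctionType(f.__code__, f.__globals__, name=f.__name__,
                           argdefs=f.__defaults__, closure=f.__closure__)
    g = functools.update_wrapper(g, f)
    g.__kwdefaults__ = f.__kwdefaults__
    return g

def init(self, lr=0.1): ...

init2 = copy_func(init)
# Rewriting the copy's signature (what fc.delegates does under the hood)
# leaves the original function untouched.
init2.__signature__ = inspect.Signature([
    inspect.Parameter('self', inspect.Parameter.POSITIONAL_OR_KEYWORD),
    inspect.Parameter('lr', inspect.Parameter.POSITIONAL_OR_KEYWORD, default=0.1),
    inspect.Parameter('device', inspect.Parameter.POSITIONAL_OR_KEYWORD, default='cpu'),
])
assert 'device' in inspect.signature(init2).parameters
assert 'device' not in inspect.signature(init).parameters
```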


I’ve ended up going a different route. The issue is that the route I was previously using can get really messy with multiple layers of abstraction. I don’t want a system where adding abstractions makes things harder, so I came up with another approach.

Instead of focusing on the trainer, I am focusing on callbacks. This works by making callback registration recursive: when I add a callback, I also add any callbacks found in that callback's callbacks attribute. Here’s a minimal example of a class I can pass to my Learner that will add 5 callbacks.

class CoreCBs:
    def __init__(self,device=def_device,module_filter=fc.noop,**metrics):
        self.callbacks = [DeviceCB(device=device),
                          BasicTrainCB(),
                          MetricsCB(**metrics),
                          ProgressCB(),
                          ActivationStatsCB(module_filter)]

In this example the CoreCBs class just adds the 5 callbacks; it’s not actually a callback itself (it has no callback methods such as before_batch), but it could be if I had some functionality that should be added for this particular combination of callbacks. In addition, each of the callbacks added could have its own callbacks attribute, whose contents would also be added to the Learner's callbacks if/when I need deeper abstractions.
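Here's a minimal sketch of the recursive registration described above (Learner/add_callback are my own stand-in names for the demo, not the actual isaacai API):

```python
class Learner:
    def __init__(self, callbacks=None):
        self.callbacks = []
        for cb in (callbacks or []):
            self.add_callback(cb)

    def add_callback(self, cb):
        self.callbacks.append(cb)
        # Recurse: also register anything the object bundles in its own
        # `callbacks` attribute (and so on, arbitrarily deep).
        for child in getattr(cb, 'callbacks', []):
            self.add_callback(child)

class DeviceCB: ...
class TrainCB: ...

class CoreCBs:
    """A grouping object: bundles callbacks but defines no hook methods."""
    def __init__(self):
        self.callbacks = [DeviceCB(), TrainCB()]

learn = Learner([CoreCBs()])
assert len(learn.callbacks) == 3  # CoreCBs plus its two bundled callbacks
```

Note that the grouping object itself also ends up in the list here; since it defines no hook methods like before_batch, that's harmless, but it could be filtered out instead.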

I can also subclass the Learner by creating a callbacks attribute there, but I only plan to go 1 subclass deep and no further. The init method appends to the callbacks attribute rather than overwriting it; the base class just starts with an empty list. Ideally the bulk of the abstraction happens by grouping callbacks together rather than by subclassing learners.
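The append-not-overwrite pattern looks roughly like this (VisionLearner and the CB names are hypothetical, just to illustrate the shape):

```python
class LoggingCB: ...
class ResizeCB: ...
class AugmentCB: ...

class Learner:
    def __init__(self, callbacks=None):
        # Base class starts with an empty list and appends what it's given.
        self.callbacks = []
        self.callbacks += list(callbacks or [])

class VisionLearner(Learner):
    def __init__(self, callbacks=None):
        super().__init__(callbacks)
        # Subclass appends its own callbacks; it never overwrites the list,
        # so user-supplied callbacks survive.
        self.callbacks += [ResizeCB(), AugmentCB()]

learn = VisionLearner([LoggingCB()])
assert len(learn.callbacks) == 3  # LoggingCB + ResizeCB + AugmentCB
```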

I have other ideas for how to make this better, but I don’t want to complicate it unless I need to, so I'm going to see how far this takes me first. We will see how it turns out!

I’ve created a little diagram to demonstrate how the PyTorch pieces are combined in the Trainer, and how the Trainer then executes callbacks. I plan to use this to help document and explain how the library works. Figured I’d share here in case others find it useful.
