I’ve built an autoencoder model that applies structured-data principles to the Porto Seguro winning-entry method, and I’m now trying to convert it into fastai’s head/base format so that, once trained, I can use the base as a starting point for recommendations.
I’ve combined the head and base and written a corresponding Model wrapper class, but I’m unclear on what I should include and what I should leave out from the examples.
It seems like `get_layer_groups` registers parameters with the layer optimizer, and I also see it used in freezing and cutting, so I suspect it refers to different parts of the model for differential learning rates, freezing, and cutting. But I wanted input from someone who knows the library: what is the role of layer groups, where should the cutoffs be, and what should I include or leave out? For now I’ve used:
```python
def get_layer_groups(self):
    m = self.model
    return [m.base.embs,
            [m.base.bn] + children(m.base.lins) + children(m.base.bns),
            children(m.head)]  # third group assumed; the original snippet was cut off here
```
get_layer_groups is the only thing you need to define in your Model class. It returns all the layers (modules) that you want to optimize - fastai will grab the params from them where requires_grad=True. More specifically, you return a list of lists of layers. Each element of the outer list is considered a “layer group”. Each layer group gets its own learning rate and weight decay.
If the list you return contains a single item (which would be a list of all layers) then you’ll only have one layer group, and can’t do discriminative learning rates. At the other extreme, you could return a separate item for every layer.
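To make the list-of-lists structure concrete, here is a torch-free sketch of how the layer-group list maps onto per-group learning rates. The function and group names are illustrative only, not fastai’s actual API; it just mimics how the library broadcasts a scalar learning rate across groups or pairs one rate with each group:

```python
def assign_lrs(layer_groups, lrs):
    """Pair every layer group with a learning rate.

    If a single lr is given, broadcast it to all groups, mirroring
    how a scalar lr is expanded across groups for optimization.
    """
    if not isinstance(lrs, (list, tuple)):
        lrs = [lrs] * len(layer_groups)
    assert len(lrs) == len(layer_groups), "need one lr per layer group"
    return list(zip(layer_groups, lrs))

# Three layer groups -> three learning rates (discriminative lrs):
groups = [["embs"], ["bn", "lin0", "lin1"], ["head"]]
pairs = assign_lrs(groups, [1e-4, 1e-3, 1e-2])

# A scalar lr is shared by every group:
uniform = assign_lrs(groups, 1e-3)
```

With a single-element outer list you would only ever get one `(group, lr)` pair, which is why discriminative learning rates require splitting the model into several groups.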
Layer groups are a fairly straightforward concept. They’re just a list of the different groups of weights in your model that you want to be able to interact with independently. If you don’t want to do anything fancy, like freezing the model head and training the base, you can create a very simple layer-group list that has only one element. There’s usually a straightforward way to break the model up, though.
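Freezing then just means excluding the earlier groups’ parameters from what the optimizer updates. A minimal torch-free sketch of that convention (`freeze_to` here is modeled on fastai’s method of the same name, but this is my own toy version, not library code):

```python
def freeze_to(layer_groups, n):
    """Split layer groups into (frozen, trainable).

    Everything before group n is frozen; group n onward stays
    trainable -- the same convention as learner.freeze_to(n).
    """
    return layer_groups[:n], layer_groups[n:]

groups = [["embs"], ["bn", "lins"], ["head"]]
frozen, trainable = freeze_to(groups, 2)  # train only the head
```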
```python
lgs = list(split_by_idxs(children(self.model.rn), [lr_cut]))
return lgs + [children(self.model.features)[1:]]
```
This is inside `get_layer_groups`.
Any reason why we have `[1:]` there? If we didn’t give any index and just used all the features, what would be the difference?
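For reference, `split_by_idxs` just slices a sequence at the given indices, and the `[1:]` drops the first child of `model.features` from the final layer group. Here’s a sketch that mimics the behavior I’d expect from the helper (the real one lives in fastai’s `core.py`; this is not the exact source), plus the slicing difference in question:

```python
def split_by_idxs(seq, idxs):
    """Yield consecutive slices of seq, cut at each index in idxs.

    A sketch mimicking fastai core's split_by_idxs: [a, b | c, d]
    for idxs=[2] becomes two chunks.
    """
    last = 0
    for idx in idxs:
        yield seq[last:idx]
        last = idx
    yield seq[last:]

# Hypothetical children of model.features, names made up:
feats = ["first_conv", "block1", "block2", "classifier"]
with_slice = feats[1:]   # first child excluded from the layer group
without_slice = feats    # first child would be trained/frozen with this group too
```

So without the `[1:]`, the first child module would simply be included in that last layer group as well, and would receive that group’s learning rate and freezing behavior.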
I am struggling with some other (allegedly) missing classes/methods.
For example, the function `to_gpu()`. I know it should also be in `core.py`, but it is not there.
I would be grateful if someone could share this function.
Try watching this. Thereafter, you can find anything in fastai. Well, almost everything.
It’s also in `core.py` (shown here with the `import torch` it depends on):

```python
import torch

USE_GPU = torch.cuda.is_available()

def to_gpu(x, *args, **kwargs):
    '''Puts a pytorch variable on the gpu, if cuda is available and USE_GPU is set to true.'''
    return x.cuda(*args, **kwargs) if USE_GPU else x
```