I’m having trouble with the layer_groups parameter on the Learner class. When I use the default, my model converges just fine, but when I set the parameter to a list of the model’s modules, it stubbornly refuses to make progress. Here’s a distilled example:
import numpy as np
import torch
import torch.nn.functional as F
import torch.nn as nn
from torch.utils.data import DataLoader
from torch.utils.data.dataset import TensorDataset
from fastai import *

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(1, 5)
        self.linear2 = nn.Linear(5, 1)

    def forward(self, x):
        x = self.linear1(x)
        x = self.linear2(x)
        return x

def generate_data(size):
    x = np.random.uniform(size=(size, 1))
    y = x * 2.0
    return torch.FloatTensor(x), torch.FloatTensor(y)

train_x, train_y = generate_data(1000)
val_x, val_y = generate_data(100)
train_ds = TensorDataset(train_x, train_y)
val_ds = TensorDataset(val_x, val_y)
train_dl = DataLoader(train_ds, batch_size=8)
val_dl = DataLoader(val_ds, batch_size=8)
data_bunch = DataBunch(train_dl, val_dl)

model = SimpleModel()
learn = Learner(data_bunch, model, loss_func=F.mse_loss)
learn.fit_one_cycle(1)
# ... converges nicely

model = SimpleModel()
learn = Learner(data_bunch, model, loss_func=F.mse_loss, layer_groups=[model.linear1, model.linear2])
learn.fit_one_cycle(1)
# ... makes no discernible progress
The problem is the batch size. DataLoader has a default batch size of 1, and it seems that fastai stopped supporting that. I updated the code snippet above with a new batch size and it now works fine with fastai 1.0.22.
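For anyone who wants to see the default for themselves, here is a minimal sketch in plain PyTorch (no fastai involved). The tensor sizes and variable names are only illustrative. It compares the first batch from a DataLoader built without batch_size against one built with batch_size=8:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data in the same shape as the example above: y = 2x.
x = torch.rand(16, 1)
y = x * 2.0
ds = TensorDataset(x, y)

# No batch_size given: PyTorch falls back to its default of 1.
default_xb, default_yb = next(iter(DataLoader(ds)))

# Explicit batch size, as in the updated snippet.
batched_xb, batched_yb = next(iter(DataLoader(ds, batch_size=8)))

print(default_xb.shape)  # torch.Size([1, 1])
print(batched_xb.shape)  # torch.Size([8, 1])
```

This only demonstrates the default batch size. It does not show why training stalled with single-item batches.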
Hello @sgugger, I'm using fastai2 for one of my projects and want to verify that my layer groups have been applied correctly. In fastai v1, I believe Learner had a layer_groups attribute that we could use for debugging purposes. I found no equivalent in fastai2. Am I missing something, and how could I do the same check in fastai2?
In v2, you need to provide a splitter to split your model into several parameter groups. Look at the notebooks vision.learner or text.learner to see some examples.
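As a rough illustration of that reply, here is a sketch of a splitter for the SimpleModel from the first post. It assumes a splitter is just a function that takes the model and returns one list of parameters per group. The name simple_splitter is hypothetical, and the snippet uses only PyTorch so that it runs without fastai2 installed:

```python
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(1, 5)
        self.linear2 = nn.Linear(5, 1)

    def forward(self, x):
        return self.linear2(self.linear1(x))

def simple_splitter(model):
    # Hypothetical splitter: one parameter group per linear layer.
    return [
        list(model.linear1.parameters()),
        list(model.linear2.parameters()),
    ]

model = SimpleModel()
groups = simple_splitter(model)

# Each nn.Linear contributes a weight and a bias tensor.
print([len(g) for g in groups])  # [2, 2]
```

Under the assumption above, a function like this would be passed to the Learner as its splitter. Check the notebooks mentioned in the reply for the exact form the library expects.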
I understand that splitter is how we apply the splits, but layer_groups used to serve as a kind of sanity check that the splits had been applied correctly. That v1 attribute is what I was asking about.
The parameter groups are directly inside the optimizer now, so if you want to check them you need to create one (learn.create_opt()) then check learn.opt.pgs.
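The same kind of sanity check can be sketched with a plain PyTorch optimizer, which exposes its groups as param_groups. This is an analogue only, not fastai2's own optimizer class, and the attribute names there may differ. The model and learning rates below are made up for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1, 5), nn.Linear(5, 1))

# Two parameter groups with different learning rates (illustrative values).
opt = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 1e-3},
        {"params": model[1].parameters(), "lr": 1e-2},
    ],
    lr=1e-2,
)

# Sanity check: number of tensors and learning rate in each group.
summary = [(len(g["params"]), g["lr"]) for g in opt.param_groups]
print(summary)  # [(2, 0.001), (2, 0.01)]
```

If the number of groups or the tensors per group is not what you expect, the split was not applied the way you intended.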