Referring to your post here, does the Fastai Learner expect a certain signature for the model? The model you integrated has no parameters passed into it when it’s initialized. Does Fastai expect this (no parameters) or is there a way to pass in some parameters to the model?
```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
```
As the article mentioned, fastai expects an object, i.e. a class instance, so you need to make an instance of your model beforehand. As a result it can have any number of parameters, since all fastai really "uses" during training is the `forward` function.
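For instance, a hypothetical variant of that network that does take constructor arguments (the parameter names here are illustrative, not from the post above) works just as well, as long as you build the instance yourself before handing it to `Learner`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical parameterized version of Net. fastai never calls the class
# itself, so __init__ can have any signature you like.
class ParamNet(nn.Module):
    def __init__(self, in_channels=3, n_classes=10, hidden=120):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, hidden)
        self.fc2 = nn.Linear(hidden, 84)
        self.fc3 = nn.Linear(84, n_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)

model = ParamNet(in_channels=3, n_classes=10)  # instantiate with your parameters
# learn = Learner(dls, model, loss_func=...)    # fastai receives the instance
```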
One more question. Does the fastai Learner expect for the forward() function to only have one input parameter? I am getting an error which appears to be pointing to the lack of the second argument.
So here is what I did, thanks to the magical power of Callbacks.

Since teacher/student expects the y's to be attached, we do the following:

```python
# post building `DataLoaders`
learn.dls.train.n_inp = 2

class TeacherForcingCallback(Callback):
    """Callback that sends the y's to the model too"""
    def before_batch(self):
        x, y = self.xb  # with n_inp = 2, xb is the (x, y) tuple
        self.learn.yb = (y.unsqueeze(0),)  # yb must be a tuple

learn = Learner(dls, model, loss_func=criterion, cbs=[TeacherForcingCallback()])
```
You can also get even fancier and set your `teacher_forcing_ratio`, and simply override the batch (`self.learn.xb`) to include it:

```python
class TeacherForcingCallback(Callback):
    "Callback that sends the y's to the model too"
    def __init__(self, teacher_forcing_ratio=0.5):
        self.teacher_forcing_ratio = teacher_forcing_ratio

    def before_batch(self):
        x, y = self.xb
        # assign through self.learn: a plain `self.xb = ...` would only set
        # an attribute on the callback, not on the learner
        self.learn.xb = (x, y, self.teacher_forcing_ratio)
        self.learn.yb = (y.unsqueeze(0),)
```
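As an aside on why those assignments go through `self.learn`: *reading* an attribute on a fastai Callback falls through to the learner, but *writing* one does not. A toy sketch of that asymmetry, using plain-Python stand-ins rather than the real fastai classes:

```python
# FakeLearner / FakeCallback are minimal stand-ins, not fastai classes.
class FakeLearner:
    def __init__(self):
        self.xb = ("x",)

class FakeCallback:
    def __init__(self, learn):
        self.learn = learn

    def __getattr__(self, name):
        # reads that miss on the callback fall through to the learner
        return getattr(self.learn, name)

learn = FakeLearner()
cb = FakeCallback(learn)

cb.xb = ("x", "y")        # shadows: sets an attribute on cb only,
                          # learn.xb is still ("x",)
cb.learn.xb = ("x", "y")  # this is what actually updates the learner
```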
Callbacks are a whole different ballgame so sorry if that might confuse you some @goralpl
Thanks for information on the callbacks. I’ll have to read up on them. I’m trying to implement what you suggested but it looks like I’m not able to set that attribute.
I created an end-to-end example of the seq2seq use case with a toy dataset. Can you please take a look? I’m sure other folks will benefit from this discussion and I’m happy to share these notebooks with anyone who cares.
```python
dls.train.n_inp = 2

class TeacherForcingCallback(Callback):
    """Callback that sends the y's to the model too"""
    def before_batch(self):
        x, y = self.xb
        self.learn.yb = (y.unsqueeze(0),)
```
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-b9d418192d2f> in <module>
      2 # learn.dls.train.n_inp = 2
      3
----> 4 dls.train.n_inp=2
      5
      6 class TeacherForcingCallback(Callback):

AttributeError: can't set attribute
```
There are a few issues with the approach here. Technically, given what we're doing, our Callback can be simplified further (this version also leaves `n_inp` alone, which sidesteps the `AttributeError`):

```python
class TeacherForcingCallback(Callback):
    """Callback that sends the y's to the model too"""
    def before_batch(self):
        x, y = self.x, self.y
        self.learn.xb = (x, y)
```
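This works because under the hood fastai calls `self.model(*self.xb)`, so whatever tuple ends up in `xb` becomes the positional arguments of your model's `forward` (which also answers the earlier question about `forward` taking more than one input). A minimal sketch of that dispatch, using a stand-in object rather than a real fastai model:

```python
# DummySeq2Seq stands in for the real seq2seq model; fastai is not imported here.
class DummySeq2Seq:
    def __call__(self, src, trg, teacher_forcing_ratio=0.5):
        # forward() receives as many positional args as xb has elements
        return (src, trg, teacher_forcing_ratio)

model = DummySeq2Seq()
xb = ("SRC", "TRG", 0.75)  # what the callback stored in self.learn.xb
pred = model(*xb)          # roughly how fastai invokes the model each batch
```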
However, you forgot to define the outputs of your model, so we need:

```python
# at the bottom of the decoding loop ...
input = trg[t] if teacher_force else top1
# ... and, after the loop, forward() must actually return its outputs
return outputs
```
Another issue is that `CrossEntropyLoss` won't particularly work here; we can see that just with:

```python
x, y = dls.one_batch()
with torch.no_grad():
    out = model(x, y)
criterion(out, y)  # fails: the output and target shapes don't line up
```
(I don’t particularly know enough here to recommend what to do)
Edit:
Okay @goralpl, if we use `CrossEntropyLossFlat()` instead as our loss function it'll completely work (we need to flatten the outputs, hence why `CrossEntropyLossFlat` is needed rather than just `nn.CrossEntropyLoss`; otherwise we'd need to do some preprocessing to get it working with the loss function).
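For anyone curious what the flattening buys us: a seq2seq typically emits logits shaped `(seq_len, batch, vocab)` with targets shaped `(seq_len, batch)`, while `nn.CrossEntropyLoss` wants `(N, vocab)` logits against `(N,)` targets. A plain-PyTorch sketch of the reshape that `CrossEntropyLossFlat` handles for us (fastai itself is not imported; the shapes are illustrative):

```python
import torch
import torch.nn as nn

seq_len, bs, vocab = 7, 4, 50
out = torch.randn(seq_len, bs, vocab)          # model outputs: one logit row per token
targ = torch.randint(0, vocab, (seq_len, bs))  # integer token targets

# Flatten both to (N, vocab) and (N,) before the loss -- essentially the
# preprocessing that CrossEntropyLossFlat applies internally:
loss = nn.CrossEntropyLoss()(out.view(-1, vocab), targ.view(-1))
```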