Hi folks! I'm having trouble creating a language model from scratch in PyTorch and then wrapping it in a fastai `LanguageLearner`. Here's a notebook showing how I arrive at the error.

In short, I'm creating this PyTorch module:
```python
class BasicLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.i_h = nn.Embedding(nv, nh)
        self.h_h = nn.Linear(nh, nh)
        self.h_o = nn.Linear(nh, nv)
        self.bn = nn.BatchNorm1d(nh)
        self.reset()

    def forward(self, x):
        print("input size: ", x.size())
        res = []
        h = self.h
        for i in range(x.shape[1]):
            h = h + self.i_h(x[:, i])
            h = F.relu(self.h_h(h))
            res.append(self.bn(h))
        print("hidden layer size: ", h.size())
        print("res size: ", res[0].size())
        self.h = h.detach()
        res = torch.stack(res, dim=1)
        print("stacked res size: ", res.size())
        print("output size: ", self.h_o(res).size())
        return self.h_o(res)

    def reset(self):
        self.h = torch.zeros(bs, nh).cuda()
```
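For reference, here's the same model traced with a dummy batch on CPU (hypothetical sizes `nv=100, nh=64, bs=8` — the real `nv` comes from the vocab fastai builds, and I've dropped the `.cuda()` call and the prints just for this sketch). It shows the model returns a `(bs, seq_len, nv)` tensor:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes for illustration only:
nv, nh, bs = 100, 64, 8

class BasicLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.i_h = nn.Embedding(nv, nh)
        self.h_h = nn.Linear(nh, nh)
        self.h_o = nn.Linear(nh, nv)
        self.bn = nn.BatchNorm1d(nh)
        self.reset()

    def forward(self, x):
        res = []
        h = self.h
        for i in range(x.shape[1]):          # one step per token position
            h = h + self.i_h(x[:, i])        # add token embedding to hidden state
            h = F.relu(self.h_h(h))
            res.append(self.bn(h))
        self.h = h.detach()                  # truncate backprop between batches
        return self.h_o(torch.stack(res, dim=1))

    def reset(self):
        self.h = torch.zeros(bs, nh)         # CPU here; original uses .cuda()

x = torch.randint(0, nv, (bs, 70))           # a fake batch: 8 sequences of 70 tokens
out = BasicLanguageModel()(x)
print(out.shape)                             # torch.Size([8, 70, 100]) = (bs, seq_len, nv)
```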
My data is created by downloading the freely available text of H.G. Wells' *War of the Worlds* (hence the repo name). I download the text, write it to a one-column CSV file, and then create a `TextLMDataBunch`:

```python
data = TextLMDataBunch.from_csv('.', 'book_text.csv', text_cols=0)
```
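In case it helps, here's a minimal sketch of the CSV step. The `text` string and the header name are placeholders — the notebook actually downloads the full book text from the web:

```python
import csv

# Placeholder for the downloaded book text:
text = "No one would have believed\nin the last years of the nineteenth century"

# Write each line of the text as a row in a one-column CSV:
with open("book_text.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text"])        # header row; column 0 holds the text
    for line in text.split("\n"):
        writer.writerow([line])
```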
Finally, I try creating and fitting a model:

```python
learn = LanguageLearner(data, BasicLanguageModel(), metrics=accuracy)
learn.fit_one_cycle(10, max_lr=3e-2)
```
And I'm getting this error:

```
ValueError: Expected input batch_size (70) to match target batch_size (4480).
```
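One thing I did notice: the numbers factor nicely, since 4480 = 64 × 70, which looks like batch size times fastai's default `bptt` of 70. As a guess at a minimal reproduction (assumed shapes, not the actual tensors inside the learner), `cross_entropy` raises exactly this error when its input's leading dimension doesn't match the flattened target:

```python
import torch
import torch.nn.functional as F

# Assumed shapes: bs=64 sequences of seq_len=70 tokens, vocab of nv=100
bs, seq_len, nv = 64, 70, 100

# Targets flattened across batch and sequence, as a flattened LM loss would see them:
targets = torch.randint(0, nv, (bs, seq_len)).view(-1)
print(targets.shape)                 # torch.Size([4480])

# An output whose leading dimension is seq_len (70) instead of bs*seq_len
# triggers the same ValueError from cross_entropy:
bad_output = torch.randn(seq_len, nv)
try:
    F.cross_entropy(bad_output, targets)
except ValueError as e:
    print(e)
```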
I’m not sure exactly what’s going on – feedback is appreciated!