I’m trying to create a model for MovieLens that works with cross-entropy loss.
nn.CrossEntropyLoss needs targets with dtype torch.int64, shape [batch_size], and values ranging from 0 to n-1.
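To make the target requirements concrete, here is a minimal sketch in plain PyTorch (no fastai involved). The batch size, the number of classes, and the tensor values are made up for illustration.

```python
import torch
import torch.nn as nn

batch_size, n_classes = 4, 5

# Model output: one row of raw scores (logits) per sample.
logits = torch.randn(batch_size, n_classes)

# Targets: class indices in 0..n_classes-1, shape [batch_size].
# torch.tensor on Python ints gives dtype torch.int64.
targets = torch.tensor([0, 4, 2, 1])

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, targets)  # scalar tensor (mean over the batch)
```

With targets in this form the loss is computed without any reshaping or casting.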
Creating dataloaders through CollabDataLoaders.from_df() produces targets as a tensor of shape [batch_size, 1] with dtype torch.int8 (though in the pandas df it was int64!), and leaves the range of values unchanged: from 1 to 5.
How can I tell .from_df() to process my target values in the required fashion?
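This does not change what .from_df() itself produces, but as a rough sketch the mismatch described above can be fixed on the tensor after the fact. The helper name to_ce_target and its min_label parameter are hypothetical, not part of fastai or PyTorch; the sample tensor just mimics a [batch_size, 1] int8 batch of ratings from 1 to 5.

```python
import torch

def to_ce_target(y, min_label=1):
    """Hypothetical helper: turn [batch_size, 1] ratings into
    [batch_size] int64 class indices starting at 0."""
    # squeeze(1) drops the trailing dimension, long() casts to int64,
    # and subtracting min_label shifts 1..5 down to 0..4.
    return y.squeeze(1).long() - min_label

# Stand-in for a batch of targets as described above.
raw = torch.tensor([[1], [5], [3]], dtype=torch.int8)
converted = to_ce_target(raw)
```

If the labels already start at 0, min_label=0 leaves the values untouched and only fixes shape and dtype.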
Also, I tried using TabularPandas(df, [Categorify, FillMissing, Normalize], ['user', 'movie'], y_names="rating", y_block=CategoryBlock), but dls.one_batch() made from that produced 3 tensors with shapes [batch_size, 2], [batch_size, 0] (!?), and [batch_size, 1]. It didn’t help much, but at least the target values ranged from 0 to n-1.
Thanks for answering, but the suggested code doesn’t change that: after _, y = dls.one_batch(), y.shape is still [batch_size, 1].
I need y.shape to be [batch_size] for nn.CrossEntropyLoss.
While I still don’t know why creating TabularCollab for categorization like this:

    cat_names = ['user', 'movie']
    splits = RandomSplitter()(range_of(ratings))
    to = TabularCollab(ratings, [Categorify], cat_names, y_names=['rating'], y_block=CategoryBlock, splits=splits, reduce_memory=False)
    dls = to.dataloaders(path=path)

produces y.shape as [batch_size, 1] and not [batch_size], I realized that I could just reshape the target inside a custom loss function, like this:

    def loss_function(inp, target):
        return F.cross_entropy(inp, target.squeeze(1).long())

and then use this function in the Learner:

    learn = Learner(dls, model, loss_func=loss_function)
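The wrapper can be checked on its own, outside fastai, with random tensors. This is only a sketch: the shapes (8 samples, 5 classes) and the random data are assumptions standing in for a real batch, and it only confirms that the loss computes and backpropagates, not that a model would learn anything.

```python
import torch
import torch.nn.functional as F

def loss_function(inp, target):
    # Drop the trailing dimension and cast to int64 before the loss.
    return F.cross_entropy(inp, target.squeeze(1).long())

torch.manual_seed(0)
inp = torch.randn(8, 5, requires_grad=True)            # fake logits
target = torch.randint(0, 5, (8, 1)).to(torch.int8)    # fake [batch_size, 1] labels

loss = loss_function(inp, target)
# Same loss computed on targets that are already flat and int64.
reference = F.cross_entropy(inp, target.view(-1).long())
loss.backward()  # gradients flow back to the logits
```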
Could you share a notebook with your solution?
Did it actually learn to distinguish the categories, or did it just get through the training cycles?
Could you take a look at my notebook here and maybe advise something?
The very strange thing about this is that preds[0] does not sum to 1, i.e. preds[0].sum() != 1. I just uploaded your notebook to my VM and it shows the same.
In chapter 5 (Pet Breeds) the task was pretty much the same, and there we had preds[0].sum() == 1. I mean, the neural net doesn’t care whether its input is images or embedding vectors, it’s just numbers. I’m thinking I did something wrong in the implementation.
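One possible explanation, offered as an assumption and not confirmed in this thread: the predictions may be raw logits instead of probabilities. As far as I know, fastai's get_preds applies the activation attached to the loss function, and a plain Python function like the custom loss_function has none. The sketch below uses made-up logits in plain PyTorch to show that the rows only sum to 1 after softmax.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5)  # stand-in for raw model outputs

# Raw logits are unnormalized scores; softmax over the class
# dimension turns each row into a probability distribution.
probs = F.softmax(logits, dim=1)
row_sums = probs.sum(dim=1)  # each entry is 1 up to float rounding
```

If this is what is happening, applying F.softmax(preds, dim=1) to the returned predictions should make preds[0].sum() come out as 1.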
@muellerzr perhaps you could comment on this, please?