I am tinkering with recommender systems on tabular data and I keep running into the following problem. Here is my code:
...
path = untar_data(URLs.ML_100k)
...
dls = TabularDataLoaders.from_df(ratings,
    cat_names=["user", "rating"],
    cont_names=["days_since_release"],
    y_names="movie",
    procs=[Categorify, FillMissing, Normalize],
    y_block=CategoryBlock(vocab=None, sort=True, add_na=True),
    shuffle_train=False)
dls.show_batch()
| | user | rating | days_since_release_na | days_since_release | movie |
---|---|---|---|---|---|
0 | 719 | 4 | False | 487.000111 | 282 |
1 | 586 | 3 | False | 1466.000021 | 281 |
2 | 627 | 3 | False | 644.000134 | 546 |
3 | 796 | 1 | False | 432.000036 | 871 |
4 | 630 | 3 | False | 315.999889 | 181 |
5 | 821 | 4 | False | 11219.999757 | 504 |
6 | 456 | 4 | False | 21159.000534 | 432 |
7 | 851 | 4 | False | 1733.999987 | 806 |
8 | 788 | 5 | False | 5812.000007 | 423 |
9 | 6 | 5 | False | 10956.999995 | 135 |
learn = tabular_learner(dls, [200, 100])
learn.fit_one_cycle(2, 10e-3)
epoch | train_loss | valid_loss | time |
---|---|---|---|
0 | 7.608502 | 7.405924 | 00:00 |
IndexError Traceback (most recent call last)
<ipython-input-41-b3d472e68669> in <module>()
----> 1 learn.fit_one_cycle(2, 10e-3)
18 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
2216 .format(input.size(0), target.size(0)))
2217 if dim == 2:
-> 2218 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
2219 elif dim == 4:
2220 ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target -1 is out of bounds.
I am trying to predict the next movie a user will give five stars to. Does anyone know a way to fix this?
I am using MovieLens 100k provided by FastAI. I just added a “days since release” continuous feature.
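
One hypothesis I have not confirmed: the `Target -1` in the error could come from movies that are missing from the label vocabulary. If that vocabulary is built from the training split only (an assumption about the library, which I have not verified), any movie that appears only in other rows would have no index and could end up encoded as -1. Below is a minimal sketch of that mechanism in plain Python with made-up movie ids. `build_vocab` and `encode_labels` are hypothetical helpers for illustration, not fastai functions.

```python
# Toy illustration (plain Python, not fastai code): labels that are missing
# from a vocabulary built on the training split get encoded as -1.

def build_vocab(train_labels):
    """Map each distinct training label to an index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(train_labels)))}

def encode_labels(labels, vocab):
    """Hypothetical helper: encode labels, using -1 for unknown ones."""
    return [vocab.get(label, -1) for label in labels]

# Made-up movie ids standing in for the `movie` column of each split.
train_movies = [282, 281, 546, 871, 181]
valid_movies = [281, 504, 546, 432]

vocab = build_vocab(train_movies)
valid_targets = encode_labels(valid_movies, vocab)

# Movies that never occur in the training split.
unseen = sorted(set(valid_movies) - set(train_movies))

print(valid_targets)  # [1, -1, 3, -1]
print(unseen)         # [432, 504]
```

Running the same set difference on the real `movie` column of each split would show whether any targets fall outside the training vocabulary. If it comes back empty, the -1 must have another origin.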