Target -1 is out of bounds [Tabular]

I am tinkering with recommender systems on tabular data and I often run into the following problem. Here is my code:

...
path = untar_data(URLs.ML_100k)
...

dls = TabularDataLoaders.from_df(ratings, 
                                 cat_names=["user", "rating"], 
                                 cont_names=["days_since_release"],
                                 y_names="movie",
                                 procs=[Categorify, FillMissing, Normalize], 
                                 y_block=CategoryBlock(vocab=None, sort=True, add_na=True),
                                 shuffle_train=False)

dls.show_batch()

   user  rating  days_since_release_na  days_since_release  movie
0   719       4                  False          487.000111    282
1   586       3                  False         1466.000021    281
2   627       3                  False          644.000134    546
3   796       1                  False          432.000036    871
4   630       3                  False          315.999889    181
5   821       4                  False        11219.999757    504
6   456       4                  False        21159.000534    432
7   851       4                  False         1733.999987    806
8   788       5                  False         5812.000007    423
9     6       5                  False        10956.999995    135
learn = tabular_learner(dls, [200, 100])
learn.fit_one_cycle(2, 10e-3)
epoch train_loss valid_loss time
0 7.608502 7.405924 00:00

IndexError Traceback (most recent call last)

<ipython-input-41-b3d472e68669> in <module>()
----> 1 learn.fit_one_cycle(2, 10e-3)

18 frames

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2216                          .format(input.size(0), target.size(0)))
   2217     if dim == 2:
-> 2218         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2219     elif dim == 4:
   2220         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

IndexError: Target -1 is out of bounds.

I am trying to predict the next movie a user will give five stars to. Does anyone know a way to fix this?

I am using MovieLens 100k provided by FastAI. I just added a “days since release” continuous feature.

In case anybody else encounters this issue, I solved it by doing the following:

ratings["movie"] = ratings["movie"].astype(str)
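For anyone debugging the same message: the loss expects every target index to lie in [0, n_classes), so a -1 means some label was encoded without a matching vocabulary entry. Below is a minimal, library-free sketch of that idea. It is not fastai's actual encoding code, and the names encode_targets, out_of_bounds, known_movies and batch_movies are hypothetical, with toy values chosen only for illustration.

```python
def encode_targets(labels, vocab):
    """Map each label to its index in vocab; labels missing from vocab get -1."""
    index = {label: i for i, label in enumerate(vocab)}
    return [index.get(label, -1) for label in labels]


def out_of_bounds(targets, n_classes):
    """Return the target indices that fall outside [0, n_classes)."""
    return [t for t in targets if not 0 <= t < n_classes]


# Hypothetical toy data (assumption, not taken from the real dataset):
# movie ids the vocabulary was built from, and movie ids in one batch.
known_movies = [282, 281, 546, 871]
batch_movies = [281, 135, 546]  # 135 is not in the vocabulary

vocab = sorted(set(known_movies))
targets = encode_targets(batch_movies, vocab)
bad = out_of_bounds(targets, len(vocab))

print(targets)  # [0, -1, 2]
print(bad)      # [-1]
```

If a check like this turns up any -1 in your own targets, the place to look is how the dependent variable is encoded (its dtype and vocabulary) rather than the model itself.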

I was getting the same error, and it turned out I was passing the wrong 'y_block' argument to 'TabularDataLoaders.from_df'. My dependent variable was continuous, which makes it a regression problem, but I was passing a 'CategoryBlock'. Changing it to 'RegressionBlock' fixed the problem:

dls = TabularDataLoaders.from_df(df,
                                 y_names=labels_column_name,
                                 cont_names=cont,
                                 cat_names=cat,
                                 procs=procs,
                                 y_block=RegressionBlock())
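To decide between the two blocks, it can help to look at the target values first. The following is only a rough sketch under a stated assumption: suggest_task is a hypothetical helper, not part of fastai, and its rule (any value with a fractional part suggests regression) is a heuristic. Integer-valued targets such as counts can still be regression targets, so treat the result as a prompt to check rather than a rule.

```python
def suggest_task(values):
    """Hypothetical heuristic: 'regression' if any value has a fractional part,
    otherwise 'classification'."""
    for v in values:
        if isinstance(v, float) and not v.is_integer():
            return "regression"
    return "classification"


print(suggest_task([3.7, 1.25, 4.0]))  # regression
print(suggest_task([282, 281, 546]))   # classification
print(suggest_task(["282", "281"]))    # classification
```

In the original question the target is a movie id, which is a label rather than a quantity, so a category target is the appropriate choice there even though the ids are integers.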